TW202408245A

TW202408245A - Coding method and apparatus, decoding method and apparatus, and coder, decoder and storage medium

Info

Publication number: TW202408245A
Application number: TW112125658A
Authority: TW
Inventors: 虞露; 金峽鈳; 朱志偉; 戴震宇
Original assignee: 浙江大學; 大陸商Ｏｐｐｏ廣東移動通信有限公司
Priority date: 2022-07-11
Filing date: 2023-07-10
Publication date: 2024-02-16
Also published as: WO2024011386A1

Abstract

The present application provides a coding method and apparatus, a decoding method and apparatus, and a coder, a decoder and a storage medium. For an application scenario that comprises visual media content in one or more expression formats, isomorphic blocks in different expression formats are stitched into a heterogeneous hybrid stitched image, isomorphic blocks in the same expression format are stitched into an isomorphic stitched image, and the obtained stitched images and stitched image information are written into a code stream. An isomorphic stitched image and a heterogeneous hybrid stitched image are simultaneously present in a code stream, such that the coding method and the decoding method are applicable to the application scenarios of visual media content in a plurality of expression formats, thereby expanding the application scope. Moreover, the code stream includes a first syntactic element, such that the efficiency of a decoding end decoding a stitched image can be improved. Since isomorphic blocks in different expression formats are stitched in a heterogeneous hybrid stitched image for coding and decoding, the number of decoders invoked can be reduced, thereby reducing the implementation cost and improving the usability.

Description

A coding and decoding method, device, encoder, decoder, storage medium and code stream

The present application relates to the field of image processing technology, and in particular to a coding and decoding method, apparatus, encoder, decoder and storage medium.

In three-dimensional application scenarios such as virtual reality (VR), augmented reality (AR) and mixed reality (MR), visual media objects in different expression formats may appear in the same scene. For example, in a single three-dimensional scene, the scene background and some of the characters and objects may be expressed as video, while other characters are expressed as three-dimensional point clouds or three-dimensional meshes.

For compression coding, applying multi-view video coding, point cloud coding and mesh coding separately preserves the useful information of each original expression format better than projecting everything into multi-view video and coding that. It improves the quality of the viewport rendered at viewing time and the overall rate-quality efficiency.

However, current coding and decoding technology encodes and decodes multi-view video, point clouds and meshes separately, so a large number of codecs must be invoked during encoding and decoding, which makes encoding and decoding costly.

Embodiments of the present application provide a coding and decoding method, apparatus, encoder, decoder and storage medium.

In a first aspect, the present application provides a decoding method, applied to a decoder, including:

decoding a code stream to obtain a stitched image and stitched image information, wherein the stitched image information includes a first syntax element, and whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image is determined according to the first syntax element;

when the stitched image is determined, according to the first syntax element, to be a heterogeneous hybrid stitched image, splitting the stitched image according to its stitched image information to obtain at least two types of isomorphic blocks, wherein the at least two types of isomorphic blocks correspond to different visual media content expression formats;

when the stitched image is determined, according to the first syntax element, to be an isomorphic stitched image, splitting the stitched image according to its stitched image information to obtain one type of isomorphic block, wherein the one type of isomorphic block corresponds to a single visual media content expression format;

decoding and reconstructing the isomorphic blocks to obtain visual media content in at least one expression format.
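The decoding steps above can be illustrated with a minimal sketch. It is not the claimed implementation: the names `hybrid_flag` (standing in for the first syntax element), `BlockInfo` and `StitchedImageInfo`, and the rectangular block layout, are all assumptions made for illustration. The sketch splits an already-decoded stitched image into blocks grouped by expression format and checks that the grouping agrees with the flag.

```python
from dataclasses import dataclass
from typing import Dict, List

Image = List[List[int]]  # a picture as a list of pixel rows


@dataclass
class BlockInfo:
    fmt: str  # expression format label (hypothetical), e.g. "multiview"
    x: int    # left edge of the block inside the stitched image
    y: int    # top edge
    w: int    # width
    h: int    # height


@dataclass
class StitchedImageInfo:
    hybrid_flag: int         # hypothetical stand-in for the first syntax element
    blocks: List[BlockInfo]  # placement of every block


def split_stitched_image(image: Image, info: StitchedImageInfo) -> Dict[str, List[Image]]:
    """Crop every block out of the stitched image and group blocks by format."""
    groups: Dict[str, List[Image]] = {}
    for b in info.blocks:
        crop = [row[b.x:b.x + b.w] for row in image[b.y:b.y + b.h]]
        groups.setdefault(b.fmt, []).append(crop)
    # A hybrid image must yield at least two block types, an isomorphic one exactly one.
    if info.hybrid_flag == 1 and len(groups) < 2:
        raise ValueError("hybrid stitched image must hold at least two block types")
    if info.hybrid_flag == 0 and len(groups) != 1:
        raise ValueError("isomorphic stitched image must hold exactly one block type")
    return groups
```

Each group returned by `split_stitched_image` would then be handed to the reconstruction routine for its own expression format.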

In a second aspect, the present application provides an encoding method, applied to an encoder, including:

processing visual media content in at least one expression format to obtain at least one type of isomorphic block, wherein different types of isomorphic blocks correspond to different visual media content expression formats;

stitching the at least one type of isomorphic block to obtain at least one stitched image and stitched image information, wherein the stitched image information includes a first syntax element, whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image is determined according to the first syntax element, the heterogeneous hybrid stitched image includes at least two types of isomorphic blocks, and the isomorphic stitched image includes one type of isomorphic block;

encoding the at least one stitched image and the stitched image information to obtain a code stream.
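The stitching step can be sketched as follows. This is an illustrative toy packer under stated assumptions, not the method of the application: blocks are placed side by side from left to right, unused samples are padded with zero, and the dictionary keys (`hybrid_flag` for the first syntax element, `blocks` for the placements) are hypothetical names.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # a picture as a list of pixel rows


def stitch(blocks: List[Tuple[str, Image]]) -> Tuple[Image, Dict]:
    """Pack (format, pixels) blocks left to right into one stitched image."""
    height = max(len(pixels) for _, pixels in blocks)
    width = sum(len(pixels[0]) for _, pixels in blocks)
    canvas = [[0] * width for _ in range(height)]  # zero padding (assumption)
    placements = []
    x = 0
    for fmt, pixels in blocks:
        h, w = len(pixels), len(pixels[0])
        for r in range(h):
            canvas[r][x:x + w] = pixels[r]  # copy one block row into the canvas
        placements.append({"fmt": fmt, "x": x, "y": 0, "w": w, "h": h})
        x += w
    formats = {fmt for fmt, _ in blocks}
    # Two or more formats in one image make it a heterogeneous hybrid stitched image.
    info = {"hybrid_flag": 1 if len(formats) >= 2 else 0, "blocks": placements}
    return canvas, info
```

The stitched image would then go to a two-dimensional video encoder and the information dictionary would be serialized alongside it; both steps are outside this sketch.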

In a third aspect, the present application provides a decoding apparatus, applied to a decoder, including:

a decoding unit, configured to decode a code stream to obtain a stitched image and stitched image information, wherein the stitched image information includes a first syntax element, and whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image is determined according to the first syntax element;

a first splitting unit, configured to, when the stitched image is determined according to the first syntax element to be a heterogeneous hybrid stitched image, split the stitched image according to its stitched image information to obtain at least two types of isomorphic blocks, wherein the at least two types of isomorphic blocks correspond to different visual media content expression formats;

a second splitting unit, configured to, when the stitched image is determined according to the first syntax element to be an isomorphic stitched image, split the stitched image according to its stitched image information to obtain one type of isomorphic block, wherein the one type of isomorphic block corresponds to a single visual media content expression format;

a processing unit, configured to decode and reconstruct the isomorphic blocks to obtain visual media content in at least one expression format.

In a fourth aspect, the present application provides an encoding apparatus, applied to an encoder, including:

a processing unit, configured to process visual media content in at least one expression format to obtain at least one type of isomorphic block, wherein different types of isomorphic blocks correspond to different visual media content expression formats;

a stitching unit, configured to stitch the at least one type of isomorphic block to obtain at least one stitched image and stitched image information, wherein the stitched image information includes a first syntax element, whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image is determined according to the first syntax element, the heterogeneous hybrid stitched image includes at least two types of isomorphic blocks, and the isomorphic stitched image includes one type of isomorphic block;

an encoding unit, configured to encode the at least one stitched image and the stitched image information to obtain a code stream.

In a fifth aspect, a decoder is provided, including a first memory and a first processor; the first memory stores a computer program executable on the first processor to perform the method of the first aspect or any of its implementations.

In a sixth aspect, an encoder is provided, including a second memory and a second processor; the second memory stores a computer program executable on the second processor to perform the method of the second aspect or any of its implementations.

In a seventh aspect, a coding and decoding system is provided, including an encoder and a decoder. The encoder is configured to perform the method of the second aspect or any of its implementations, and the decoder is configured to perform the method of the first aspect or any of its implementations.

In an eighth aspect, a chip is provided for implementing the method of either of the first and second aspects or any of their implementations. Specifically, the chip includes a processor configured to call and run a computer program from a memory, so that a device in which the chip is installed performs the method of either of the first and second aspects or any of their implementations.

In a ninth aspect, a computer-readable storage medium is provided for storing a computer program that causes a computer to perform the method of either of the first and second aspects or any of their implementations.

In a tenth aspect, a computer program product is provided, including computer program instructions that cause a computer to perform the method of either of the first and second aspects or any of their implementations.

In an eleventh aspect, a computer program is provided that, when run on a computer, causes the computer to perform the method of either of the first and second aspects or any of their implementations.

In a twelfth aspect, a code stream is provided, the code stream being generated by the encoding method of the second aspect.

Based on the above technical solution, for application scenarios that include visual media content in one or more expression formats, isomorphic blocks in different expression formats are stitched into a heterogeneous hybrid stitched image, isomorphic blocks in the same expression format are stitched into an isomorphic stitched image, and the resulting stitched images and stitched image information are written into the code stream. Isomorphic stitched images (for example, at least one of a multi-view stitched image, a point cloud stitched image and a mesh stitched image) and heterogeneous hybrid stitched images coexist in the code stream, so the coding and decoding method suits application scenarios with visual media content in multiple expression formats, which widens its scope of application. Moreover, the stitched image information includes the first syntax element indicating the type of the stitched image, which improves the efficiency with which the decoding end decodes the stitched image. Furthermore, because isomorphic blocks in different expression formats are stitched into a single heterogeneous hybrid stitched image for encoding and decoding, fewer two-dimensional video codecs such as HEVC, VVC, AVC and AVS need to be invoked, which lowers the implementation cost and improves ease of use.

The present application can be applied to image coding and decoding, video coding and decoding, hardware video coding and decoding, dedicated-circuit video coding and decoding, real-time video coding and decoding, and similar fields. For example, the solution of the present application can be combined with audio and video coding standards such as the Audio Video coding Standard (AVS), the H.264/Advanced Video Coding (AVC) standard, the H.265/High Efficiency Video Coding (HEVC) standard and the H.266/Versatile Video Coding (VVC) standard. Alternatively, the solution can operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. It should be understood that the technology of the present application is not limited to any particular codec standard or technology.

Along its task pipeline, a high-degree-of-freedom immersive coding system can be roughly divided into the following stages: data acquisition, data organization and expression, data encoding and compression, data decoding and reconstruction, and data synthesis and rendering, after which the target data is presented to the user.

The coding involved in the embodiments of the present application is mainly video encoding and decoding. For ease of understanding, the video coding and decoding system involved in the embodiments is first introduced with reference to FIG. 1.

FIG. 1 is a schematic block diagram of a video coding and decoding system according to an embodiment of the present application. FIG. 1 is only an example; the video coding and decoding system of the embodiments includes but is not limited to what FIG. 1 shows. As shown in FIG. 1, the video coding and decoding system 100 includes an encoding device 110 and a decoding device 120. The encoding device encodes (that is, compresses) video data to generate a code stream and transmits the code stream to the decoding device. The decoding device decodes the code stream generated by the encoding device to obtain decoded video data.

The encoding device 110 can be understood as a device with a video encoding function, and the decoding device 120 as a device with a video decoding function. That is, the encoding device 110 and the decoding device 120 cover a wide range of apparatus, including, for example, smartphones, desktop computers, mobile computing devices, notebook (for example, laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles and in-vehicle computers.

In some embodiments, the encoding device 110 may transmit the encoded video data (such as a code stream) to the decoding device 120 via a channel 130. The channel 130 may include one or more media and/or devices capable of transmitting the encoded video data from the encoding device 110 to the decoding device 120.

In one example, the channel 130 includes one or more communication media that enable the encoding device 110 to transmit the encoded video data directly to the decoding device 120 in real time. In this example, the encoding device 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to the decoding device 120. The communication media include wireless communication media, such as the radio-frequency spectrum; optionally, they may also include wired communication media, such as one or more physical transmission lines.

In another example, the channel 130 includes a storage medium that can store the video data encoded by the encoding device 110. Storage media include a variety of locally accessible data storage media, such as optical discs, DVDs and flash memory. In this example, the decoding device 120 can obtain the encoded video data from the storage medium.

In another example, the channel 130 may include a storage server that can store the video data encoded by the encoding device 110. In this example, the decoding device 120 can download the stored encoded video data from the storage server. Optionally, the storage server can both store the encoded video data and transmit it to the decoding device 120; examples are a web server (for example, for a website) and a File Transfer Protocol (FTP) server.

In some embodiments, the encoding device 110 includes a video encoder 112 and an output interface 113. The output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.

In some embodiments, in addition to the video encoder 112 and the output interface 113, the encoding device 110 may also include a video source 111.

The video source 111 may include at least one of a video capture device (for example, a video camera), a video archive, a video input interface and a computer graphics system, where the video input interface receives video data from a video content provider and the computer graphics system generates video data.

The video encoder 112 encodes the video data from the video source 111 to generate a code stream. The video data may include one or more pictures or a sequence of pictures. The code stream contains the coding information of the picture or picture sequence in the form of a bitstream. The coding information may include coded picture data and associated data. The associated data may include a sequence parameter set (SPS), a picture parameter set (PPS) and other syntax structures. An SPS may contain parameters that apply to one or more sequences. A PPS may contain parameters that apply to one or more pictures. A syntax structure is a set of zero or more syntax elements arranged in the code stream in a specified order.

The video encoder 112 transmits the encoded video data directly to the decoding device 120 via the output interface 113. The encoded video data may also be stored on a storage medium or a storage server for later retrieval by the decoding device 120.

In some embodiments, the decoding device 120 includes an input interface 121 and a video decoder 122. In some embodiments, in addition to the input interface 121 and the video decoder 122, the decoding device 120 may also include a display device 123.

The input interface 121 includes a receiver and/or a modem. The input interface 121 may receive the encoded video data through the channel 130.

The video decoder 122 decodes the encoded video data to obtain decoded video data and transmits the decoded video data to the display device 123.

The display device 123 displays the decoded video data. The display device 123 may be integrated with the decoding device 120 or external to it. The display device 123 may be any of a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display or another type of display device.

In addition, FIG. 1 is only an example, and the technical solutions of the embodiments are not limited to FIG. 1. For example, the technology of the present application can also be applied to video encoding alone or video decoding alone.

The video coding framework involved in the embodiments of the present application is introduced below.

FIG. 2A is a schematic block diagram of a video encoder according to an embodiment of the present application. It should be understood that the video encoder 200 can be used for lossy compression of images as well as lossless compression of images. The lossless compression may be visually lossless compression or mathematically lossless compression.

The video encoder 200 can be applied to image data in luma-chroma (YCbCr, YUV) format. For example, the YUV sampling ratio may be 4:2:0, 4:2:2 or 4:4:4, where Y denotes luma, Cb (U) denotes blue chroma and Cr (V) denotes red chroma; U and V together form the chroma, which describes colour and saturation. In terms of colour format, 4:2:0 means every 4 pixels carry 4 luma components and 2 chroma components (YYYYCbCr), 4:2:2 means every 4 pixels carry 4 luma components and 4 chroma components (YYYYCbCrCbCr), and 4:4:4 means full-resolution chroma (YYYYCbCrCbCrCbCrCbCr).
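The sample counts stated above can be checked with a short calculation. The function name and the table of subsampling factors are illustrative assumptions; the factors follow the usual reading of the three ratios (chroma halved in both directions for 4:2:0, horizontally only for 4:2:2, not at all for 4:4:4), and the frame dimensions are assumed to be divisible by those factors.

```python
def samples_per_frame(width: int, height: int, fmt: str) -> tuple:
    """Return (luma samples, total chroma samples) for one frame."""
    # (horizontal, vertical) chroma subsampling factor for each ratio
    factors = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}
    sx, sy = factors[fmt]
    luma = width * height
    chroma_per_plane = (width // sx) * (height // sy)
    return luma, 2 * chroma_per_plane  # two chroma planes: Cb and Cr


# A 2x2 group of pixels reproduces the counts given in the text.
print(samples_per_frame(2, 2, "4:2:0"))  # (4, 2)
print(samples_per_frame(2, 2, "4:2:2"))  # (4, 4)
print(samples_per_frame(2, 2, "4:4:4"))  # (4, 8)
```

For a 1920×1080 frame, 4:2:0 therefore carries half as many chroma samples as luma samples, which is why it is the common choice for consumer video.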

For example, the video encoder 200 reads video data and, for each frame in the video data, divides the frame into a number of coding tree units (CTUs). In some examples, a CTU may be called a "tree block", a "largest coding unit" (LCU) or a "coding tree block" (CTB). Each CTU may be associated with an equally sized block within the picture. Each pixel may correspond to one luminance (luma) sample and two chrominance (chroma) samples, so each CTU may be associated with one block of luma samples and two blocks of chroma samples. A CTU is, for example, 128×128, 64×64 or 32×32 in size. A CTU can be further divided into several coding units (CUs) for coding; a CU may be a rectangular or a square block. A CU can be further divided into prediction units (PUs) and transform units (TUs), which separates coding, prediction and transform and makes processing more flexible. In one example, a CTU is divided into CUs in a quadtree manner, and a CU is divided into TUs and PUs in a quadtree manner.
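The two partitioning steps described above can be sketched as follows. The function names and the `should_split` callback are illustrative assumptions; a real encoder would decide each split by rate-distortion cost, which this sketch does not model.

```python
import math
from typing import Callable, List, Tuple


def ctu_grid(width: int, height: int, ctu_size: int = 64) -> Tuple[int, int]:
    """Number of CTU columns and rows needed to cover a frame."""
    return math.ceil(width / ctu_size), math.ceil(height / ctu_size)


def quadtree_split(x: int, y: int, size: int, min_size: int,
                   should_split: Callable[[int, int, int], bool]) -> List[Tuple[int, int, int]]:
    """Recursively split a square block into four quadrants; return leaves as (x, y, size)."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):      # top row of quadrants first
            for dx in (0, half):  # left quadrant first
                leaves += quadtree_split(x + dx, y + dy, half, min_size, should_split)
        return leaves
    return [(x, y, size)]


print(ctu_grid(1920, 1080, 64))  # (30, 17): the last CTU row is only partly covered
# Split a 64x64 CTU once, yielding four 32x32 CUs.
print(quadtree_split(0, 0, 64, 8, lambda x, y, s: s == 64))
```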

The video encoder and video decoder can support various PU sizes. Assuming a particular CU is 2N×2N in size, the video encoder and video decoder can support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PUs of 2N×2N, 2N×N, N×2N, N×N or similar sizes for inter prediction. The video encoder and video decoder can also support asymmetric PUs of 2N×nU, 2N×nD, nL×2N and nR×2N for inter prediction.
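To make the partition names concrete, the sketch below lists the PU dimensions each mode produces for a given N. It assumes the HEVC convention in which an asymmetric mode splits the CU at one quarter of its side (so N must be even); the function name and the dictionary layout are illustrative.

```python
from typing import Dict, List, Tuple


def pu_partitions(n: int) -> Dict[str, List[Tuple[int, int]]]:
    """PU (width, height) lists for a 2Nx2N CU, one entry per partition mode."""
    s = 2 * n          # CU side length
    q = n // 2         # one quarter of the CU side (assumes even N)
    return {
        "2Nx2N": [(s, s)],
        "2NxN": [(s, n), (s, n)],
        "Nx2N": [(n, s), (n, s)],
        "NxN": [(n, n)] * 4,
        "2NxnU": [(s, q), (s, s - q)],  # short PU on top
        "2NxnD": [(s, s - q), (s, q)],  # short PU at the bottom
        "nLx2N": [(q, s), (s - q, s)],  # narrow PU on the left
        "nRx2N": [(s - q, s), (q, s)],  # narrow PU on the right
    }


# For a 32x32 CU (N = 16) the 2NxnU mode gives a 32x8 PU above a 32x24 PU.
print(pu_partitions(16)["2NxnU"])  # [(32, 8), (32, 24)]
```

In every mode the PU areas add up to the CU area, which is a quick sanity check on the table.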

In some embodiments, as shown in FIG. 2A, the video encoder 200 may include a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filter unit 260, a decoded picture buffer 270 and an entropy encoding unit 280. It should be noted that the video encoder 200 may include more, fewer or different functional components.

Optionally, in the present application, the current block may be called the current coding unit (CU), the current prediction unit (PU) and so on. A prediction block may also be called a predicted image block or an image prediction block, and a reconstructed image block may also be called a reconstruction block or an image reconstruction block.

In some embodiments, the prediction unit 210 includes an inter prediction unit 211 and an intra estimation unit 212. Because adjacent pixels within a video frame are strongly correlated, video coding technology uses intra prediction to remove the spatial redundancy between adjacent pixels. Because adjacent frames of a video are strongly similar, video coding technology uses inter prediction to remove the temporal redundancy between adjacent frames, thereby improving coding efficiency.

The inter prediction unit 211 can be used for inter prediction. Inter prediction may include motion estimation and motion compensation and may refer to image information from different frames. It uses motion information to find a reference block in a reference frame and generates a prediction block from the reference block, which removes temporal redundancy. The frames used for inter prediction may be P frames and/or B frames, where a P frame is a forward-predicted frame and a B frame is a bidirectionally predicted frame. The motion information includes the reference frame list containing the reference frame, the reference frame index and the motion vector. The motion vector may have integer-pixel or sub-pixel precision. If the motion vector has sub-pixel precision, interpolation filtering must be applied in the reference frame to produce the required sub-pixel block. The integer-pixel or sub-pixel block found in the reference frame according to the motion vector is called the reference block. Some techniques use the reference block directly as the prediction block, while others process the reference block further to generate the prediction block. Processing the reference block further to generate a prediction block can also be understood as taking the reference block as a prediction block and then processing that prediction block to generate a new prediction block.
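Fetching a reference block for an integer-pixel or sub-pixel motion vector can be sketched as below. This is a simplified illustration under stated assumptions: motion vectors are in quarter-pixel units, positions outside the frame are clamped to the nearest edge sample, and bilinear interpolation stands in for the longer interpolation filters that practical codecs use. All names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # a picture as a list of pixel rows


def _pixel(ref: Image, x: int, y: int) -> int:
    """Read a sample, clamping coordinates to the frame borders."""
    y = min(max(y, 0), len(ref) - 1)
    x = min(max(x, 0), len(ref[0]) - 1)
    return ref[y][x]


def reference_block(ref: Image, x: int, y: int, w: int, h: int,
                    mv_x: int, mv_y: int) -> Image:
    """Fetch the w x h block at (x, y) displaced by a quarter-pixel motion vector."""
    block = []
    for j in range(h):
        row = []
        for i in range(w):
            fx = (x + i) * 4 + mv_x      # position in quarter-pixel units
            fy = (y + j) * 4 + mv_y
            x0, ax = fx // 4, fx % 4     # integer part and fractional weight (0..3)
            y0, ay = fy // 4, fy % 4
            top = (4 - ax) * _pixel(ref, x0, y0) + ax * _pixel(ref, x0 + 1, y0)
            bot = (4 - ax) * _pixel(ref, x0, y0 + 1) + ax * _pixel(ref, x0 + 1, y0 + 1)
            row.append(((4 - ay) * top + ay * bot + 8) // 16)  # rounded bilinear blend
        block.append(row)
    return block
```

With a motion vector that is a multiple of 4 the fractional weights are zero and the function simply copies displaced integer samples; any other value blends the four surrounding samples.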

The intra estimation unit 212 refers only to information within the same frame and predicts the pixel information in the image block currently being coded, which removes spatial redundancy. The frames used for intra prediction may be I frames.

Intra prediction has multiple prediction modes. Taking the international H-series digital video coding standards as an example, the H.264/AVC standard has 8 angular prediction modes and 1 non-angular prediction mode, and H.265/HEVC extends this to 33 angular prediction modes and 2 non-angular prediction modes. The intra prediction modes used by HEVC are Planar, DC and 33 angular modes, for a total of 35 prediction modes. The intra modes used by VVC are Planar, DC and 65 angular modes, for a total of 67 prediction modes.
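Of the modes listed above, DC is the simplest to illustrate: every sample of the block is predicted as the mean of the already-reconstructed neighbours above and to the left. The sketch below assumes a square block with equally long neighbour arrays and omits the boundary smoothing that some standards apply afterwards; the function name is hypothetical.

```python
from typing import List


def dc_predict(top: List[int], left: List[int]) -> List[List[int]]:
    """DC intra prediction for an n x n block from its top and left neighbours."""
    n = len(top)
    assert len(left) == n, "sketch assumes a square block"
    total = sum(top) + sum(left)
    dc = (total + n) // (2 * n)  # mean of the 2n neighbours, rounded to nearest
    return [[dc] * n for _ in range(n)]


# Neighbours of 10 above and 20 to the left give a flat block of 15.
print(dc_predict([10, 10, 10, 10], [20, 20, 20, 20])[0])  # [15, 15, 15, 15]
```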

It should be noted that as the number of angular modes increases, intra prediction becomes more accurate and better meets the demands of high-definition and ultra-high-definition digital video.

The residual unit 220 may generate a residual block of a CU based on the block of the CU and the prediction blocks of the PUs of the CU. For example, the residual unit 220 may generate the residual block of the CU such that each sample in the residual block has a value equal to the difference between a sample in the block of the CU and the corresponding sample in the prediction block of a PU of the CU.

The transform/quantization unit 230 may quantize transform coefficients. The transform/quantization unit 230 may quantize the transform coefficients associated with a TU of a CU based on a quantization parameter (QP) value associated with the CU. The video encoder 200 may adjust the degree of quantization applied to the transform coefficients associated with the CU by adjusting the QP value associated with the CU.
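The effect of the QP value on the degree of quantization can be illustrated with a short Python sketch. It assumes the H.264/HEVC-style convention in which the quantization step roughly doubles for every increase of 6 in QP, that is, a step of about 2^((QP-4)/6). This convention is not stated in this description, and the function names are hypothetical.

```python
def quantization_step(qp):
    # Assumed H.264/HEVC-style relation: the step doubles every 6 QP values.
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coefficients, qp):
    """Map transform coefficients to integer levels (round half away from zero)."""
    step = quantization_step(qp)
    levels = []
    for c in coefficients:
        magnitude = int(abs(c) / step + 0.5)
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def dequantize(levels, qp):
    """Scale integer levels back to approximate transform coefficients."""
    step = quantization_step(qp)
    return [level * step for level in levels]


coefficients = [96, -40, 9, 3]
levels_fine = quantize(coefficients, 22)    # step 8
levels_coarse = quantize(coefficients, 28)  # step 16, coarser levels
restored = dequantize(levels_fine, 22)
```

Raising the QP enlarges the step, so small coefficients collapse to zero and the reconstruction error grows, which is the trade-off the encoder controls by adjusting the QP.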

The inverse transform/quantization unit 240 may apply inverse quantization and an inverse transform, respectively, to the quantized transform coefficients to reconstruct a residual block from them.

The reconstruction unit 250 may add the samples of the reconstructed residual block to the corresponding samples of one or more prediction blocks generated by the prediction unit 210 to produce a reconstructed image block associated with a TU. By reconstructing the sample block of each TU of a CU in this way, the video encoder 200 can reconstruct the block of the CU.

The loop filtering unit 260 processes the inverse-transformed and inverse-quantized pixels to compensate for distortion and provide a better reference for subsequently coded pixels. For example, it may perform a deblocking filtering operation to reduce blocking artifacts in the blocks associated with a CU.

In some embodiments, the loop filtering unit 260 includes a deblocking filtering unit and a sample adaptive offset/adaptive loop filter (SAO/ALF) unit, where the deblocking filtering unit is used to remove blocking artifacts and the SAO/ALF unit is used to remove ringing artifacts.

The decoded image buffer 270 may store the reconstructed blocks. The inter prediction unit 211 may use a reference image containing the reconstructed blocks to perform inter prediction on PUs of other images. In addition, the intra estimation unit 212 may use the reconstructed blocks in the decoded image buffer 270 to perform intra prediction on other PUs in the same image as the CU.

The entropy encoding unit 280 may receive the quantized transform coefficients from the transform/quantization unit 230. The entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy-encoded data.

FIG. 2B is a schematic block diagram of a video decoder according to an embodiment of the present application.

As shown in FIG. 2B, the video decoder 300 includes an entropy decoding unit 310, a prediction unit 320, an inverse quantization/transform unit 330, a reconstruction unit 340, a loop filtering unit 350, and a decoded image buffer 360. It should be noted that the video decoder 300 may include more, fewer, or different functional components.

The video decoder 300 may receive a code stream. The entropy decoding unit 310 may parse the code stream to extract syntax elements from it. As part of parsing the code stream, the entropy decoding unit 310 may parse the entropy-encoded syntax elements in the code stream. The prediction unit 320, the inverse quantization/transform unit 330, the reconstruction unit 340, and the loop filtering unit 350 may decode the video data according to the syntax elements extracted from the code stream, that is, generate decoded video data.

In some embodiments, the prediction unit 320 includes an inter prediction unit 321 and an intra estimation unit 322.

The intra estimation unit 322 may perform intra prediction to generate a prediction block of a PU. The intra estimation unit 322 may use an intra prediction mode to generate the prediction block of the PU based on the blocks of spatially neighboring PUs. The intra estimation unit 322 may also determine the intra prediction mode of the PU according to one or more syntax elements parsed from the code stream.

The inter prediction unit 321 may construct a first reference image list (list 0) and a second reference image list (list 1) according to the syntax elements parsed from the code stream. In addition, if a PU is coded using inter prediction, the entropy decoding unit 310 may parse the motion information of the PU. The inter prediction unit 321 may determine one or more reference blocks of the PU according to the motion information of the PU, and may generate the prediction block of the PU according to the one or more reference blocks of the PU.

The inverse quantization/transform unit 330 may inverse quantize (that is, dequantize) the transform coefficients associated with a TU. The inverse quantization/transform unit 330 may use the QP value associated with the CU of the TU to determine the degree of quantization.

After inverse quantizing the transform coefficients, the inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse-quantized transform coefficients to produce a residual block associated with the TU.

The reconstruction unit 340 uses the residual blocks associated with the TUs of a CU and the prediction blocks of the PUs of the CU to reconstruct the block of the CU. For example, the reconstruction unit 340 may add the samples of the residual block to the corresponding samples of the prediction block to reconstruct the block of the CU and obtain a reconstructed image block.

The loop filtering unit 350 may perform a deblocking filtering operation to reduce blocking artifacts in the blocks associated with a CU.

The video decoder 300 may store the reconstructed image of the CU in the decoded image buffer 360. The video decoder 300 may use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.

The basic process of video encoding and decoding is as follows. At the encoding end, a frame is divided into blocks, and for the current block the prediction unit 210 uses intra prediction or inter prediction to generate a prediction block of the current block. The residual unit 220 may calculate a residual block based on the prediction block and the original block of the current block, that is, the difference between the prediction block and the original block of the current block. The residual block may also be called residual information. The residual block is transformed and quantized by the transform/quantization unit 230, which removes information to which the human eye is insensitive and thereby eliminates visual redundancy. Optionally, the residual block before transform and quantization by the transform/quantization unit 230 may be called a time-domain residual block, and the time-domain residual block after transform and quantization by the transform/quantization unit 230 may be called a frequency residual block or frequency-domain residual block. The entropy encoding unit 280 receives the quantized transform coefficients output by the transform/quantization unit 230 and may entropy encode them to output a code stream. For example, the entropy encoding unit 280 may eliminate character redundancy according to a target context model and the probability information of the binary code stream.

At the decoding end, the entropy decoding unit 310 may parse the code stream to obtain the prediction information, the quantized coefficient matrix, and so on of the current block. Based on the prediction information, the prediction unit 320 uses intra prediction or inter prediction to generate a prediction block of the current block. The inverse quantization/transform unit 330 performs inverse quantization and an inverse transform on the quantized coefficient matrix obtained from the code stream to obtain a residual block. The reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstructed block. The reconstructed blocks form a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image, on an image basis or a block basis, to obtain a decoded image. The encoding end needs to perform operations similar to those of the decoding end to obtain the decoded image. The decoded image may also be called a reconstructed image, and the reconstructed image may serve as a reference frame for inter prediction of subsequent frames.

It should be noted that the block partition information determined by the encoding end, as well as the mode information or parameter information for prediction, transform, quantization, entropy coding, loop filtering, and so on, is carried in the code stream when necessary. By parsing the code stream and analyzing the information already available, the decoding end determines the same block partition information and the same mode information or parameter information for prediction, transform, quantization, entropy coding, loop filtering, and so on as the encoding end, which ensures that the decoded image obtained at the encoding end is identical to the decoded image obtained at the decoding end.
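The basic flow described above, and the reason the encoding end and the decoding end arrive at the same decoded image, can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the transform, entropy coding, and loop filtering stages are omitted, the quantizer uses a plain integer step, and all function names are hypothetical.

```python
def quantize(value, step):
    # Uniform quantizer, rounding half away from zero (integer arithmetic only).
    magnitude = (abs(value) + step // 2) // step
    return magnitude if value >= 0 else -magnitude


def reconstruct(prediction, levels, step):
    """Shared by both ends: dequantize the residual and add the prediction."""
    return [p + level * step for p, level in zip(prediction, levels)]


def encode_block(original, prediction, step):
    """Encoding end: residual -> quantized levels, plus the encoder's own reconstruction."""
    residual = [o - p for o, p in zip(original, prediction)]
    levels = [quantize(r, step) for r in residual]
    return levels, reconstruct(prediction, levels, step)


def decode_block(prediction, levels, step):
    """Decoding end: the same prediction and the same levels give the same reconstruction."""
    return reconstruct(prediction, levels, step)


original = [100, 104, 97, 90]
prediction = [98, 98, 98, 98]
levels, encoder_reconstruction = encode_block(original, prediction, 4)
decoder_reconstruction = decode_block(prediction, levels, 4)
```

Because the encoder reconstructs from the quantized levels rather than from the original samples, its reference images match those of the decoder, so later inter prediction does not drift.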

The above is the basic process of a video codec under the block-based hybrid coding framework. As technology develops, some modules or steps of this framework or process may be optimized. The present application is applicable to the basic process of a video codec under the block-based hybrid coding framework, but is not limited to this framework and process.

In some application scenarios, multiple kinds of heterogeneous content appear simultaneously in the same three-dimensional scene, for example multi-view video and point clouds. For this situation, current encoding and decoding approaches include at least the following two:

Approach 1: multi-view video is encoded and decoded using MPEG (Moving Picture Experts Group) Immersive Video (MIV) technology, and point clouds are encoded and decoded using Video-based Point Cloud Compression (VPCC) technology.

The MIV technology and the VPCC technology are introduced below.

MIV technology: To reduce the transmitted pixel rate while retaining as much scene information as possible, so that enough information is available to render the target view, MPEG-I adopts the scheme shown in Figure 3A. A limited number of viewpoints are selected as base viewpoints that cover the visible range of the scene as fully as possible, and the base viewpoints are transmitted as complete images. Redundant pixels between the remaining non-base viewpoints and the base viewpoints are removed, so that only non-duplicated effective information is kept. The effective information is then extracted as patch images and reorganized with the base viewpoint images to form a larger rectangular image, called a stitched image. Figures 3A and 3B illustrate the generation of the stitched image. The stitched image is fed to the codec for compression and reconstruction, and the auxiliary data related to the patch image stitching information is also fed to the encoder to form the code stream.

The VPCC encoding method projects the point cloud onto two-dimensional images or video, converting three-dimensional information into two-dimensional information for encoding. Figure 3C is the VPCC encoding block diagram. The code stream is roughly divided into four parts. The geometry code stream is generated by encoding the geometry depth maps and represents the geometry information of the point cloud. The attribute code stream is generated by encoding the texture maps and represents the attribute information of the point cloud. The occupancy code stream is generated by encoding the occupancy maps and indicates the valid regions in the depth maps and texture maps. All three types of video are encoded and decoded with a video codec, as shown in Figures 3D to 3F. The auxiliary information code stream is generated by encoding the auxiliary information of the patch images, that is, the part related to the patch data unit in the V3C standard, and indicates information such as the position and size of each patch image.

Approach 2: both multi-view video and point clouds are encoded and decoded using the frame packing technology in Visual Volumetric Video-based Coding (V3C).

The frame packing technology is introduced below.

Taking multi-view video as an example, and as shown by way of example in Figure 4, the encoding end performs the following steps:

Step 1: when encoding the acquired multi-view video, multi-view video patches are generated after some pre-processing, and the multi-view video patches are then organized to generate a multi-view video stitched image.

For example, as shown in Figure 4, the multi-view video is input into TIMV for packing, and a multi-view video stitched image is output. TIMV is a reference software for MIV. Packing in the embodiments of the present application can be understood as stitching.

The multi-view video stitched images include a multi-view video texture stitched image and a multi-view video geometry stitched image; that is, they contain only multi-view video patches.

Step 2: the multi-view video stitched images are input into the frame packer, and a multi-view video hybrid stitched image is output.

The multi-view video hybrid stitched images include a multi-view video texture hybrid stitched image, a multi-view video geometry hybrid stitched image, and a multi-view video texture-and-geometry hybrid stitched image.

Specifically, as shown in Figure 4, frame packing is applied to the multi-view video stitched images to generate a multi-view video hybrid stitched image, in which each multi-view video stitched image occupies one region of the multi-view video hybrid stitched image. Correspondingly, a flag pin_region_type_id_minus2 must be transmitted in the code stream for each region. This flag records whether the current region belongs to a multi-view video texture stitched image or a multi-view video geometry stitched image, and the decoding end needs this information.

Step 3: a video encoder is used to encode the multi-view video hybrid stitched image to obtain a code stream.

By way of example, as shown in Figure 5, the decoding end performs the following steps:

Step 1: when decoding multi-view video, the acquired code stream is input into a video decoder for decoding to obtain a reconstructed multi-view video hybrid stitched image.

Step 2: the reconstructed multi-view video hybrid stitched image is input into the frame unpacker, and reconstructed multi-view video stitched images are output.

Specifically, the flag pin_region_type_id_minus2 is first obtained from the code stream. If pin_region_type_id_minus2 is determined to be V3C_AVD, the current region is a multi-view video texture stitched image, and the current region is split off and output as a reconstructed multi-view video texture stitched image.

If pin_region_type_id_minus2 is determined to be V3C_GVD, the current region is a multi-view video geometry stitched image, and the current region is split off and output as a reconstructed multi-view video geometry stitched image.
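The region routing performed by the frame unpacker can be sketched in Python as follows. The sketch assumes that the signalled value plus 2 gives the unit type identifier of Table 2 (3 for V3C_GVD, 4 for V3C_AVD), which is how the "_minus2" suffix is read here. The dictionary keys `x`, `y`, `width`, and `height` and the function name are hypothetical and are not syntax element names taken from the standard.

```python
V3C_GVD = 3  # geometry video data (identifier from Table 2)
V3C_AVD = 4  # attribute video data (identifier from Table 2)


def unpack_frame(packed, regions):
    """Split a reconstructed hybrid stitched image into per-type stitched images.

    packed is a list of rows of samples; regions is a list of dicts, one per
    region, holding the signalled type flag and a hypothetical rectangle.
    """
    outputs = {}
    for region in regions:
        # Assumption: signalled value + 2 is the unit type identifier.
        type_id = region["pin_region_type_id_minus2"] + 2
        x, y = region["x"], region["y"]
        w, h = region["width"], region["height"]
        crop = [row[x:x + w] for row in packed[y:y + h]]
        if type_id == V3C_AVD:
            outputs["texture"] = crop
        elif type_id == V3C_GVD:
            outputs["geometry"] = crop
        else:
            raise ValueError(f"unsupported region type {type_id}")
    return outputs


packed = [[1, 1, 2, 2],
          [1, 1, 2, 2]]
regions = [
    {"pin_region_type_id_minus2": 2, "x": 0, "y": 0, "width": 2, "height": 2},
    {"pin_region_type_id_minus2": 1, "x": 2, "y": 0, "width": 2, "height": 2},
]
result = unpack_frame(packed, regions)
```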

Step 3: the reconstructed multi-view video stitched images are decoded to obtain the reconstructed multi-view video.

Specifically, the multi-view video texture stitched image and the multi-view video geometry stitched image are decoded to obtain the reconstructed multi-view video.

The frame packing technology has been described above using multi-view video as an example. Frame packing encoding and decoding of point clouds is essentially the same as for multi-view video and can be understood by reference to the above. For example, TMC (a VPCC reference software) is used to pack the point cloud into a point cloud stitched image, the point cloud stitched image is input to the frame packer for frame packing to obtain a point cloud hybrid stitched image, and the point cloud hybrid stitched image is encoded to obtain a point cloud code stream. Details are not repeated here.

The syntax related to frame packing in the standard is introduced below.

The V3C unit header syntax is shown in Table 1:

Table 1

v3c_unit_header( ) {                                              Descriptor
    vuh_unit_type                                                 u(5)
    if( vuh_unit_type == V3C_AVD || vuh_unit_type == V3C_GVD ||
        vuh_unit_type == V3C_OVD || vuh_unit_type == V3C_AD ||
        vuh_unit_type == V3C_CAD || vuh_unit_type == V3C_PVD )
        vuh_v3c_parameter_set_id                                  u(4)
    if( vuh_unit_type == V3C_AVD || vuh_unit_type == V3C_GVD ||
        vuh_unit_type == V3C_OVD || vuh_unit_type == V3C_AD ||
        vuh_unit_type == V3C_PVD )
        vuh_atlas_id                                              u(6)
    if( vuh_unit_type == V3C_AVD ) {
        vuh_attribute_index                                       u(7)
        vuh_attribute_partition_index                             u(5)
        vuh_map_index                                             u(4)
        vuh_auxiliary_video_flag                                  u(1)
    } else if( vuh_unit_type == V3C_GVD ) {
        vuh_map_index                                             u(4)
        vuh_auxiliary_video_flag                                  u(1)
        vuh_reserved_zero_12bits                                  u(12)
    } else if( vuh_unit_type == V3C_OVD || vuh_unit_type == V3C_AD ||
               vuh_unit_type == V3C_PVD )
        vuh_reserved_zero_17bits                                  u(17)
    else if( vuh_unit_type == V3C_CAD )
        vuh_reserved_zero_23bits                                  u(23)
    else
        vuh_reserved_zero_27bits                                  u(27)
}

The V3C unit header semantics are shown in Table 2:

Table 2: V3C unit types

vuh_unit_type   Identifier   V3C unit type           Description
0               V3C_VPS      V3C parameter set       V3C-level parameters
1               V3C_AD       Atlas data              Atlas information
2               V3C_OVD      Occupancy video data    Occupancy information
3               V3C_GVD      Geometry video data     Geometry information
4               V3C_AVD      Attribute video data    Attribute information
5               V3C_PVD      Packed video data       Packing information
6               V3C_CAD      Common atlas data       Information common to the atlases in a CVS; common atlas data specified in ISO/IEC 23090-12
7…31            V3C_RSVD     Reserved                -
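The header layout of Table 1 can be read with a small bit reader. The following Python sketch follows the conditions and bit widths of Table 1 and the identifiers of Table 2; in every branch the fields add up to 32 bits. The class and function names are hypothetical, and the sketch does not check that reserved bits are zero.

```python
from enum import IntEnum


class V3CUnitType(IntEnum):
    V3C_VPS = 0
    V3C_AD = 1
    V3C_OVD = 2
    V3C_GVD = 3
    V3C_AVD = 4
    V3C_PVD = 5
    V3C_CAD = 6


class BitReader:
    """Reads unsigned big-endian bit fields, i.e. the u(n) descriptor."""

    def __init__(self, data):
        self.value = int.from_bytes(data, "big")
        self.remaining = 8 * len(data)

    def u(self, n):
        if n > self.remaining:
            raise ValueError("not enough bits")
        self.remaining -= n
        return (self.value >> self.remaining) & ((1 << n) - 1)


def parse_v3c_unit_header(data):
    """Parse the 4-byte v3c_unit_header( ) into a dict of syntax elements."""
    t = V3CUnitType
    reader = BitReader(data[:4])
    header = {"vuh_unit_type": reader.u(5)}
    unit_type = header["vuh_unit_type"]
    if unit_type in (t.V3C_AVD, t.V3C_GVD, t.V3C_OVD, t.V3C_AD, t.V3C_CAD, t.V3C_PVD):
        header["vuh_v3c_parameter_set_id"] = reader.u(4)
    if unit_type in (t.V3C_AVD, t.V3C_GVD, t.V3C_OVD, t.V3C_AD, t.V3C_PVD):
        header["vuh_atlas_id"] = reader.u(6)
    if unit_type == t.V3C_AVD:
        header["vuh_attribute_index"] = reader.u(7)
        header["vuh_attribute_partition_index"] = reader.u(5)
        header["vuh_map_index"] = reader.u(4)
        header["vuh_auxiliary_video_flag"] = reader.u(1)
    elif unit_type == t.V3C_GVD:
        header["vuh_map_index"] = reader.u(4)
        header["vuh_auxiliary_video_flag"] = reader.u(1)
        reader.u(12)  # vuh_reserved_zero_12bits
    elif unit_type in (t.V3C_OVD, t.V3C_AD, t.V3C_PVD):
        reader.u(17)  # vuh_reserved_zero_17bits
    elif unit_type == t.V3C_CAD:
        reader.u(23)  # vuh_reserved_zero_23bits
    else:
        reader.u(27)  # vuh_reserved_zero_27bits
    return header


# Example: a geometry video data (V3C_GVD) header with parameter set 1, atlas 2, map 1.
gvd_bits = (3 << 27) | (1 << 23) | (2 << 17) | (1 << 13) | (1 << 12)
gvd_header = parse_v3c_unit_header(gvd_bits.to_bytes(4, "big"))
```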

At present, if visual media content in multiple different expression formats appears simultaneously in the same three-dimensional scene, the visual media content in each expression format is encoded and decoded separately. For example, when a point cloud and multi-view video appear in the same three-dimensional scene, the current packing technology compresses the point cloud to form a point cloud compressed code stream (one V3C code stream) and compresses the multi-view video information to obtain a multi-view video compressed code stream (another V3C code stream). The system layer then multiplexes the compressed code streams to obtain a fused multiplexed code stream of the three-dimensional scene. During decoding, the point cloud compressed code stream and the multi-view video compressed code stream are decoded separately. It can be seen that, when encoding and decoding visual media content in multiple different expression formats, the existing technology uses many codecs and the cost of encoding and decoding is high.

To solve the above technical problem, the embodiments of the present application stitch isomorphic blocks of different expression formats into one heterogeneous hybrid stitched image, stitch isomorphic blocks of the same expression format into one isomorphic stitched image, and encode the resulting heterogeneous hybrid stitched image and/or isomorphic stitched image and write it into the code stream. Isomorphic stitched images (for example, at least one of a multi-view stitched image, a point cloud stitched image, and a mesh stitched image) and heterogeneous hybrid stitched images can exist in the code stream at the same time, which broadens the application scenarios of the encoding and decoding methods. Moreover, the stitched image information includes a first syntax element that indicates the type of the stitched image, which can improve the efficiency with which the decoding end decodes the stitched image.
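The distinction between the two kinds of stitched image can be illustrated with a short Python sketch that derives the type from the expression formats of the blocks placed in the image. The returned labels are hypothetical placeholders; the actual name and values of the first syntax element are not given in this passage.

```python
def stitched_image_type(block_formats):
    """Classify a stitched image from the expression formats of its blocks.

    block_formats is a list with one expression format name per block,
    for example "multi-view video", "point cloud", or "mesh".
    """
    formats = set(block_formats)
    if not formats:
        raise ValueError("a stitched image needs at least one block")
    if len(formats) == 1:
        # All blocks share one expression format: isomorphic stitched image.
        return "isomorphic:" + next(iter(formats))
    # Blocks of different expression formats: heterogeneous hybrid stitched image.
    return "heterogeneous hybrid"


point_cloud_only = stitched_image_type(["point cloud", "point cloud"])
mixed = stitched_image_type(["point cloud", "multi-view video"])
```

A decoder that reads such a type indication first can select the matching unpacking path without inspecting every block, which is one way to understand the efficiency gain attributed to the first syntax element.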

The encoding method provided by the embodiments of the present application is introduced below with reference to Figure 6, taking the encoding end as an example.

Figure 6 is a schematic flowchart of the encoding method provided by an embodiment of the present application. As shown in Figure 6, the encoding method includes:

Step 601: visual media content in at least one expression format is processed to obtain at least one kind of isomorphic block, where different kinds of isomorphic blocks correspond to different expression formats of visual media content;

In three-dimensional application scenarios such as virtual reality (VR), augmented reality (AR), and mixed reality (MR), visual media objects with different expression formats may appear in the same scene. For example, in the same three-dimensional scene, the scene background and some of the characters and objects are expressed as video, while other characters are expressed as three-dimensional point clouds or three-dimensional meshes.

In some embodiments, the visual media content includes visual media content in at least one expression format such as multi-view video, point cloud, and mesh. Single-view video is a special case of multi-view video; that is, the multi-view video may include multiple-view video and/or single-view video.

One kind of isomorphic block corresponds to one expression format. By way of example, the expression format corresponding to the at least one kind of isomorphic block includes at least one of the following: multi-view video, point cloud, and mesh. At least two kinds of isomorphic blocks correspond to at least two different expression formats. For example, the at least two kinds of isomorphic blocks in the embodiments of the present application include isomorphic blocks of at least two different expression formats such as multi-view video, point cloud, and mesh.

It should be noted that, in the embodiments of the present application, each kind of isomorphic block may include at least one isomorphic block having the same expression format. By way of example, the isomorphic blocks of the point cloud format include one or more point cloud blocks, the isomorphic blocks of the multi-view video format include one or more multi-view video blocks, and the isomorphic blocks of the mesh format include one or more mesh blocks.

In some embodiments, step 601 may be: processing visual media content in one expression format to obtain one kind of isomorphic block. In some embodiments, step 601 may be: processing visual media content in at least two expression formats to obtain at least two kinds of isomorphic blocks, where different visual media contents correspond to different expression formats. Specifically, the visual media content in a first expression format is processed to obtain isomorphic blocks of the first expression format, and the visual media content in a second expression format is processed to obtain isomorphic blocks of the second expression format. The first expression format is one of multi-view video, point cloud, and mesh, the second expression format is one of multi-view video, point cloud, and mesh, and the first expression format and the second expression format are different.

That is to say, the above visual media content includes visual media content in at least one expression format such as multi-view video, point cloud, and mesh. When the visual media content contains one expression format, it is processed to obtain isomorphic blocks of that one expression format. When the visual media content contains multiple expression formats, it is processed to obtain isomorphic blocks of the multiple expression formats.

It should be noted that a block may also be called a tile; that is, a point cloud block may also be called a point cloud tile, a multi-view video block may also be called a multi-view video tile, and a mesh block may also be called a mesh tile. A block may be a stitched image with a specific shape, for example a stitched image of a rectangular region with a specific length and/or height. For example, at least one patch may be stitched in order, such as in descending order of patch area, or in descending order of patch length and/or height, to obtain the block corresponding to the visual media content. Optionally, one block may map exactly to one atlas tile.
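The ordered stitching of patches into a rectangular block can be sketched in Python as follows. The sketch sorts patches by area in descending order and places them row by row (a simple shelf layout). The shelf layout, the fixed block width, and the function name are assumptions made for illustration; this description does not limit how the block is obtained.

```python
def pack_patches(patches, block_width):
    """Place patches into a rectangular block, largest area first.

    patches is a list of (patch_id, width, height) tuples.
    Returns ({patch_id: (x, y)}, block_height).
    """
    # Stitch in descending order of patch area.
    ordered = sorted(patches, key=lambda p: p[1] * p[2], reverse=True)
    placements = {}
    x = y = shelf_height = 0
    for patch_id, width, height in ordered:
        if width > block_width:
            raise ValueError(f"patch {patch_id} is wider than the block")
        if x + width > block_width:
            # Current row is full: start a new row below it.
            y += shelf_height
            x = 0
            shelf_height = 0
        placements[patch_id] = (x, y)
        x += width
        shelf_height = max(shelf_height, height)
    return placements, y + shelf_height


placements, block_height = pack_patches([(1, 4, 4), (2, 8, 8), (3, 4, 2)], block_width=12)
```

Sorting by length and/or height instead only requires changing the sort key, for example `key=lambda p: (p[2], p[1])`.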

In some embodiments, each patch in a block may have a patch identifier (patchID) to distinguish the different patches within the same block. For example, one block may include patch 1 (patch1), patch 2 (patch2), and patch 3 (patch3).

Further, every patch in an isomorphic block has the same expression format. For example, the patches in an isomorphic block are all multi-view video patches, or all point cloud patches, that is, patches of one and the same expression format. The expression format of the patches in an isomorphic block is the expression format of that isomorphic block.

In some embodiments, an isomorphic block may have a block identifier (tileID) to distinguish different blocks of the same expression format. For example, the point cloud blocks may include point cloud block 1 or point cloud block 2. For example, the visual media content includes a point cloud and a multi-view video. The point cloud is processed to obtain point cloud blocks, where point cloud block 1 includes point cloud patches 1 to 3; the multi-view video is processed to obtain a multi-view video block, which includes multi-view video patches 1 to 4.

When visual media content in one expression format needs to be processed, isomorphic blocks in one expression format are obtained. When at least two items of visual media content need to be processed, isomorphic blocks in at least two expression formats are obtained. To improve compression efficiency, the embodiments of the present application process the at least two items of visual media content, for example by packing (also called stitching), to obtain the block corresponding to each item of visual media content. For example, the patches corresponding to the at least two items of visual media content may be stitched to obtain the blocks. It should be noted that the embodiments of the present application do not limit the manner in which the at least two items of visual media content are separately processed to obtain the blocks.

In one possible implementation, the visual media content includes visual media content in two expression formats, multi-view video and point cloud. Processing the visual media content in at least one expression format to obtain at least one type of isomorphic block includes: projecting the acquired multi-view video and removing redundancy, connecting the non-repeated pixels into video patches, and stitching the video patches into multi-view video blocks; and performing parallel projection on the acquired point cloud, grouping the connected points in the projection plane into point cloud patches, and stitching the point cloud patches into point cloud blocks.

Specifically, for multi-view video, taking MPEG-I as an example, a limited number of viewpoints are selected as base viewpoints so as to cover the visible range of the scene as fully as possible. The base viewpoints are transmitted as complete images, and the pixels of the remaining non-base viewpoints that are redundant with the base viewpoints are removed, so that only the effective information that is not expressed repeatedly is retained. The effective information is then extracted as sub-block images and reorganized together with the base viewpoint images into a larger strip-shaped image, which is called a multi-view video block.

In some embodiments, the visual media content is media content presented simultaneously in the same three-dimensional space. In some embodiments, the visual media content is media content presented at different times in the same three-dimensional space. In some embodiments, the visual media content may also be media content in different three-dimensional spaces. That is, the embodiments of the present application place no specific limitation on the at least two items of visual media content.

Step 602: Stitch the at least one type of isomorphic block to obtain at least one stitched image and stitched image information, where the stitched image information includes a first syntax element, the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first syntax element, the heterogeneous hybrid stitched image includes at least two types of isomorphic blocks, and the isomorphic stitched image includes one type of isomorphic block.

In some embodiments, stitching the at least one type of isomorphic block to obtain at least one stitched image and stitched image information includes: heterogeneously stitching isomorphic blocks in at least two expression formats to generate a heterogeneous hybrid stitched image and stitched image information; and isomorphically stitching isomorphic blocks in the same expression format to generate an isomorphic stitched image and stitched image information.

Exemplarily, the at least one type of isomorphic block includes isomorphic blocks in a first expression format and isomorphic blocks in a second expression format. The method specifically includes one of the following. First, the isomorphic blocks in the first expression format are isomorphically stitched to obtain a first isomorphic stitched image and stitched image information, and the isomorphic blocks in the second expression format are isomorphically stitched to obtain a second isomorphic stitched image and stitched image information. Alternatively, the isomorphic blocks in the first expression format and the isomorphic blocks in the second expression format are heterogeneously stitched to obtain a heterogeneous hybrid stitched image and stitched image information. Alternatively, the isomorphic blocks in the first expression format are isomorphically stitched to obtain a first isomorphic stitched image and stitched image information, and the isomorphic blocks in the first expression format and the isomorphic blocks in the second expression format are heterogeneously stitched to obtain a heterogeneous hybrid stitched image and stitched image information. Alternatively, the isomorphic blocks in the second expression format are isomorphically stitched to obtain a second isomorphic stitched image and stitched image information, and the isomorphic blocks in the first expression format and the isomorphic blocks in the second expression format are heterogeneously stitched to obtain a heterogeneous hybrid stitched image and stitched image information.

That is, an isomorphic stitched image may include one isomorphic block or multiple isomorphic blocks of the same expression format, and a heterogeneous hybrid stitched image includes at least two isomorphic blocks in at least two expression formats. In the embodiments of the present application, the first expression format is one of multi-view video, point cloud, and mesh; the second expression format is one of multi-view video, point cloud, and mesh; and the first expression format and the second expression format are different. As shown in Figure 7, multi-view video block 1, multi-view video block 2, and point cloud block 1 are stitched into a heterogeneous hybrid stitched image.

Exemplarily, the first expression format is multi-view video and the second expression format is point cloud. Stitching the at least one type of isomorphic block to obtain at least one stitched image and stitched image information includes: stitching some of the multi-view video blocks and some of the point cloud blocks into a heterogeneous hybrid stitched image; stitching the other multi-view video blocks into a multi-view stitched image; and stitching the other point cloud blocks into a point cloud stitched image.

The stitched image information includes the first syntax element, which indicates whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image.

In some embodiments, determining according to the first syntax element that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image includes: if the first syntax element is a first preset value, determining that the stitched image is a heterogeneous hybrid stitched image including isomorphic blocks in the first expression format and the second expression format, where the first expression format and the second expression format are different; if the first syntax element is a second preset value, determining that the stitched image is an isomorphic stitched image including isomorphic blocks in the first expression format; and if the first syntax element is a third preset value, determining that the stitched image is an isomorphic stitched image including isomorphic blocks in the second expression format. That is, the stitched image type is indicated by setting the first syntax element to different values. Further, the first syntax element may also be set to other values to indicate that the stitched image is an isomorphic stitched image including isomorphic blocks in another expression format, or a heterogeneous stitched image including isomorphic blocks in at least two other expression formats.
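The single-valued signalling above can be sketched as a lookup table. This is a minimal illustration, not the normative syntax: the description does not fix the preset values, so the constants 0, 1 and 2 and all names below are assumptions.

```python
# Hypothetical preset values; the description leaves the actual numbers open.
FIRST_PRESET = 0   # heterogeneous hybrid: first and second expression formats
SECOND_PRESET = 1  # isomorphic: first expression format only
THIRD_PRESET = 2   # isomorphic: second expression format only

STITCHED_IMAGE_TYPES = {
    FIRST_PRESET: ("heterogeneous_hybrid", ("first", "second")),
    SECOND_PRESET: ("isomorphic", ("first",)),
    THIRD_PRESET: ("isomorphic", ("second",)),
}


def stitched_image_type(first_syntax_element):
    """Map a first-syntax-element value to (kind, expression formats)."""
    try:
        return STITCHED_IMAGE_TYPES[first_syntax_element]
    except KeyError:
        # Other values may signal further formats; they are not modelled here.
        raise ValueError(f"unmodelled value: {first_syntax_element}") from None
```

Supporting further expression formats would only add entries to the table, which matches the statement that other values may be assigned.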

In some embodiments, the first syntax element includes at least two sub-syntax elements. Exemplarily, the first syntax element includes a first sub-syntax element and a second sub-syntax element, and the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first sub-syntax element and the second sub-syntax element. Determining according to the first syntax element that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image includes: if the first sub-syntax element is a fourth preset value, determining that the stitched image includes isomorphic blocks in the first expression format; and if the second sub-syntax element is a fifth preset value, determining that the stitched image includes isomorphic blocks in the second expression format.

It can be understood that when the first sub-syntax element is the fourth preset value, the stitched image is determined to include isomorphic blocks in the first expression format, that is, the stitched image is an isomorphic stitched image including isomorphic blocks in the first expression format; when the second sub-syntax element is the fifth preset value, the stitched image is determined to include isomorphic blocks in the second expression format, that is, the stitched image is an isomorphic stitched image including isomorphic blocks in the second expression format; and when the first sub-syntax element is the fourth preset value and the second sub-syntax element is the fifth preset value, the stitched image is determined to include isomorphic blocks in both the first and the second expression format, that is, the stitched image is a heterogeneous hybrid stitched image including isomorphic blocks in the first expression format and the second expression format.

In some embodiments, the method further includes: if the first sub-syntax element is a sixth preset value, determining that the stitched image does not include isomorphic blocks in the first expression format; and if the second sub-syntax element is a seventh preset value, determining that the stitched image does not include isomorphic blocks in the second expression format.

Specifically, if the first sub-syntax element is the fourth preset value and the second sub-syntax element is the fifth preset value, the stitched image is determined to be a heterogeneous hybrid stitched image including isomorphic blocks in the first expression format and isomorphic blocks in the second expression format; if the first sub-syntax element is the fourth preset value and the second sub-syntax element is the seventh preset value, the stitched image is determined to be an isomorphic stitched image including isomorphic blocks in the first expression format; and if the first sub-syntax element is the sixth preset value and the second sub-syntax element is the fifth preset value, the stitched image is determined to be an isomorphic stitched image including isomorphic blocks in the second expression format.
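The three combinations above amount to a small decision table over two flags. The sketch below assumes, purely for illustration, that the fourth and fifth preset values are 1 ("present") and the sixth and seventh are 0 ("absent"); the function name and return labels are likewise hypothetical.

```python
# Assumed values: 1 for the fourth/fifth presets ("present"),
# 0 for the sixth/seventh presets ("absent"). The text leaves them open.
PRESENT, ABSENT = 1, 0


def classify_by_sub_elements(first_sub, second_sub):
    """Classify a stitched image from the two sub-syntax elements."""
    has_first = first_sub == PRESENT    # blocks in the first expression format
    has_second = second_sub == PRESENT  # blocks in the second expression format
    if has_first and has_second:
        return "heterogeneous_hybrid"
    if has_first:
        return "isomorphic_first_format"
    if has_second:
        return "isomorphic_second_format"
    # The description does not define the case where both are absent.
    raise ValueError("neither expression format is signalled")
```

With one flag per expression format, adding a third format means adding a third flag, whereas a single multi-valued element needs one value per combination.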

That is, the expression formats of the isomorphic blocks in a stitched image can also be determined from the values of the two sub-syntax elements. Further, when a stitched image includes more expression formats, multiple syntax elements can be used to indicate the expression formats of the isomorphic blocks in the stitched image. For example, three syntax elements are set when there are three expression formats, and four syntax elements are set when there are four expression formats. Alternatively, a single syntax element may take multiple values to represent the multiple expression formats.

In some embodiments, the first syntax element is located in a parameter set of the code stream. Exemplarily, the parameter set of the code stream may be V3C_VPS, and the first syntax element may be ptl_profile_toolset_idc in V3C_VPS.

In some embodiments, the stitched image sequence parameter set corresponding to the stitched image includes the first syntax element. Exemplarily, the stitched image sequence parameter set corresponding to the stitched image includes the first sub-syntax element and the second sub-syntax element. Exemplarily, the first sub-syntax element is asps_vpcc_extension_present_flag in the stitched image sequence parameter set, and the second sub-syntax element is asps_miv_extension_present_flag.

That is, the first syntax element may be located in a parameter set of the code stream, so that the decoder can parse the type of each stitched image earlier. The first syntax element may also be located in the stitched image sequence parameter set corresponding to each stitched image, in which case the decoder obtains it when parsing each stitched image and then determines the stitched image type.

In some embodiments, the heterogeneous hybrid stitched image of the embodiments of the present application includes at least one of the following: a single-attribute heterogeneous hybrid stitched image and a multi-attribute heterogeneous hybrid stitched image.

A single-attribute heterogeneous hybrid stitched image is a heterogeneous hybrid stitched image in which all the included isomorphic blocks carry the same attribute information. For example, a single-attribute heterogeneous hybrid stitched image includes only isomorphic blocks of attribute information, such as only multi-view video texture blocks and point cloud texture blocks. As another example, a single-attribute heterogeneous hybrid stitched image includes only isomorphic blocks of geometry information, such as only multi-view video geometry blocks and point cloud geometry blocks.

A multi-attribute heterogeneous hybrid stitched image is a heterogeneous hybrid stitched image in which at least two of the included isomorphic blocks carry different attribute information. For example, a multi-attribute heterogeneous hybrid stitched image includes both isomorphic blocks of attribute information and isomorphic blocks of geometry information. As an example, blocks under any one attribute or any two attributes of at least two of point cloud, multi-view video, and mesh may be stitched into one image to obtain a heterogeneous hybrid stitched image. The present application does not limit this.

In some embodiments, single-attribute isomorphic blocks in the first expression format and single-attribute isomorphic blocks in the second expression format are stitched to obtain a heterogeneous hybrid stitched image. The first expression format and the second expression format are each any one of multi-view video, point cloud, and mesh; the first expression format differs from the second expression format; and the attribute information of the first expression format and of the second expression format is the same.

The single-attribute isomorphic blocks of multi-view video include at least one of a multi-view video texture block, a multi-view video geometry block, and the like.

The single-attribute isomorphic blocks of a point cloud include at least one of a point cloud texture block, a point cloud geometry block, a point cloud occupancy block, and the like.

The single-attribute isomorphic blocks of a mesh include at least one of a mesh texture block and a mesh geometry block.

For example, at least two of a multi-view video geometry block, a point cloud geometry block, and a mesh geometry block are stitched into one image to obtain a heterogeneous hybrid stitched image, which is called a single-attribute heterogeneous hybrid stitched image. As another example, at least two of a multi-view video texture block, a point cloud texture block, and a mesh texture block are stitched into one image to obtain a heterogeneous hybrid stitched image, which is likewise called a single-attribute heterogeneous hybrid stitched image.

In some embodiments, multi-attribute isomorphic blocks in the first expression format and multi-attribute isomorphic blocks in the second expression format are stitched to obtain a heterogeneous hybrid stitched image. The first expression format and the second expression format are each any one of multi-view video, point cloud, and mesh; the first expression format differs from the second expression format; and the attribute information of the first expression format and of the second expression format is not entirely the same.

For example, a multi-view video texture block is stitched into one image with at least one of a point cloud geometry block and a mesh geometry block to obtain a heterogeneous hybrid stitched image. As another example, a multi-view video geometry block is stitched into one image with at least one of a point cloud texture block and a mesh texture block. As another example, a point cloud texture block is stitched into one image with at least one of a multi-view video geometry block and a mesh geometry block. As another example, a point cloud geometry block is stitched into one image with at least one of a multi-view video texture block and a mesh texture block. As another example, a point cloud geometry block, a multi-view video texture block, and a multi-view video texture block are stitched into one image. As another example, a point cloud geometry block, a point cloud texture block, a multi-view video texture block, and a multi-view video texture block are stitched into one image. Each heterogeneous hybrid stitched image obtained in these examples is called a multi-attribute heterogeneous hybrid stitched image.

The stitching method is described in detail below, taking multi-view video as the first expression format and point cloud as the second expression format.

Assume that the multi-view video blocks include a multi-view video texture block and a multi-view video geometry block, and that the point cloud blocks include a point cloud texture block, a point cloud geometry block, and a point cloud occupancy block. The heterogeneous stitching described above may then include, but is not limited to, the following two methods:

Method 1: Stitch the multi-view video texture block, the multi-view video geometry block, the point cloud texture block, the point cloud geometry block, and the point cloud occupancy block all into a single heterogeneous hybrid stitched image.

Method 2: Stitch the multi-view video texture block, the multi-view video geometry block, the point cloud texture block, the point cloud geometry block, and the point cloud occupancy block according to a preset heterogeneous stitching scheme to obtain M heterogeneous hybrid stitched images, where M is a positive integer greater than or equal to 1.

Method 2 may include at least the following examples. Example 1: the multi-view video texture block and the point cloud texture block are stitched to obtain a heterogeneous hybrid texture stitched image, the multi-view video geometry block and the point cloud geometry block are stitched to obtain a heterogeneous hybrid geometry stitched image, and the point cloud occupancy block alone is used as one hybrid stitched image. Example 2: the multi-view video texture block and the point cloud texture block are stitched to obtain a heterogeneous hybrid texture stitched image, and the multi-view video geometry block, the point cloud geometry block, and the point cloud occupancy block are stitched to obtain a heterogeneous hybrid geometry and occupancy stitched image. Example 3: the multi-view video texture block, the point cloud texture block, and the point cloud occupancy block are stitched to obtain one sub heterogeneous hybrid stitched image, and the multi-view video geometry block and the point cloud geometry block are stitched to obtain another sub heterogeneous hybrid stitched image. Further, after the M heterogeneous hybrid stitched images are obtained, each of them may be video encoded to obtain the video compression sub-stream.
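The three examples of Method 2 differ only in which attributes share a stitched image, so they can be sketched as grouping rules. The block labels, the `LAYOUTS` table, and `group_blocks` are hypothetical names used for illustration; actual stitching would also place the blocks spatially.

```python
# Each block is labelled (expression_format, attribute); labels are illustrative.
BLOCKS = [
    ("multi_view_video", "texture"),
    ("multi_view_video", "geometry"),
    ("point_cloud", "texture"),
    ("point_cloud", "geometry"),
    ("point_cloud", "occupancy"),
]

# One set of attributes per stitched image, following Examples 1 to 3.
LAYOUTS = {
    "example_1": [{"texture"}, {"geometry"}, {"occupancy"}],
    "example_2": [{"texture"}, {"geometry", "occupancy"}],
    "example_3": [{"texture", "occupancy"}, {"geometry"}],
}


def group_blocks(blocks, layout):
    """Return one list of blocks per stitched image described by `layout`."""
    return [[b for b in blocks if b[1] in attrs] for attrs in layout]


# M is simply the number of stitched images the chosen layout produces.
m_example_1 = len(group_blocks(BLOCKS, LAYOUTS["example_1"]))
```

Method 1 corresponds to a layout with a single set containing all three attributes, which gives M = 1.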

In some embodiments, the isomorphic stitched image of the embodiments of the present application includes at least one of the following: a single-attribute isomorphic stitched image and a multi-attribute isomorphic stitched image. In some embodiments, first-attribute isomorphic blocks in the first expression format are stitched to obtain an isomorphic stitched image. Alternatively, first-attribute isomorphic blocks and second-attribute isomorphic blocks in the first expression format are stitched to obtain an isomorphic stitched image.

A single-attribute isomorphic stitched image is an isomorphic stitched image in which all the included isomorphic blocks have the same expression format and the same attribute information. For example, a single-attribute isomorphic stitched image includes only isomorphic blocks of attribute information in the first expression format, such as only multi-view video texture blocks or only point cloud texture blocks. As another example, a single-attribute isomorphic stitched image includes only isomorphic blocks of geometry information, such as only multi-view video geometry blocks or only point cloud geometry blocks.

A multi-attribute isomorphic stitched image is an isomorphic stitched image in which at least two of the included isomorphic blocks have the same expression format but different attribute information. For example, a multi-attribute isomorphic stitched image includes both isomorphic blocks of attribute information and isomorphic blocks of geometry information. As an example, a multi-attribute isomorphic stitched image includes a multi-view video texture block and a multi-view video geometry block. As another example, a multi-attribute isomorphic stitched image includes point cloud geometry blocks and point cloud texture blocks; as shown in Figure 8, a multi-attribute isomorphic stitched image includes point cloud texture block 1, point cloud geometry block 1, and point cloud geometry block 2.

In some embodiments, the stitched image information may further include a syntax element according to which the stitched image is determined to be a single-attribute heterogeneous hybrid stitched image, a multi-attribute heterogeneous hybrid stitched image, a single-attribute isomorphic stitched image, or a multi-attribute isomorphic stitched image.
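The four stitched image types follow from two independent questions: how many expression formats the blocks span, and how many attributes. The sketch below shows how an encoder might derive the type before signalling it; the `(expression_format, attribute)` labelling and the returned strings are assumptions, not syntax defined by the description.

```python
def classify_stitched_image(blocks):
    """Classify a stitched image from its (expression_format, attribute) blocks."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("a stitched image needs at least one block")
    formats = {fmt for fmt, _ in blocks}
    attributes = {attr for _, attr in blocks}
    # More than one expression format makes the image heterogeneous hybrid.
    structure = "heterogeneous_hybrid" if len(formats) > 1 else "isomorphic"
    # More than one attribute makes the image multi-attribute.
    attribute_kind = "multi_attribute" if len(attributes) > 1 else "single_attribute"
    return f"{attribute_kind}_{structure}"


# The Figure 8 example: one texture block and two geometry blocks of a point cloud.
figure_8 = [("point_cloud", "texture"), ("point_cloud", "geometry"),
            ("point_cloud", "geometry")]
figure_8_type = classify_stitched_image(figure_8)
```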

Step 603: Encode the at least one stitched image and the stitched image information to obtain a code stream.

In some embodiments, the code stream includes a video compression sub-stream and a stitched image information sub-stream. Encoding the at least one stitched image and the stitched image information to obtain the code stream includes: encoding the at least one stitched image to obtain the video compression sub-stream; encoding the stitched image information of the at least one stitched image to obtain the stitched image information sub-stream; and combining the video compression sub-stream and the stitched image information sub-stream into the code stream. In this way, heterogeneous source formats such as video, point cloud, and mesh are supported in the same compressed code stream, and multi-view video stitched images, point cloud video stitched images, mesh stitched images, and heterogeneous hybrid stitched images can exist in the compressed code stream at the same time. This reduces the number of two-dimensional video encoders, such as HEVC, VVC, AVC, and AVS, that need to be invoked, lowers the implementation cost, and improves ease of use.
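The final combining step can be illustrated with a toy container. This is explicitly not the V3C multiplexing format: the 4-byte big-endian length prefix and the `mux`/`demux` names are assumptions chosen only to show that two independently encoded sub-streams can be merged into one code stream and recovered.

```python
import struct


def mux(video_substream: bytes, info_substream: bytes) -> bytes:
    """Concatenate two sub-streams, each prefixed by a 4-byte big-endian length."""
    out = b""
    for payload in (video_substream, info_substream):
        out += struct.pack(">I", len(payload)) + payload
    return out


def demux(code_stream: bytes):
    """Split a code stream produced by mux() back into its sub-streams."""
    parts, offset = [], 0
    while offset < len(code_stream):
        (length,) = struct.unpack_from(">I", code_stream, offset)
        offset += 4
        parts.append(code_stream[offset:offset + length])
        offset += length
    return tuple(parts)


code_stream = mux(b"video-bytes", b"info-bytes")
```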

In some embodiments, when the stitched image is determined, according to the first syntax element, to be a heterogeneous hybrid stitched image, the stitched image information further includes a second syntax element, and the expression format of the i-th block in the stitched image is determined according to the second syntax element. Writing the second syntax element into the code stream at the encoding end helps improve decoding accuracy at the decoding end, and enables the V3C standard to support visual media content in different expression formats, such as multi-view video and point cloud, in the same compressed code stream.

In some embodiments, determining the expression format of the i-th block in the stitched image according to the second syntax element includes: if the second syntax element is an eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is a ninth preset value, determining that the expression format of the i-th block is the second expression format.

Specifically, the expression format type of the i-th block in the stitched image can be indicated by setting the second syntax element to different values. Taking point cloud as the first expression format and multi-view video as the second expression format as an example: if the i-th block is a point cloud block, the second syntax element is set to the eighth preset value; if the i-th block is a multi-view video block, the second syntax element is set to the ninth preset value. The embodiments of the present application do not limit the specific values of the eighth and ninth preset values. Optionally, the eighth preset value is 0. Optionally, the ninth preset value is 1.

In some embodiments, encoding the at least one stitched image and the stitched image information to obtain the code stream includes: if the expression format of the i-th block is the first expression format, determining that the patches in the i-th block are encoded using the coding standard corresponding to the first expression format, to obtain the code stream corresponding to the visual media content in the first expression format; if the expression format of the i-th block is the second expression format, determining that the patches in the i-th block are encoded using the coding standard corresponding to the second expression format, to obtain the code stream corresponding to the visual media content in the second expression format.

In some embodiments, the second syntax element is located in the block data unit header of the i-th block of the stitched image. In some embodiments, the second syntax element may also be located in the patch data unit (patch_data_unit). Exemplarily, when the second syntax element (ath_toolset_type) is known to be 1, it is determined that the current patch is encoded using the multi-view video coding standard. When the second syntax element (ath_toolset_type) is known to be 0, it is determined that the current patch is encoded using the point cloud coding standard.
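The per-block selection described above can be sketched as a small lookup. The values 0 and 1 follow the optional example values given for ath_toolset_type; the function name and the returned labels are hypothetical.

```python
# Example values from the text: 0 -> point cloud, 1 -> multi-view video.
POINT_CLOUD = 0
MULTI_VIEW_VIDEO = 1


def coding_standard_for_block(ath_toolset_type: int) -> str:
    """Pick the coding standard for the patches of one block."""
    if ath_toolset_type == POINT_CLOUD:
        return "point cloud coding standard"
    if ath_toolset_type == MULTI_VIEW_VIDEO:
        return "multi-view video coding standard"
    # The text gives no meaning for other values in this example.
    raise ValueError(f"unsupported ath_toolset_type: {ath_toolset_type}")
```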

In some embodiments, encoding the at least one stitched image and the stitched image information to obtain the code stream includes: invoking a video encoder to encode the at least one stitched image to obtain the video compression sub-stream.

In the embodiments of the present application, to reduce the number of encoders and the encoding cost, at least two pieces of visual media content are first processed (that is, packed) separately at encoding time to obtain multiple isomorphic blocks. Next, at least two isomorphic blocks whose expression formats are not all the same are stitched into a heterogeneous hybrid stitched image, at least one isomorphic block with an identical expression format is stitched into an isomorphic stitched image, and the heterogeneous hybrid stitched image and the isomorphic stitched image are encoded to obtain the video compression sub-stream. This makes the coding and decoding method applicable to scenarios with visual media content in multiple expression formats, which widens its scope of application. Moreover, because isomorphic blocks of different expression formats are stitched into one heterogeneous hybrid stitched image, the video encoder can be invoked only once during encoding. This reduces the number of two-dimensional video encoders (HEVC, VVC, AVC, AVS, etc.) that need to be invoked, lowers the encoding cost and improves usability.

In the embodiments of the present application, the video encoder used to encode the heterogeneous hybrid stitched image and the isomorphic stitched image into the video compression sub-stream may be the video encoder shown in Figure 2A above. That is, the embodiments of the present application treat a heterogeneous hybrid stitched image or an isomorphic stitched image as one frame. The frame is first partitioned into blocks; intra-frame or inter-frame prediction then produces a predicted value for each coding block; the predicted value is subtracted from the original value to obtain the residual; and the residual is transformed and quantized to obtain the video compression sub-stream.
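The prediction and residual steps above can be illustrated with a deliberately reduced sketch. It covers only residual computation and uniform quantization on a one-dimensional list of samples; the transform and entropy coding stages of a real encoder such as the one in Figure 2A are omitted, and the function names and the quantization step are assumptions for illustration.

```python
import math


def encode_block(original, predicted, qstep):
    """Residual = original - predicted, then uniform quantization."""
    residual = [o - p for o, p in zip(original, predicted)]
    # Round half up so the result does not depend on banker's rounding.
    return [math.floor(r / qstep + 0.5) for r in residual]


def reconstruct_block(levels, predicted, qstep):
    """Decoder side: dequantize the levels and add the prediction back."""
    return [p + level * qstep for p, level in zip(predicted, levels)]
```

In this sketch the reconstruction error per sample is bounded by half the quantization step, which is the usual trade-off between rate and distortion in a lossy residual coder.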

In the embodiments of the present application, the stitched image information corresponding to each stitched image is generated at the same time as the at least one stitched image. The stitched image information is encoded to obtain the stitched image information sub-stream. The stitched image information includes a first syntax element indicating the type of the stitched image and a second syntax element indicating the expression format of each isomorphic block in the stitched image. The embodiments of the present application do not limit how the stitched image information is encoded; for example, it may be compressed with a conventional data compression method such as fixed-length coding or variable-length coding.

Finally, the video compression sub-stream and the stitched image information sub-stream are written into the same code stream to obtain the final code stream. In other words, the embodiments of the present application support both heterogeneous source formats (video, point cloud, mesh, etc.) and homogeneous source formats in the same compressed code stream.

In some embodiments, the method further includes: encoding the parameter set of the code stream to obtain a parameter set sub-stream. Specifically, the encoding end combines the video compression sub-stream, the stitched image information sub-stream and the parameter set sub-stream into the code stream. The parameter set sub-stream includes a third syntax element, and it is determined according to the third syntax element that the code stream includes the code stream corresponding to visual media content in at least one expression format. That is, the encoding end sends the third syntax element to indicate whether the code stream simultaneously contains visual media content in at least two expression formats. Exemplarily, when the third syntax element indicates that the code stream includes the code stream corresponding to visual media content in one expression format, the encoding end can be understood to process the visual media content in that expression format to obtain one kind of isomorphic block, and to stitch those isomorphic blocks into an isomorphic stitched image. When the third syntax element indicates that the code stream includes the code streams corresponding to visual media content in at least two expression formats, the encoding end can be understood to process the visual media content in the at least two expression formats to obtain at least two kinds of isomorphic blocks, and to stitch those isomorphic blocks into isomorphic stitched images and/or heterogeneous hybrid stitched images.

Exemplarily, when the third syntax element indicates that the code stream includes the code streams corresponding to visual media content in at least two expression formats, the method includes: performing isomorphic stitching on the isomorphic blocks of the first expression format to obtain a first isomorphic stitched image, and performing isomorphic stitching on the isomorphic blocks of the second expression format to obtain a second isomorphic stitched image; or performing heterogeneous stitching on the isomorphic blocks of the first expression format and the isomorphic blocks of the second expression format to obtain a heterogeneous hybrid stitched image; or performing isomorphic stitching on the isomorphic blocks of the first expression format to obtain a first isomorphic stitched image, and performing heterogeneous stitching on the isomorphic blocks of the first expression format and the isomorphic blocks of the second expression format to obtain a heterogeneous hybrid stitched image; or performing isomorphic stitching on the isomorphic blocks of the second expression format to obtain a second isomorphic stitched image, and performing heterogeneous stitching on the isomorphic blocks of the first expression format and the isomorphic blocks of the second expression format to obtain a heterogeneous hybrid stitched image.

In some embodiments, the third syntax element is set to different values to indicate that the code stream includes the code stream corresponding to visual media content in at least one expression format. That is, particular preset values of the third syntax element indicate that the code stream includes the code streams corresponding to visual media content in one or more expression formats.

Exemplarily, determining according to the third syntax element that the code stream includes the code stream corresponding to visual media content in at least one expression format includes: if the third syntax element is a first value, determining that the code stream includes both the code stream corresponding to visual media content in the first expression format and the code stream corresponding to visual media content in the second expression format; if the third syntax element is a second value, determining that the code stream includes the code stream corresponding to visual media content in the first expression format; if the third syntax element is a third value, determining that the code stream includes the code stream corresponding to visual media content in the second expression format.

Exemplarily, the parameter set of the code stream may be V3C_VPS, and the third syntax element may be ptl_profile_toolset_idc in V3C_VPS.

Exemplarily, consider a code stream that may carry multi-view video and point cloud content. When the third syntax element is set to the first value, the first value indicates that the code stream simultaneously contains a multi-view video code stream and a point cloud code stream. As a specific example, ptl_profile_toolset_idc = X with X equal to 128, 129, 130, 132, 133 or 134 indicates that the current code stream contains both point cloud and multi-view video code streams. As another example, when the third syntax element is set to the second value, the second value indicates that the code stream contains only a point cloud code stream. As a specific example, ptl_profile_toolset_idc = X with X equal to 0 or 1 indicates that the current code stream contains only a point cloud code stream. As yet another example, when the third syntax element is set to the third value, the third value indicates that the code stream contains only a multi-view video code stream. As a specific example, ptl_profile_toolset_idc = X with X equal to 64, 65 or 66 indicates that the current code stream contains only a multi-view video code stream. It should be understood that the above first, second and third values are examples only, and the embodiments of the present application are not limited thereto.
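The value ranges given above can be summarized in a small lookup sketch. The value sets are taken from the example values in the text; the function name and the format labels are hypothetical.

```python
# Example ptl_profile_toolset_idc values quoted in the text.
POINT_CLOUD_ONLY = frozenset({0, 1})
MULTI_VIEW_ONLY = frozenset({64, 65, 66})
MIXED = frozenset({128, 129, 130, 132, 133, 134})


def formats_in_stream(ptl_profile_toolset_idc: int) -> set:
    """Return the expression formats signalled for the code stream."""
    if ptl_profile_toolset_idc in MIXED:
        return {"point_cloud", "multi_view_video"}
    if ptl_profile_toolset_idc in POINT_CLOUD_ONLY:
        return {"point_cloud"}
    if ptl_profile_toolset_idc in MULTI_VIEW_ONLY:
        return {"multi_view_video"}
    # All other values are described as reserved.
    raise ValueError(f"reserved ptl_profile_toolset_idc: {ptl_profile_toolset_idc}")
```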

In this example, the V3C_VPS of the existing V3C standard can be reused, with ptl_profile_toolset_idc preconfigured with values such as 0/1, 64/65/66 and 128/129/130/132/133/134 to indicate the types of code stream contained in the current code stream. When encoding visual media content, the embodiments of the present application add values of the third syntax element to the parameter set to indicate which expression formats of visual media content have corresponding code streams in the code stream. This helps improve decoding accuracy at the decoding end and enables the V3C standard to support visual media content in one or more expression formats, such as multi-view video, point cloud and mesh, in the same compressed code stream.

Table 3 shows an example of the available toolset profile components. It lists the toolset profile components defined for V3C together with the values of their identifying syntax elements, such as ptl_profile_toolset_idc and ptc_one_v3c_frame_only_flag; these definitions may apply to this document only. The syntax element ptl_profile_toolset_idc provides the main definition of a toolset profile, while additional syntax elements such as ptc_one_v3c_frame_only_flag can specify additional features or restrictions of a defined profile. ptc_one_v3c_frame_only_flag may be used to support only a single V3C frame. Note that the values 2..63, 67..127, 131 and 135..255 of ptl_profile_toolset_idc are reserved and currently undefined; a standards organization may specify them in a future standard. The profile types defined in Table 3 may be Dynamic or Static.

Table 3: Available toolset profile components

ptl_profile_toolset_idc | ptc_one_v3c_frame_only_flag | Toolset profile component | Type
0 | 0 | V-PCC Basic /* Specified in Annex H */ | Dynamic
0 | 1 | V-PCC Basic Still /* Specified in Annex H */ | Static
1 | 0 | V-PCC Extended /* Specified in Annex H */ | Dynamic
1 | 1 | V-PCC Extended Still /* Specified in Annex H */ | Static
64 | 0 | MIV Main /* Specified in ISO/IEC 23090-12 */ | Dynamic
65 | 0 | MIV Extended /* Specified in ISO/IEC 23090-12 */ | Dynamic
66 | 0 | MIV Geometry Absent /* Specified in ISO/IEC 23090-12 */ | Dynamic
128 | 0 | V-PCC Basic & MIV Main | Dynamic
129 | 0 | V-PCC Basic & MIV Extended | Dynamic
130 | 0 | V-PCC Basic & MIV Geometry Absent | Dynamic
132 | 0 | V-PCC Extended & MIV Main | Dynamic
133 | 0 | V-PCC Extended & MIV Extended | Dynamic
134 | 0 | V-PCC Extended & MIV Geometry Absent | Dynamic
2..63, 67..127, 131, 135..255 | - | Reserved | -

In some embodiments, the parameter set of the code stream further includes the first syntax element, which indicates the type of each stitched image, specifically whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image; the first syntax element is written into the parameter set of the code stream. Exemplarily, the first syntax element (vps_toolset_type) is added to V3C_VPS, and vps_toolset_type is used to tell whether each stitched image and its corresponding V3C unit belong to a point cloud stitched image, a multi-view video stitched image, or a heterogeneous hybrid stitched image of point cloud plus multi-view video. To remain compatible with the previous standard, the following new syntax and semantics are introduced, together with constraints on the old semantics.

Exemplarily, if the first syntax element is a first preset value, the stitched image is determined to be a heterogeneous hybrid stitched image that includes isomorphic blocks of a first expression format and of a second expression format, where the first expression format and the second expression format are different; if the first syntax element is a second preset value, the stitched image is determined to be an isomorphic stitched image that includes isomorphic blocks of the first expression format; if the first syntax element is a third preset value, the stitched image is determined to be an isomorphic stitched image that includes isomorphic blocks of the second expression format.

Exemplarily, taking multi-view video as the first expression format and point cloud as the second expression format: when the first syntax element is the first preset value, it indicates that the stitched image is a heterogeneous hybrid stitched image including point cloud blocks and multi-view video blocks; when the first syntax element is the second preset value, it indicates that the stitched image is an isomorphic stitched image including multi-view video blocks (which may be called a multi-view video stitched image); when the first syntax element is the third preset value, it indicates that the stitched image is an isomorphic stitched image including point cloud blocks (which may be called a point cloud stitched image).

Exemplarily, after the third syntax element is obtained as ptl_profile_toolset_idc = 128/129/130/132/133/134, the first syntax element (vps_toolset_type) must be parsed from the VPS for each stitched image, and the value X of vps_toolset_type is examined. X equal to 1 indicates that the stitched image contains only multi-view video blocks and shall satisfy the requirements of the multi-view coding method; X equal to 2 indicates that the stitched image contains only point cloud blocks and shall satisfy the requirements of the point cloud coding method; X equal to 3 indicates that the stitched image contains both multi-view video blocks and point cloud blocks and shall satisfy the requirements of both the multi-view and the point cloud coding methods. It should be understood that the above values of the first syntax element are examples only, and the embodiments of the present application are not limited thereto.
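The interpretation of vps_toolset_type described above can be sketched as a table lookup. The values 1, 2 and 3 are the example values from the text; the dictionary, function name and method labels are hypothetical.

```python
# Example vps_toolset_type values from the text.
REQUIRED_METHODS = {
    1: ("multi_view",),                # only multi-view video blocks
    2: ("point_cloud",),               # only point cloud blocks
    3: ("multi_view", "point_cloud"),  # heterogeneous hybrid stitched image
}


def required_coding_methods(vps_toolset_type: int) -> tuple:
    """Return the coding methods whose requirements the stitched image must meet."""
    try:
        return REQUIRED_METHODS[vps_toolset_type]
    except KeyError:
        raise ValueError(f"unsupported vps_toolset_type: {vps_toolset_type}") from None
```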

Table 4 shows the general V3C parameter set syntax. A new syntax element, vps_toolset_type, is added to the V3C parameter set; specifically, vps_toolset_type[ j ] indicates the type of the stitched image with index j. With vps_toolset_type added to the V3C parameter set, the decoding end can obtain vps_toolset_type from the V3C parameter set while decoding the code stream and use it to quickly tell whether each stitched image and its corresponding V3C unit belong to point cloud, multi-view video, or point cloud plus multi-view video, and thus which coding method requirements the stitched image shall satisfy.

Table 4: General V3C parameter set syntax

v3c_parameter_set( numBytesInV3CPayload ) { | Descriptor
  profile_tier_level( ) |
  vps_v3c_parameter_set_id | u(4)
  vps_reserved_zero_8bits | u(8)
  vps_atlas_count_minus1 | u(6)
  for( k = 0; k < vps_atlas_count_minus1 + 1; k++ ) { |
    vps_atlas_id[ k ] | u(6)
    j = vps_atlas_id[ k ] |
    if( ptl_profile_toolset_idc == 128 || ptl_profile_toolset_idc == 129 || ptl_profile_toolset_idc == 130 || ptl_profile_toolset_idc == 132 || ptl_profile_toolset_idc == 133 || ptl_profile_toolset_idc == 134 ) { |
      vps_toolset_type[ j ] | u(3)
    } |
    vps_frame_width[ j ] | ue(v)
    vps_frame_height[ j ] | ue(v)
    vps_map_count_minus1[ j ] | u(4)
    if( vps_map_count_minus1[ j ] > 0 ) |
      vps_multiple_map_streams_present_flag[ j ] | u(1)
    vps_map_absolute_coding_enabled_flag[ j ][ 0 ] = 1 |
    vps_map_predictor_index_diff[ j ][ 0 ] = 0 |
    for( i = 1; i <= vps_map_count_minus1[ j ]; i++ ) { |
      if( vps_multiple_map_streams_present_flag[ j ] ) |
        vps_map_absolute_coding_enabled_flag[ j ][ i ] | u(1)
      else |
        vps_map_absolute_coding_enabled_flag[ j ][ i ] = 1 |
      if( vps_map_absolute_coding_enabled_flag[ j ][ i ] == 0 ) { |
        vps_map_predictor_index_diff[ j ][ i ] | ue(v)
      } |
    } |
    vps_auxiliary_video_present_flag[ j ] | u(1)
    vps_occupancy_video_present_flag[ j ] | u(1)
    vps_geometry_video_present_flag[ j ] | u(1)
    vps_attribute_video_present_flag[ j ] | u(1)
    if( vps_occupancy_video_present_flag[ j ] ) |
      occupancy_information( j ) |
    if( vps_geometry_video_present_flag[ j ] ) |
      geometry_information( j ) |
    if( vps_attribute_video_present_flag[ j ] ) |
      attribute_information( j ) |
  } |
  vps_extension_present_flag | u(1)
  if( vps_extension_present_flag ) { |
    vps_packing_information_present_flag | u(1)
    vps_miv_extension_present_flag | u(1)
    vps_extension_6bits | u(6)
  } |
  if( vps_packing_information_present_flag ) { |
    for( k = 0; k <= vps_atlas_count_minus1; k++ ) { |
      j = vps_atlas_id[ k ] |
      vps_packed_video_present_flag[ j ] |
      if( vps_packed_video_present_flag[ j ] ) |
        packing_information( j ) |
    } |
  } |
  if( vps_miv_extension_present_flag ) |
    vps_miv_extension( ) /* Specified in ISO/IEC 23090-12 [1] */ |
  if( vps_extension_6bits ) { |
    vps_extension_length_minus1 | ue(v)
    for( j = 0; j < vps_extension_length_minus1 + 1; j++ ) { |
      vps_extension_data_byte | u(8)
    } |
  } |
  byte_alignment( ) |
} |
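To make the conditional presence of vps_toolset_type in Table 4 concrete, the sketch below parses only the first fields of one per-stitched-image entry: vps_atlas_id, the optional vps_toolset_type, vps_frame_width and vps_frame_height. It assumes that u(n) denotes an n-bit unsigned integer read most significant bit first and that ue(v) denotes an unsigned Exp-Golomb code, as is conventional in video coding syntax tables. The `BitReader` class and `parse_atlas_entry_head` function are hypothetical helpers, not part of any library, and the rest of the parameter set (including profile_tier_level) is not handled.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def u(self, n: int) -> int:
        """Read an n-bit unsigned integer."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def ue(self) -> int:
        """Read an unsigned Exp-Golomb coded integer."""
        leading_zeros = 0
        while self.u(1) == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.u(leading_zeros)


# Values of ptl_profile_toolset_idc for which Table 4 signals vps_toolset_type.
MIXED_TOOLSETS = frozenset({128, 129, 130, 132, 133, 134})


def parse_atlas_entry_head(reader: BitReader, ptl_profile_toolset_idc: int) -> dict:
    """Parse the leading fields of one entry of the per-stitched-image loop."""
    entry = {"vps_atlas_id": reader.u(6)}
    if ptl_profile_toolset_idc in MIXED_TOOLSETS:
        entry["vps_toolset_type"] = reader.u(3)
    entry["vps_frame_width"] = reader.ue()
    entry["vps_frame_height"] = reader.ue()
    return entry
```

Because vps_toolset_type is read only for the mixed toolset values, a stream that signals a legacy value such as 0 or 64 is parsed exactly as before, which is how the table keeps compatibility with the earlier syntax.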

In some embodiments, the stitched image sequence parameter set corresponding to the stitched image includes the first syntax element. Exemplarily, the stitched image sequence parameter set corresponding to the stitched image includes a first sub-syntax element and a second sub-syntax element. The first sub-syntax element and the second sub-syntax element together indicate the type of the stitched image, that is, whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image.

Exemplarily, if the first sub-syntax element is a fourth preset value and the second sub-syntax element is a fifth preset value, the stitched image is determined to be a heterogeneous hybrid stitched image that includes isomorphic blocks of the first expression format and isomorphic blocks of the second expression format; if the first sub-syntax element is the fourth preset value and the second sub-syntax element is a seventh preset value, the stitched image is determined to be an isomorphic stitched image that includes isomorphic blocks of the first expression format; if the first sub-syntax element is a sixth preset value and the second sub-syntax element is the fifth preset value, the stitched image is determined to be an isomorphic stitched image that includes isomorphic blocks of the second expression format.

Optionally, the first sub-syntax element is asps_vpcc_extension_present_flag in the stitched image sequence parameter set, and the second sub-syntax element is asps_miv_extension_present_flag.

For example, asps_miv_extension_present_flag and asps_vpcc_extension_present_flag may be carried in the NAL-ASPS of the V3C_AD code stream.

In the embodiments of the present application, the first sub-syntax element and the second sub-syntax element are set to specific values to indicate whether the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image.

Exemplarily, after the third syntax element is obtained as ptl_profile_toolset_idc = 128/129/130/132/133/134, asps_vpcc_extension_present_flag = X and asps_miv_extension_present_flag = Y are obtained from the stitched image information of the stitched image. X equal to 0 and Y equal to 1 indicates that the stitched image contains only multi-view video blocks and shall satisfy the requirements of the multi-view coding method; X equal to 1 and Y equal to 0 indicates that the stitched image contains only point cloud blocks and shall satisfy the requirements of the point cloud coding method; X equal to 1 and Y equal to 1 indicates that the stitched image contains both multi-view video blocks and point cloud blocks and shall satisfy the requirements of both the multi-view and the point cloud coding methods. It should be understood that the above values of the fourth to seventh preset values are examples only, and the embodiments of the present application are not limited thereto.
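The flag combinations described above can be sketched as follows. The flag values follow the example in the text; the function name and the returned labels are hypothetical, and the combination in which both flags are 0 is rejected because the text does not assign it a meaning.

```python
def stitched_image_type_from_asps(asps_vpcc_extension_present_flag: int,
                                  asps_miv_extension_present_flag: int) -> str:
    """Derive the stitched image type from the two ASPS extension flags."""
    vpcc = bool(asps_vpcc_extension_present_flag)
    miv = bool(asps_miv_extension_present_flag)
    if vpcc and miv:
        return "heterogeneous hybrid (point cloud + multi-view video)"
    if vpcc:
        return "point cloud"
    if miv:
        return "multi-view video"
    # Not covered by the example in the text.
    raise ValueError("both extension flags are 0; type is not defined here")
```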

Table 5 shows the general atlas sequence parameter set RBSP syntax. The stitched image sequence parameter set can be understood as the stitched image information. The encoding end uses the syntax elements asps_vpcc_extension_present_flag and asps_miv_extension_present_flag in the stitched image sequence parameter set to indicate the type of the stitched image. The decoding end, by parsing the code stream, can obtain these two syntax elements from the parameter set of the stitched image and, according to their values, determine whether the stitched image belongs to point cloud, multi-view, or point cloud + multi-view, and thereby determine which coding method requirements the stitched image shall meet.

Table 5 General atlas sequence parameter set RBSP syntax
atlas_sequence_parameter_set_rbsp( ) { | Descriptor
  asps_atlas_sequence_parameter_set_id | ue(v)
  asps_frame_width | ue(v)
  asps_frame_height | ue(v)
  asps_geometry_3d_bit_depth_minus1 | u(5)
  if( vps_toolset_type == 3 )
    asps_miv_geometry_3d_bit_depth_minus1 | u(5)
  asps_geometry_2d_bit_depth_minus1 | u(5)
  asps_log2_max_atlas_frame_order_cnt_lsb_minus4 | ue(v)
  asps_max_dec_atlas_frame_buffering_minus1 | ue(v)
  asps_long_term_ref_atlas_frames_flag | u(1)
  asps_num_ref_atlas_frame_lists_in_asps | ue(v)
  for( i = 0; i < asps_num_ref_atlas_frame_lists_in_asps; i++ )
    ref_list_struct( i )
  asps_use_eight_orientations_flag | u(1)
  asps_extended_projection_enabled_flag | u(1)
  if( asps_extended_projection_enabled_flag )
    asps_max_number_projections_minus1 | ue(v)
  asps_normal_axis_limits_quantization_enabled_flag | u(1)
  asps_normal_axis_max_delta_value_enabled_flag | u(1)
  asps_patch_precedence_order_flag | u(1)
  asps_log2_patch_packing_block_size | u(3)
  asps_patch_size_quantizer_present_flag | u(1)
  asps_map_count_minus1 | u(4)
  asps_pixel_deinterleaving_enabled_flag | u(1)
  if( asps_pixel_deinterleaving_enabled_flag )
    for( j = 0; j <= asps_map_count_minus1; j++ )
      asps_map_pixel_deinterleaving_flag[ j ] | u(1)
  asps_raw_patch_enabled_flag | u(1)
  asps_eom_patch_enabled_flag | u(1)
  if( asps_eom_patch_enabled_flag && asps_map_count_minus1 == 0 )
    asps_eom_fix_bit_count_minus1 | u(4)
  if( asps_raw_patch_enabled_flag || asps_eom_patch_enabled_flag )
    asps_auxiliary_video_enabled_flag | u(1)
  asps_plr_enabled_flag | u(1)
  if( asps_plr_enabled_flag )
    asps_plr_information( asps_map_count_minus1 )
  asps_vui_parameters_present_flag | u(1)
  if( asps_vui_parameters_present_flag )
    vui_parameters( )
  asps_extension_present_flag | u(1)
  if( asps_extension_present_flag ) {
    asps_vpcc_extension_present_flag | u(1)
    asps_miv_extension_present_flag | u(1)
    asps_extension_6bits | u(6)
  }
  if( asps_vpcc_extension_present_flag )
    asps_vpcc_extension( ) /* Specified in Annex H */
  if( asps_miv_extension_present_flag )
    asps_miv_extension( ) /* Specified in ISO/IEC 23090-12 */
  if( asps_extension_6bits )
    while( more_rbsp_data( ) )
      asps_extension_data_flag | u(1)
  rbsp_trailing_bits( )
}
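As a hedged illustration of the extension portion of the syntax in Table 5, the following Python sketch reads asps_extension_present_flag and, when it is set, the two extension flags and asps_extension_6bits. The helper name `parse_asps_extension_flags` and the use of a string of '0'/'1' characters as input are hypothetical conveniences; treating absent flags as 0 is an assumption of this sketch, not a statement taken from the text.

```python
def parse_asps_extension_flags(bits: str) -> dict:
    """Parse the ASPS extension flags from a string of '0'/'1' characters.

    `bits` is assumed to start at asps_extension_present_flag.
    """
    pos = 0

    def u(n: int) -> int:
        # Read n bits as an unsigned integer, most significant bit first.
        nonlocal pos
        value = int(bits[pos:pos + n], 2)
        pos += n
        return value

    flags = {
        "asps_extension_present_flag": u(1),
        # Assumption: flags that are not present are treated as 0.
        "asps_vpcc_extension_present_flag": 0,
        "asps_miv_extension_present_flag": 0,
        "asps_extension_6bits": 0,
    }
    if flags["asps_extension_present_flag"]:
        flags["asps_vpcc_extension_present_flag"] = u(1)
        flags["asps_miv_extension_present_flag"] = u(1)
        flags["asps_extension_6bits"] = u(6)
    return flags


if __name__ == "__main__":
    # Extension present, both V-PCC and MIV extensions signalled.
    print(parse_asps_extension_flags("111000000"))
```

With both extension flags equal to 1, the stitched image would be treated as containing point cloud and multi-view content at the same time.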

After the above syntax is implemented, multi-view stitched images and point cloud stitched images can coexist under one VPS. It is further necessary to support the case where one stitched image contains multiple isomorphic blocks, each of which is either a collection of multi-view patches or a collection of point cloud patches. Because the existing technology allows only one kind of isomorphic block within a single stitched image, the embodiments of the present application add a second syntax element, according to which the expression format of an isomorphic block in a stitched image is determined to be multi-view video, point cloud, mesh, or the like.

In some embodiments, the stitched image block data unit header (atlas tile header) of the i-th block includes a second syntax element. When the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, the stitched image information further includes the second syntax element, and the expression format of the i-th block in the stitched image is determined according to the second syntax element.

When encoding a heterogeneous hybrid stitched image, the embodiments of the present application set the second syntax element to indicate the expression format of the i-th block in the heterogeneous hybrid stitched image. This helps improve decoding accuracy at the decoding end and enables the V3C standard to support visual media content in different expression formats, such as multi-view video and point cloud, within the same compressed code stream. Exemplarily, the second syntax element may be ath_toolset_type in the stitched image block data unit header (atlas_tile_header).

Exemplarily, when the second syntax element is the eighth preset value, the expression format of the i-th block is determined to be the first expression format; when the second syntax element is the ninth preset value, the expression format of the i-th block is determined to be the second expression format.

Exemplarily, when the stitched image is a heterogeneous hybrid stitched image, atlas_tile_header is parsed from the code stream of ACL NAL unit type in the AD unit, ath_toolset_type is parsed from it, and ath_toolset_type = X is evaluated: X equal to 0 indicates that the current block is a point cloud block; X equal to 1 indicates that the current block is a multi-view video block.

Table 6 shows the stitched image block data unit header syntax (atlas tile header syntax). The encoding end adds a new syntax element, ath_toolset_type, to the atlas tile header syntax to indicate the block type. The decoding end, by parsing the code stream, can obtain ath_toolset_type from the atlas tile header and thereby determine whether the current block is to be decoded as multi-view video or as point cloud.

Table 6 Atlas tile header syntax
atlas_tile_header( ) { | Descriptor
  if( nal_unit_type >= NAL_BLA_W_LP && nal_unit_type <= NAL_RSV_IRAP_ACL_29 )
    ath_no_output_of_prior_atlas_frames_flag | u(1)
  ath_atlas_frame_parameter_set_id | ue(v)
  ath_atlas_adaptation_parameter_set_id | ue(v)
  ath_id | u(v)
  if( vps_toolset_type == 3 )
    ath_toolset_type | u(1)
  tileID = ath_id
  ath_type | ue(v)
  if( afps_output_flag_present_flag )
    ath_atlas_output_flag | u(1)
  ath_atlas_frm_order_cnt_lsb | u(v)
  if( asps_num_ref_atlas_frame_lists_in_asps > 0 )
    ath_ref_atlas_frame_list_asps_flag | u(1)
  if( ath_ref_atlas_frame_list_asps_flag == 0 )
    ref_list_struct( asps_num_ref_atlas_frame_lists_in_asps )
  else if( asps_num_ref_atlas_frame_lists_in_asps > 1 )
    ath_ref_atlas_frame_list_idx | u(v)
  for( j = 0; j < NumLtrAtlasFrmEntries[ RlsIdx ]; j++ ) {
    ath_additional_afoc_lsb_present_flag[ j ] | u(1)
    if( ath_additional_afoc_lsb_present_flag[ j ] )
      ath_additional_afoc_lsb_val[ j ] | u(v)
  }
  if( ath_type != SKIP_TILE ) {
    if( asps_normal_axis_limits_quantization_enabled_flag ) {
      ath_pos_min_d_quantizer | u(5)
      if( asps_normal_axis_max_delta_value_enabled_flag )
        ath_pos_delta_max_d_quantizer | u(5)
    }
    if( asps_patch_size_quantizer_present_flag ) {
      ath_patch_size_x_info_quantizer | u(3)
      ath_patch_size_y_info_quantizer | u(3)
    }
    if( afps_raw_3d_offset_bit_count_explicit_mode_flag )
      ath_raw_3d_offset_axis_bit_count_minus1 | u(v)
    if( ath_type == P_TILE && num_ref_entries[ RlsIdx ] > 1 ) {
      ath_num_ref_idx_active_override_flag | u(1)
      if( ath_num_ref_idx_active_override_flag )
        ath_num_ref_idx_active_minus1 | ue(v)
    }
  }
  byte_alignment( )
}
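The following Python sketch illustrates, under stated assumptions, how a decoding end might read the first fields of the atlas tile header of Table 6, including the conditionally present ath_toolset_type. It is not a conforming parser: the class `BitReader`, the function `parse_tile_header_prefix`, and the parameters `ath_id_bits` (the bit length of the u(v)-coded ath_id, which the tables here do not give) and `is_irap` (standing in for the nal_unit_type range check) are all hypothetical.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def u(self, n: int) -> int:
        """Read n bits as an unsigned integer (u(n) descriptor)."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def ue(self) -> int:
        """Read an unsigned Exp-Golomb code (ue(v) descriptor)."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)


def parse_tile_header_prefix(reader: BitReader, vps_toolset_type: int,
                             ath_id_bits: int, is_irap: bool = False) -> dict:
    """Parse the header fields up to and including ath_type."""
    hdr = {}
    if is_irap:
        hdr["ath_no_output_of_prior_atlas_frames_flag"] = reader.u(1)
    hdr["ath_atlas_frame_parameter_set_id"] = reader.ue()
    hdr["ath_atlas_adaptation_parameter_set_id"] = reader.ue()
    hdr["ath_id"] = reader.u(ath_id_bits)
    if vps_toolset_type == 3:
        # Present only for heterogeneous hybrid stitched images.
        hdr["ath_toolset_type"] = reader.u(1)
    hdr["ath_type"] = reader.ue()
    return hdr


if __name__ == "__main__":
    # Bits 1 | 1 | 01 | 1 | 011 packed into the single byte 0xDB.
    print(parse_tile_header_prefix(BitReader(bytes([0xDB])), 3, ath_id_bits=2))
```

When vps_toolset_type is not 3, the same bits are interpreted without ath_toolset_type, which is why the flag must be read conditionally before ath_type.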

Optionally, the second syntax element may also be located in the patch data unit (patch_data_unit). Exemplarily, when the second syntax element (ath_toolset_type) is known to be 1, it is determined that the current patch is coded using the multi-view video coding method. When the second syntax element (ath_toolset_type) is known to be 0, it is determined that the current patch is coded using the point cloud coding method. The patch data unit syntax may be as shown in Table 7:

Table 7 Patch data unit syntax
patch_data_unit( tileID, patchIdx ) { | Descriptor
  pdu_2d_pos_x[ tileID ][ patchIdx ] | ue(v)
  pdu_2d_pos_y[ tileID ][ patchIdx ] | ue(v)
  pdu_2d_size_x_minus1[ tileID ][ patchIdx ] | ue(v)
  pdu_2d_size_y_minus1[ tileID ][ patchIdx ] | ue(v)
  pdu_3d_offset_u[ tileID ][ patchIdx ] | u(v)
  pdu_3d_offset_v[ tileID ][ patchIdx ] | u(v)
  pdu_3d_offset_d[ tileID ][ patchIdx ] | u(v)
  if( asps_normal_axis_max_delta_value_enabled_flag )
    pdu_3d_range_d[ tileID ][ patchIdx ] | u(v)
  pdu_projection_id[ tileID ][ patchIdx ] | u(v)
  pdu_orientation_index[ tileID ][ patchIdx ] | u(v)
  if( afps_lod_mode_enabled_flag ) {
    pdu_lod_enabled_flag[ tileID ][ patchIdx ] | u(1)
    if( pdu_lod_enabled_flag[ tileID ][ patchIdx ] ) {
      pdu_lod_scale_x_minus1[ tileID ][ patchIdx ] | ue(v)
      pdu_lod_scale_y_idc[ tileID ][ patchIdx ] | ue(v)
    }
  }
  if( asps_plr_enabled_flag && ath_toolset_type == 0 )
    plr_data( tileID, patchIdx )
  if( asps_miv_extension_present_flag && ath_toolset_type == 1 )
    pdu_miv_extension( tileID, patchIdx ) /* Specified in ISO/IEC 23090-12 */
}
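The two conditions at the end of the patch data unit syntax can be illustrated with a short Python sketch. It only reports which trailing structure would be parsed for a patch and does not parse any bits; the function name `patch_extensions_to_parse` is hypothetical.

```python
from typing import List


def patch_extensions_to_parse(asps_plr_enabled_flag: int,
                              asps_miv_extension_present_flag: int,
                              ath_toolset_type: int) -> List[str]:
    """Return the trailing structures of patch_data_unit( ) to be parsed."""
    structures = []
    # Point cloud block (ath_toolset_type == 0): PLR data, if enabled.
    if asps_plr_enabled_flag and ath_toolset_type == 0:
        structures.append("plr_data")
    # Multi-view video block (ath_toolset_type == 1): MIV patch extension.
    if asps_miv_extension_present_flag and ath_toolset_type == 1:
        structures.append("pdu_miv_extension")
    return structures


if __name__ == "__main__":
    print(patch_extensions_to_parse(1, 1, 0))
    print(patch_extensions_to_parse(1, 1, 1))
```

Because both ASPS flags may be 1 in a heterogeneous hybrid stitched image, ath_toolset_type is what keeps a point cloud patch from being parsed with the MIV extension and vice versa.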

In some embodiments, vps_toolset_type[ j ] equal to 1 indicates that the values of the syntax elements of the toolset profile component of the stitched image (atlas) with index j shall comply with the values specified in ISO/IEC 23090-12 Table A-1-1 (i.e., Table 8);

vps_toolset_type[ j ] equal to 2 indicates that the values of the syntax elements of the toolset profile component of the atlas with index j shall comply with the values specified in ISO/IEC 23090-5 Table H-3, except for vps_extension_present_flag, vps_packing_information_present_flag, vps_miv_extension_present_flag, vuh_unit_type and vps_atlas_count_minus1, whose values shall comply with the values specified in ISO/IEC 23090-12 Table A-1-1;

vps_toolset_type[ j ] equal to 3 indicates that the values of the syntax elements of the toolset profile component of the atlas with index j shall comply with the values specified in the extended ISO/IEC 23090-12 Table A-1-2 (i.e., Table 9-1 and Table 9-2). Table A-1-1 and Table A-1-2 respectively give, for the integrated code stream, the constraints on the syntax related to the toolset profile component for multi-view content and for heterogeneous data.

vps_toolset_type[ j ] equal to 0 or to any value from 4 to 7 is reserved for future use by ISO/IEC and shall not appear in bitstreams conforming to this version of this document. Decoders conforming to this version of this document shall ignore such reserved unit types.
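The semantics of vps_toolset_type[ j ] given above can be summarised, purely as an illustration, by the following Python sketch. The names `CONSTRAINT_TABLES` and `constraint_table_for` are hypothetical, and rejecting values outside 0 to 7 is an assumption of this sketch; the text only states that 0 and 4 to 7 are reserved.

```python
from typing import Optional

# Constraint table that applies for each non-reserved vps_toolset_type value.
CONSTRAINT_TABLES = {
    1: "ISO/IEC 23090-12 Table A-1-1",
    2: "ISO/IEC 23090-5 Table H-3 (with the listed exceptions from "
       "ISO/IEC 23090-12 Table A-1-1)",
    3: "extended ISO/IEC 23090-12 Table A-1-2",
}


def constraint_table_for(vps_toolset_type: int) -> Optional[str]:
    """Return the applicable constraint table, or None for reserved values."""
    if not 0 <= vps_toolset_type <= 7:
        # Assumption: values outside 0..7 cannot be signalled.
        raise ValueError(f"unexpected vps_toolset_type: {vps_toolset_type}")
    # Reserved values (0 and 4..7) have no entry and are ignored by the caller.
    return CONSTRAINT_TABLES.get(vps_toolset_type)


if __name__ == "__main__":
    for value in range(8):
        print(value, "->", constraint_table_for(value))
```

Returning None for reserved values mirrors the requirement that a conforming decoder ignore them rather than fail.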

ath_toolset_type indicates that the values of the syntax elements of the toolset profile component of the current tile shall comply with the values specified in the extended Table A-1 of ISO/IEC 23090-12. The value of ath_toolset_type shall be in the range of 0 to 1, inclusive.

Table 8 Allowable values of syntax element values for the MIV toolset profile
Syntax element | MIV Main | MIV Extended | MIV Extended Restricted Geometry | MIV Geometry Absent
vuh_unit_type | V3C_VPS, V3C_AD, V3C_GVD, V3C_AVD, or V3C_CAD | V3C_VPS, V3C_AD, V3C_OVD, V3C_GVD, V3C_AVD, V3C_PVD, or V3C_CAD | V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, or V3C_CAD | V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, or V3C_CAD
ptl_profile_toolset_idc | 64 | 65 | 66 (three values given)
ptl_profile_reconstruction_idc | 255 | 255 | 255 (three values given)
ptc_restricted_geometry_flag | N/A | 0 | 1 | N/A
vps_miv_extension_present_flag | 1 | 1 | 1 | 1
vps_packing_information_present_flag | 0 | 0, 1 | 0, 1 | 0, 1
vps_map_count_minus1[ atlasID ] | 0 | 0 | 0 | 0
vps_auxiliary_video_present_flag[ atlasID ] | 0 | 0 | 0 | 0
vps_occupancy_video_present_flag[ atlasID ] | 0 | 0, 1 | 0 | 0
vps_geometry_video_present_flag[ atlasID ] | 1 | 0, 1 | 0 | 0
vps_packed_video_present_flag[ atlasID ] | 0 | 0, 1 | 0, 1 | 0, 1
vme_embedded_occupancy_enabled_flag | 1 | 0, 1 | 0 | 0
oi_occupancy_MSB_align_flag[ atlasID ] | 0 | 0 | 0 | 0
gi_geometry_MSB_align_flag[ atlasID ] | 0 | 0 | 0 | 0
ai_attribute_count[ atlasID ] | 0, 1 | 0, 1, 2 | 2 | 0, 1
ai_attribute_type_id[ atlasID ][ attrIdx ] | ATTR_TEXTURE | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE
ai_attribute_dimension_minus1[ atlasID ][ attrTextureIdx ] | 2 | 2 | 2 | 2
ai_attribute_dimension_minus1[ atlasID ][ attrTransparencyIdx ] | N/A | 0 | 0 | N/A
ai_attribute_dimension_partitions_minus1[ atlasID ][ attrIdx ] | 0 | 0 | 0 | 0
ai_attribute_MSB_align_flag[ atlasID ][ attrIdx ] | 0 | 0 | 0 | 0
pin_attribute_count[ atlasID ] | N/A | 0, 1, 2 | 2 | N/A
pin_attribute_type_id[ atlasID ][ attrIdx ] | N/A | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE, ATTR_TRANSPARENCY | N/A
pin_attribute_dimension_minus1[ atlasID ][ attrTextureIdx ] | N/A | 2 | 2 | N/A
pin_attribute_dimension_minus1[ atlasID ][ attrTransparencyIdx ] | N/A | 0 | 0 | N/A
pin_attribute_dimension_partitions_minus1[ atlasID ][ attrIdx ] | N/A | 0 | 0 | N/A
pin_attribute_MSB_align_flag[ atlasID ][ attrIdx ] | N/A | 0 | 0 | N/A
asps_long_term_ref_atlas_frames_flag | 0 | 0 | 0 | 0
asps_pixel_deinterleaving_enabled_flag | 0 | 0 | 0 | 0
asps_patch_precedence_order_flag | 0 | 0 | 0 | 0
asps_raw_patch_enabled_flag | 0 | 0 | 0 | 0
asps_eom_patch_enabled_flag | 0 | 0 | 0 | 0
asps_plr_enabled_flag | 0 | 0 | 0 | 0
asps_vpcc_extension_present_flag | 0 | 0 | 0 | 0
asme_patch_constant_depth_flag | 0 | 0, 1 | 1 | 0, 1
vps_geometry_video_present_flag[ atlasID ] || pin_geometry_present_flag[ atlasID ] || asme_patch_constant_depth_flag | N/A | 1 | 1 | N/A
afps_lod_mode_enabled_flag | 0 | 0 | 0 | 0
afps_raw_3d_pos_bit_count_explicit_mode_flag | 0 | 0 | 0 | 0
afti_single_tile_in_atlas_frame_flag | 1 | 0, 1 | 0, 1 | 0, 1
dq_quantization_law[ viewID ] | 0 | 0 | 0 | 0
ath_type | I_TILE | I_TILE | I_TILE | I_TILE
atdu_patch_mode[ tileID ][ patchIdx ] | I_INTRA | I_INTRA | I_INTRA | I_INTRA
asps_atlas_sequence_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
afps_atlas_frame_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
afps_atlas_sequence_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
aaps_atlas_adaptation_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
ath_atlas_frame_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
ath_atlas_adaptation_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive

Table 9-1 Allowable values of syntax element values for the MIV toolset profile (Extended)
Syntax element | MIV Main Mixed V-PCC Basic | MIV Extended Mixed V-PCC Basic | MIV Extended Restricted Geometry Mixed V-PCC Basic | MIV Geometry Absent Mixed V-PCC Basic
vuh_unit_type | V3C_VPS, V3C_AD, V3C_GVD, V3C_AVD, V3C_OVD, or V3C_CAD | V3C_VPS, V3C_AD, V3C_OVD, V3C_GVD, V3C_AVD, V3C_PVD, or V3C_CAD | V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C_CAD | V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C_CAD
ptl_profile_toolset_idc | 128 | 129 | 130 (three values given)
ptl_profile_reconstruction_idc | - | - | - (three values given)
ptc_restricted_geometry_flag | N/A | 0 | 1 | N/A
ptc_no_eight_orientations_constraint_flag | 1 | 1 | 1 | 1
vps_toolset_type[ atlasID ] | 1, 2 | 1 / 2 / 3 | 1 / 2 / 3 | 1 / 2 / 3
ath_toolset_type | N/A | N/A / N/A / 0, 1 | N/A / N/A / 0, 1 | N/A / N/A / 0, 1
vps_miv_extension_present_flag | 1 | 1 | 1 | 1
vps_packing_information_present_flag | 0 | 0, 1 | 0, 1 | 0, 1
vps_map_count_minus1[ atlasID ] | 0, 1 | 0, 1 | 0, 1 | 0, 1
vps_auxiliary_video_present_flag[ atlasID ] | 0, 1 | 0, 1 | 0, 1 | 0, 1
vps_occupancy_video_present_flag[ atlasID ] | 0, 1 | 0, 1 | 0, 1 | 0, 1
vps_geometry_video_present_flag[ atlasID ] | 1 | 0, 1 | 0 | 0
vps_packed_video_present_flag[ atlasID ] | 0 | 0, 1 | 0, 1 | 0, 1
vme_embedded_occupancy_enabled_flag | 1 | 0, 1 | 0, 1 | 0, 1
oi_occupancy_MSB_align_flag[ atlasID ] | 0 | 0 | 0 | 0
gi_geometry_MSB_align_flag[ atlasID ] | 0 | 0 | 0 | 0
ai_attribute_count[ atlasID ] | 0, 1 | 0, 1, 2 | 2 | 0, 1
ai_attribute_type_id[ atlasID ][ attrIdx ] | ATTR_TEXTURE | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE
ai_attribute_dimension_minus1[ atlasID ][ attrTextureIdx ] | 2 | 2 | 2 | 2
ai_attribute_dimension_minus1[ atlasID ][ attrTransparencyIdx ] | N/A | 0 | 0 | N/A
ai_attribute_dimension_partitions_minus1[ atlasID ][ attrIdx ] | 0 | 0 | 0 | 0
ai_attribute_MSB_align_flag[ atlasID ][ attrIdx ] | 0 | 0 | 0 | 0
pin_attribute_count[ atlasID ] | N/A | 0, 1, 2 | 2 | N/A
pin_attribute_type_id[ atlasID ][ attrIdx ] | N/A | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE, ATTR_TRANSPARENCY | N/A
pin_attribute_dimension_minus1[ atlasID ][ attrTextureIdx ] | N/A | 2 | 2 | N/A
pin_attribute_dimension_minus1[ atlasID ][ attrTransparencyIdx ] | N/A | 0 | 0 | N/A
pin_attribute_dimension_partitions_minus1[ atlasID ][ attrIdx ] | N/A | 0 | 0 | N/A
pin_attribute_MSB_align_flag[ atlasID ][ attrIdx ] | N/A | 0 | 0 | N/A
asps_use_eight_orientations_flag | 0 | 0 | 0 | 0
asps_extended_projection_enabled_flag | 0 | 0 | 0 | 0
asps_long_term_ref_atlas_frames_flag | 0 | 0 | 0 | 0
asps_pixel_deinterleaving_enabled_flag | 0 | 0 | 0 | 0
asps_patch_precedence_order_flag | 0 | 0 | 0 | 0
asps_raw_patch_enabled_flag | 0 | 0 | 0 | 0
asps_eom_patch_enabled_flag | 0 | 0 | 0 | 0
asps_plr_enabled_flag | 0 | 0 | 0 | 0
asps_vpcc_extension_present_flag | 1 | 1 | 1 | 1
asme_patch_constant_depth_flag | 0 | 0, 1 | 1 | 0, 1
vps_geometry_video_present_flag[ atlasID ] || pin_geometry_present_flag[ atlasID ] || asme_patch_constant_depth_flag | N/A | 1 | 1 | N/A
afps_lod_mode_enabled_flag | 0 | 0 | 0 | 0
afps_raw_3d_pos_bit_count_explicit_mode_flag | 0 | 0 | 0 | 0
afti_single_tile_in_atlas_frame_flag | 1 | 0, 1 | 0, 1 | 0, 1
dq_quantization_law[ viewID ] | 0 | 0 | 0 | 0
ath_type | I_TILE | I_TILE | I_TILE | I_TILE
atdu_patch_mode[ tileID ][ patchIdx ] | I_INTRA | I_INTRA | I_INTRA | I_INTRA
asps_atlas_sequence_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
afps_atlas_frame_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
afps_atlas_sequence_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
aaps_atlas_adaptation_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
ath_atlas_frame_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
ath_atlas_adaptation_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive

Table 9-2 Allowable values of syntax element values for the MIV toolset profile (Extended)
Syntax element | MIV Main Mixed V-PCC Extended | MIV Extended Mixed V-PCC Extended | MIV Extended Restricted Geometry Mixed V-PCC Extended | MIV Geometry Absent Mixed V-PCC Extended
vuh_unit_type | V3C_VPS, V3C_AD, V3C_GVD, V3C_AVD, V3C_OVD, or V3C_CAD | V3C_VPS, V3C_AD, V3C_OVD, V3C_GVD, V3C_AVD, V3C_PVD, or V3C_CAD | V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C_CAD | V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C_CAD
ptl_profile_toolset_idc | 132 | 133 | 133 | 134
ptl_profile_reconstruction_idc | - | - | - (three values given)
ptc_restricted_geometry_flag | N/A | 0 | 1 | N/A
ptc_no_eight_orientations_constraint_flag | 0 | 0 | 0 | 0
vps_toolset_type[ atlasID ] | 1, 2 | 1 / 2 / 3 | 1 / 2 / 3 | 1 / 2 / 3
ath_toolset_type | N/A | N/A / N/A / 0, 1 | N/A / N/A / 0, 1 | N/A / N/A / 0, 1
vps_miv_extension_present_flag | 1 | 1 | 1 | 1
vps_packing_information_present_flag | 0 | 0, 1 | 0, 1 | 0, 1
vps_map_count_minus1[ atlasID ] | - | - | - | -
vps_auxiliary_video_present_flag[ atlasID ] | 0, 1 | 0, 1 | 0, 1 | 0, 1
vps_occupancy_video_present_flag[ atlasID ] | 0, 1 | 0, 1 | 0, 1 | 0, 1
vps_geometry_video_present_flag[ atlasID ] | 1 | 0, 1 | 0 | 0
vps_packed_video_present_flag[ atlasID ] | 0 | 0, 1 | 0, 1 | 0, 1
vme_embedded_occupancy_enabled_flag | 1 | 0, 1 | 0, 1 | 0, 1
oi_occupancy_MSB_align_flag[ atlasID ] | 0 | 0 | 0 | 0
gi_geometry_MSB_align_flag[ atlasID ] | 0 | 0 | 0 | 0
ai_attribute_count[ atlasID ] | 0, 1 | 0, 1, 2 | 2 | 0, 1
ai_attribute_type_id[ atlasID ][ attrIdx ] | ATTR_TEXTURE | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE
ai_attribute_dimension_minus1[ atlasID ][ attrTextureIdx ] | 2 | 2 | 2 | 2
ai_attribute_dimension_minus1[ atlasID ][ attrTransparencyIdx ] | N/A | 0 | 0 | N/A
ai_attribute_dimension_partitions_minus1[ atlasID ][ attrIdx ] | 0 | 0 | 0 | 0
ai_attribute_MSB_align_flag[ atlasID ][ attrIdx ] | 0 | 0 | 0 | 0
pin_attribute_count[ atlasID ] | N/A | 0, 1, 2 | 2 | N/A
pin_attribute_type_id[ atlasID ][ attrIdx ] | N/A | ATTR_TEXTURE, ATTR_TRANSPARENCY | ATTR_TEXTURE, ATTR_TRANSPARENCY | N/A
pin_attribute_dimension_minus1[ atlasID ][ attrTextureIdx ] | N/A | 2 | 2 | N/A
pin_attribute_dimension_minus1[ atlasID ][ attrTransparencyIdx ] | N/A | 0 | 0 | N/A
pin_attribute_dimension_partitions_minus1[ atlasID ][ attrIdx ] | N/A | 0 | 0 | N/A
pin_attribute_MSB_align_flag[ atlasID ][ attrIdx ] | N/A | 0 | 0 | N/A
asps_use_eight_orientations_flag | 0, 1 | 0, 1 | 0, 1 | 0, 1
asps_extended_projection_enabled_flag | 0, 1 | 0, 1 | 0, 1 | 0, 1
asps_long_term_ref_atlas_frames_flag | 0 | 0 | 0 | 0
asps_pixel_deinterleaving_enabled_flag | 0 | 0 | 0 | 0
asps_patch_precedence_order_flag | 0 | 0 | 0 | 0
asps_raw_patch_enabled_flag | 0 | 0 | 0 | 0
asps_eom_patch_enabled_flag | 0 | 0 | 0 | 0
asps_plr_enabled_flag | 0, 1 | 0, 1 | 0, 1 | 0, 1
asps_vpcc_extension_present_flag | 1 | 1 | 1 | 1
asme_patch_constant_depth_flag | 0 | 0, 1 | 1 | 0, 1
vps_geometry_video_present_flag[ atlasID ] || pin_geometry_present_flag[ atlasID ] || asme_patch_constant_depth_flag | N/A | 1 | 1 | N/A
afps_lod_mode_enabled_flag | 0 | 0 | 0 | 0
afps_raw_3d_pos_bit_count_explicit_mode_flag | 0 | 0 | 0 | 0
afti_single_tile_in_atlas_frame_flag | 1 | 0, 1 | 0, 1 | 0, 1
dq_quantization_law[ viewID ] | 0 | 0 | 0 | 0
ath_type | I_TILE | I_TILE | I_TILE | I_TILE
atdu_patch_mode[ tileID ][ patchIdx ] | I_INTRA | I_INTRA | I_INTRA | I_INTRA
asps_atlas_sequence_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
afps_atlas_frame_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
afps_atlas_sequence_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
aaps_atlas_adaptation_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
ath_atlas_frame_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
ath_atlas_adaptation_parameter_set_id | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive | 0..63, inclusive
Table 8 Allowable values of syntax element values for the MIV toolset profile Profile name Syntax element MIV Main MIV Extended MIV Extended Restricted Geometry MIV Geometry Absent vuh_unit_type V3C_VPS, V3C_AD, V3C_GVD, V3C_AVD, or V3C_CAD V3C_VPS, V3C_AD, V3C_OVD, V3C_GVD, V3C_AVD, V3C_PVD, or V3C CAD V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, or V3C CAD V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, or V3C CAD ptl_profile_toolset_idc 64 65 66 ptl_profile_reconstruction_idc 255 255 255 ptc_restricted_geometry_flag N/A 0 1 N/A vps_miv_extension_present_flag 1 1 1 1 vps_packing_information_present_flag 0 0, 1 0, 1 0, 1 vps_map_count_minus1[ atlasID ] 0 0 0 0 vps_auxiliary_video_present_flag[ atlasID ] 0 0 0 0 vps_occupancy_video_present_flag[ atlasID ] 0 0, 1 0 0 vps_geometry_video_present_flag[ atlasID ] 1 0, 1 0 0 vps_packed_video_present_flag[ atlasID ] 0 0, 1 0, 1 0, 1 vme_embedded_occupancy_enabled_flag 1 0, 1 0 0 oi_occupancy_MSB_align_flag[ atlasID ] 0 0 0 0 gi_geometry_MSB_align_flag[ atlasID ] 0 0 0 0 ai_attribute_count[ atlasID ] 0, 1 0, 1, 2 2 0, 1 ai_attribute_type_id[ atlasID ][ attrIdx ] ATTR_TEXTURE ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE ai_attribute_dimension_minus1[ atlasID ] [ attrTextureIdx ] 2 2 2 2 ai_attribute_dimension_minus1[ atlasID ] [ attrTransparencyIdx ] N/A 0 0 N/A ai_attribute_dimension_partitions_minus1[ atlasID ] [ attrIdx ] 0 0 0 0 ai_attribute_MSB_align_flag[ atlasID ][ attrIdx ] 0 0 0 0 pin_attribute_count[ atlasID ] N/A 0, 1, 2 2 N/A pin_attribute_type_id[ atlasID ] [ attrIdx ] N/A ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE, ATTR_TRANSPARENCY N/A pin_attribute_dimension_minus1[ atlasID ] [ attrTextureIdx ] N/A 2 2 N/A pin_attribute_dimension_minus1[ atlasID ] [ attrTransparencyIdx ] N/A 0 0 N/A pin_attribute_dimension_partitions_minus1[ atlasID ] [ attrIdx ] N/A 0 0 N/A pin_attribute_MSB_align_flag[ atlasID ][ attrIdx ] N/A 0 0 N/A asps_long_term_ref_atlas_frames_flag 0 0 0 0 
asps_pixel_deinterleaving_enabled_flag 0 0 0 0 asps_patch_precedence_order_flag 0 0 0 0 asps_raw_patch_enabled_flag 0 0 0 0 asps_eom_patch_enabled_flag 0 0 0 0 asps_plr_enabled_flag 0 0 0 0 asps_vpcc_extension_present_flag 0 0 0 0 asme_patch_constant_depth_flag 0 0, 1 1 0, 1 vps_geometry_video_present_flag[ atlasID ] || pin_geometry_present_flag[ atlasID ] || asme_patch_constant_depth_flag -N/A 1 1 N/A afps_lod_mode_enabled_flag 0 0 0 0 afps_raw_3d_pos_bit_count_explicit_mode_flag 0 0 0 0 afti_single_tile_in_atlas_frame_flag 1 0, 1 0, 1 0, 1 dq_quantization_law[viewID] 0 0 0 0 ath_type I_TILE I_TILE I_TILE I_TILE atdu_patch_mode[tileID][patchIdx] I_INTRA I_INTRA I_INTRA I_INTRA asps_atlas_sequence_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive afps_atlas_frame_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive afps_atlas_sequence_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive aaps_atlas_adaptation_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive ath_atlas_frame_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive ath_atlas_adaptation_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive Table 9-1 Allowable values of syntax element values for the MIV toolset profile(Extended) Profile name Syntax element MIV Main Mixed V-PCC Basic MIV Extended Mixed V-PCC Basic MIV Extended Restricted Geometry Mixed V-PCC Basic MIV Geometry Absent Mixed V-PCC Basic vuh_unit_type V3C_VPS, V3C_AD, V3C_GVD, V3C_AVD, V3C_OVD, or V3C_CAD V3C_VPS, V3C_AD, V3C_OVD, V3C_GVD, V3C_AVD, V3C_PVD, or V3C CAD V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C CAD V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C CAD ptl_profile_toolset_idc 128 129 130 ptl_profile_reconstruction_idc - - - ptc_restricted_geometry_flag N/A 0 1 N/A ptc_no_eight_orientations_constraint_flag 1 1 1 
1 vps_toolset_type[ atlasID ] 1,2 1 2 3 1 2 3 1 2 3 ath_toolset_type N/A N/A N/A 0, 1 N/A N/A 0, 1 N/A N/A 0, 1 vps_miv_extension_present_flag 1 1 1 1 vps_packing_information_present_flag 0 0, 1 0, 1 0, 1 vps_map_count_minus1[ atlasID ] 0,1 0,1 0,1 0,1 vps_auxiliary_video_present_flag[ atlasID ] 0,1 0,1 0,1 0,1 vps_occupancy_video_present_flag[ atlasID ] 0,1 0, 1 0,1 0,1 vps_geometry_video_present_flag[ atlasID ] 1 0, 1 0 0 vps_packed_video_present_flag[ atlasID ] 0 0, 1 0, 1 0, 1 vme_embedded_occupancy_enabled_flag 1 0, 1 0,1 0,1 oi_occupancy_MSB_align_flag[ atlasID ] 0 0 0 0 gi_geometry_MSB_align_flag[ atlasID ] 0 0 0 0 ai_attribute_count[ atlasID ] 0, 1 0, 1, 2 2 0, 1 ai_attribute_type_id[ atlasID ][ attrIdx ] ATTR_TEXTURE ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE ai_attribute_dimension_minus1[ atlasID ] [ attrTextureIdx ] 2 2 2 2 ai_attribute_dimension_minus1[ atlasID ] [ attrTransparencyIdx ] N/A 0 0 N/A ai_attribute_dimension_partitions_minus1[ atlasID ] [ attrIdx ] 0 0 0 0 ai_attribute_MSB_align_flag[ atlasID ][ attrIdx ] 0 0 0 0 pin_attribute_count[ atlasID ] N/A 0, 1, 2 2 N/A pin_attribute_type_id[ atlasID ] [ attrIdx ] N/A ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE, ATTR_TRANSPARENCY N/A pin_attribute_dimension_minus1[ atlasID ] [ attrTextureIdx ] N/A 2 2 N/A pin_attribute_dimension_minus1[ atlasID ] [ attrTransparencyIdx ] N/A 0 0 N/A pin_attribute_dimension_partitions_minus1[ atlasID ] [ attrIdx ] N/A 0 0 N/A pin_attribute_MSB_align_flag[ atlasID ][ attrIdx ] N/A 0 0 N/A asps_use_eight_orientations_flag 0 0 0 0 asps_extended_projection_enabled_flag 0 0 0 0 asps_long_term_ref_atlas_frames_flag 0 0 0 0 asps_pixel_deinterleaving_enabled_flag 0 0 0 0 asps_patch_precedence_order_flag 0 0 0 0 asps_raw_patch_enabled_flag 0 0 0 0 asps_eom_patch_enabled_flag 0 0 0 0 asps_plr_enabled_flag 0 0 0 0 asps_vpcc_extension_present_flag 1 1 1 1 asme_patch_constant_depth_flag 0 0, 1 1 0, 1 
vps_geometry_video_present_flag[ atlasID ] || pin_geometry_present_flag[ atlasID ] || asme_patch_constant_depth_flag -N/A 1 1 N/A afps_lod_mode_enabled_flag 0 0 0 0 afps_raw_3d_pos_bit_count_explicit_mode_flag 0 0 0 0 afti_single_tile_in_atlas_frame_flag 1 0, 1 0, 1 0, 1 dq_quantization_law[viewID] 0 0 0 0 ath_type I_TILE I_TILE I_TILE I_TILE atdu_patch_mode[tileID][patchIdx] I_INTRA I_INTRA I_INTRA I_INTRA asps_atlas_sequence_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive afps_atlas_frame_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive afps_atlas_sequence_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive aaps_atlas_adaptation_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive ath_atlas_frame_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive ath_atlas_adaptation_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive Table 9-2 Allowable values of syntax element values for the MIV toolset profile(Extended) Profile name Syntax element MIV Main Mixed V-PCC Extended MIV Extended Mixed V-PCC Extended MIV Extended Restricted Geometry Mixed V-PCC Extended MIV Geometry Absent Mixed V-PCC Extended vuh_unit_type V3C_VPS, V3C_AD, V3C_GVD, V3C_AVD, V3C_OVD, or V3C_CAD V3C_VPS, V3C_AD, V3C_OVD, V3C_GVD, V3C_AVD, V3C_PVD, or V3C CAD V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C CAD V3C_VPS, V3C_AD, V3C_AVD, V3C_PVD, V3C_GVD, V3C_OVD or V3C CAD ptl_profile_toolset_idc 132 133 133 134 ptl_profile_reconstruction_idc - - - ptc_restricted_geometry_flag N/A 0 1 N/A ptc_no_eight_orientations_constraint_flag 0 0 0 0 vps_ toolset_t ype[ atlasID ] 1,2 1 2 3 1 2 3 1 2 3 ath_toolset_type N/A N/A N/A 0, 1 N/A N/A 0, 1 N/A N/A 0, 1 vps_miv_extension_present_flag 1 1 1 1 vps_packing_information_present_flag 0 0, 1 0, 1 0, 1 vps_map_count_minus1[ atlasID ] - - - - 
vps_auxiliary_video_present_flag[ atlasID ] 0,1 0,1 0,1 0,1 vps_occupancy_video_present_flag[ atlasID ] 0,1 0, 1 0,1 0,1 vps_geometry_video_present_flag[ atlasID ] 1 0, 1 0 0 vps_packed_video_present_flag[ atlasID ] 0 0, 1 0, 1 0, 1 vme_embedded_occupancy_enabled_flag 1 0, 1 0,1 0,1 oi_occupancy_MSB_align_flag[ atlasID ] 0 0 0 0 gi_geometry_MSB_align_flag[ atlasID ] 0 0 0 0 ai_attribute_count[ atlasID ] 0, 1 0, 1, 2 2 0, 1 ai_attribute_type_id[ atlasID ][ attrIdx ] ATTR_TEXTURE ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE ai_attribute_dimension_minus1[ atlasID ] [ attrTextureIdx ] 2 2 2 2 ai_attribute_dimension_minus1[ atlasID ] [ attrTransparencyIdx ] N/A 0 0 N/A ai_attribute_dimension_partitions_minus1[ atlasID ] [ attrIdx ] 0 0 0 0 ai_attribute_MSB_align_flag[ atlasID ][ attrIdx ] 0 0 0 0 pin_attribute_count[ atlasID ] N/A 0, 1, 2 2 N/A pin_attribute_type_id[ atlasID ] [ attrIdx ] N/A ATTR_TEXTURE, ATTR_TRANSPARENCY ATTR_TEXTURE, ATTR_TRANSPARENCY N/A pin_attribute_dimension_minus1[ atlasID ] [ attrTextureIdx ] N/A 2 2 N/A pin_attribute_dimension_minus1[ atlasID ] [ attrTransparencyIdx ] N/A 0 0 N/A pin_attribute_dimension_partitions_minus1[ atlasID ] [ attrIdx ] N/A 0 0 N/A pin_attribute_MSB_align_flag[ atlasID ][ attrIdx ] N/A 0 0 N/A asps_use_eight_orientations_flag 0,1 0,1 0,1 0,1 asps_extended_projection_enabled_flag 0,1 0,1 0,1 0,1 asps_long_term_ref_atlas_frames_flag 0 0 0 0 asps_pixel_deinterleaving_enabled_flag 0 0 0 0 asps_patch_precedence_order_flag 0 0 0 0 asps_raw_patch_enabled_flag 0 0 0 0 asps_eom_patch_enabled_flag 0 0 0 0 asps_plr_enabled_flag 0,1 0,1 0,1 0,1 asps_vpcc_extension_present_flag 1 1 1 1 asme_patch_constant_depth_flag 0 0, 1 1 0, 1 vps_geometry_video_present_flag[ atlasID ] || pin_geometry_present_flag[ atlasID ] || asme_patch_constant_depth_flag -N/A 1 1 N/A afps_lod_mode_enabled_flag 0 0 0 0 afps_raw_3d_pos_bit_count_explicit_mode_flag 0 0 0 0 afti_single_tile_in_atlas_frame_flag 1 0, 1 0, 1 0, 1 
dq_quantization_law[viewID] 0 0 0 0 ath_type I_TILE I_TILE I_TILE I_TILE atdu_patch_mode[tileID][patchIdx] I_INTRA I_INTRA I_INTRA I_INTRA asps_atlas_sequence_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive afps_atlas_frame_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive afps_atlas_sequence_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive aaps_atlas_adaptation_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive ath_atlas_frame_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive ath_atlas_adaptation_parameter_set_id 0..63, inclusive 0..63, inclusive 0..63, inclusive 0..63, inclusive

Figure 9 is a schematic diagram of the V3C bitstream structure provided by an embodiment of the present application. The V3C parameter set (V3C_parameter_set()) of V3C_VPS may include ptl_profile_toolset_idc. A ptl_profile_toolset_idc value of 128, 129, 130, 132, 133 or 134 indicates that the current code stream contains both a point cloud code stream (for example V-PCC Basic or V-PCC Extended) and a multi-view video code stream (for example MIV Main, MIV Extended or MIV Geometry Absent).

The V3C parameter set (V3C_parameter_set()) of V3C_VPS may include the first syntax element (vps_toolset_type). When ptl_profile_toolset_idc is 128, 129, 130, 132, 133 or 134, a vps_toolset_type of 1 indicates that the current stitched image contains only multi-view video blocks, a value of 2 indicates that it contains only point cloud blocks, and a value of 3 indicates that it contains both multi-view video blocks and point cloud blocks.
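The interpretation of vps_toolset_type described above can be sketched in a few lines of Python. This is a minimal illustration, not the normative parsing process: the names `MIXED_TOOLSET_IDCS`, `StitchedImageContent` and `classify_stitched_image` are hypothetical, and returning `None` for other profile values is an assumption, since the text only defines vps_toolset_type for the listed ptl_profile_toolset_idc values.

```python
from enum import Enum
from typing import Optional

# Profile values that the text associates with mixed point cloud / multi-view content.
MIXED_TOOLSET_IDCS = frozenset({128, 129, 130, 132, 133, 134})


class StitchedImageContent(Enum):
    """Meaning of vps_toolset_type as described in the text."""
    MULTI_VIEW_ONLY = 1
    POINT_CLOUD_ONLY = 2
    MIXED = 3


def classify_stitched_image(ptl_profile_toolset_idc: int,
                            vps_toolset_type: int) -> Optional[StitchedImageContent]:
    """Interpret vps_toolset_type for one stitched image.

    Returns None when the profile is not one of the mixed profiles
    (assumption: the element is not interpreted in that case).
    Raises ValueError for a vps_toolset_type other than 1, 2 or 3.
    """
    if ptl_profile_toolset_idc not in MIXED_TOOLSET_IDCS:
        return None
    return StitchedImageContent(vps_toolset_type)
```

A decoder could call `classify_stitched_image(133, 3)` once per stitched image and only inspect per-block signalling when the result is `StitchedImageContent.MIXED`.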

Alternatively, the stitched image sequence parameter set (Atlas_sequence_parameter_set_rbsp()) carried in NAL_ASPS in the atlas sub-bitstream (Atlas_sub_bitstream()) of V3C_AD may include asps_vpcc_extension_present_flag and asps_miv_extension_present_flag. When ptl_profile_toolset_idc is 128, 129, 130, 132, 133 or 134, let asps_vpcc_extension_present_flag = X and asps_miv_extension_present_flag = Y. X = 0 and Y = 1 indicates that the stitched image contains only multi-view video blocks; X = 1 and Y = 0 indicates that the stitched image contains only point cloud blocks; X = 1 and Y = 1 indicates that the stitched image contains both multi-view video blocks and point cloud blocks.
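The three flag combinations can be written as a small lookup table. The sketch below is illustrative only: `content_from_asps_flags` is a hypothetical helper, and rejecting the combination X = 0, Y = 0 is an assumption, because the text does not assign a meaning to it.

```python
from typing import Dict, Tuple

# (asps_vpcc_extension_present_flag, asps_miv_extension_present_flag) -> content
_FLAG_TABLE: Dict[Tuple[int, int], str] = {
    (0, 1): "multi-view video blocks only",
    (1, 0): "point cloud blocks only",
    (1, 1): "multi-view video blocks and point cloud blocks",
}


def content_from_asps_flags(vpcc_flag: int, miv_flag: int) -> str:
    """Map the two extension-present flags to the content of the stitched image."""
    try:
        return _FLAG_TABLE[(vpcc_flag, miv_flag)]
    except KeyError:
        # Assumption: combinations not described in the text are treated as errors.
        raise ValueError(f"undefined flag combination: X={vpcc_flag}, Y={miv_flag}") from None
```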

In Atlas_sub_bitstream() of V3C_AD, the ACL NAL unit types (ACL_NAL_unit_type) carry the stitched image information. For example, the stitched image block data unit (atlas_tile_data_unit()) may include ath_toolset_type. If ath_toolset_type is false (that is, 0), the current block is a point cloud block; if atdu_type_flag is true (that is, 1), the current block is a multi-view video block.

Further, the patch information data (patch_information_data) includes a patch data unit (patch_data_unit). When ath_toolset_type is false (that is, 0), the current patch is coded with the point cloud video coding method. When ath_toolset_type is true (that is, 1), the current patch is coded with the multi-view video coding method.

The first syntax element of each stitched image is obtained, and its value determines whether the stitched image contains both point cloud blocks and multi-view video blocks. When both kinds of block are present, the ath_toolset_type of each block in the stitched image must also be obtained to determine the type of that block.
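The two-level lookup (stitched image first, then individual blocks only when needed) might be sketched as follows. The function name `block_formats` and the callback `read_ath_toolset_type` are hypothetical stand-ins for a real bitstream parser; the numeric values follow the vps_toolset_type and ath_toolset_type semantics given in the text.

```python
from typing import Callable, List

MULTI_VIEW = "multi-view video"
POINT_CLOUD = "point cloud"


def _format_from_ath_toolset_type(value: int) -> str:
    """0 -> point cloud block, 1 -> multi-view video block."""
    if value == 0:
        return POINT_CLOUD
    if value == 1:
        return MULTI_VIEW
    raise ValueError(f"unexpected ath_toolset_type: {value}")


def block_formats(vps_toolset_type: int,
                  block_count: int,
                  read_ath_toolset_type: Callable[[int], int]) -> List[str]:
    """Return the format of every block in one stitched image.

    The per-block element is only read when the stitched image is mixed
    (vps_toolset_type == 3); otherwise the format follows from the
    stitched image type alone.
    """
    if vps_toolset_type == 1:
        return [MULTI_VIEW] * block_count
    if vps_toolset_type == 2:
        return [POINT_CLOUD] * block_count
    if vps_toolset_type == 3:
        return [_format_from_ath_toolset_type(read_ath_toolset_type(i))
                for i in range(block_count)]
    raise ValueError(f"unexpected vps_toolset_type: {vps_toolset_type}")
```

Passing the per-block reader as a callback makes the saving visible: for a stitched image of type 1 or 2 the callback is never invoked.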

The encoding method of the present application has been described above from the perspective of the encoding end. The video decoding method provided by the embodiments of the present application is described below from the perspective of the decoding end.

Figure 10 is a schematic flowchart of a decoding method provided by an embodiment of the present application. As shown in Figure 10, the decoding method of this embodiment includes:

Step 1001: Decode the code stream to obtain a stitched image and stitched image information, where the stitched image information includes a first syntax element, and the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first syntax element;

In some embodiments, determining according to the first syntax element that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image includes: if the first syntax element is a first preset value, determining that the stitched image is a heterogeneous hybrid stitched image that includes isomorphic blocks of a first expression format and of a second expression format, where the first expression format and the second expression format are different expression formats; if the first syntax element is a second preset value, determining that the stitched image is an isomorphic stitched image that includes isomorphic blocks of the first expression format; if the first syntax element is a third preset value, determining that the stitched image is an isomorphic stitched image that includes isomorphic blocks of the second expression format.

In some embodiments, the first syntax element includes a first sub-syntax element and a second sub-syntax element, and the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first sub-syntax element and the second sub-syntax element. Accordingly, determining according to the first syntax element that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image includes: if the first sub-syntax element is a fourth preset value, determining that the stitched image includes isomorphic blocks of the first expression format; and/or, if the second sub-syntax element is a fifth preset value, determining that the stitched image includes isomorphic blocks of the second expression format; where the first expression format and the second expression format are different expression formats.

In some embodiments, determining according to the first syntax element that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image further includes: if the first sub-syntax element is a sixth preset value, determining that the stitched image does not include isomorphic blocks of the first expression format; if the second sub-syntax element is a seventh preset value, determining that the stitched image does not include isomorphic blocks of the second expression format.

Specifically, determining according to the first syntax element that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image includes: if the first sub-syntax element is the fourth preset value and the second sub-syntax element is the fifth preset value, determining that the stitched image is a heterogeneous hybrid stitched image that includes isomorphic blocks of the first expression format and isomorphic blocks of the second expression format; if the first sub-syntax element is the fourth preset value and the second sub-syntax element is the seventh preset value, determining that the stitched image is an isomorphic stitched image that includes isomorphic blocks of the first expression format; if the first sub-syntax element is the sixth preset value and the second sub-syntax element is the fifth preset value, determining that the stitched image is an isomorphic stitched image that includes isomorphic blocks of the second expression format.

In some embodiments, the first syntax element is located in the parameter set sub-stream of the code stream.

In other embodiments, the stitched image sequence parameter set corresponding to the stitched image includes the first syntax element.

In some embodiments, the at least one expression format includes at least one of multi-view video, point cloud and mesh. Specifically, the first expression format is one of multi-view video, point cloud and mesh, the second expression format is one of multi-view video, point cloud and mesh, and the first expression format and the second expression format are different.

In some embodiments, the code stream further includes a parameter set sub-stream, the parameter set sub-stream includes a third syntax element, and the third syntax element is used to determine that the code stream includes code streams corresponding to visual media content in at least one expression format. The method further includes: decoding the parameter set sub-stream of the code stream to obtain the parameter set of the code stream, and obtaining the third syntax element from the parameter set.

In some embodiments, the method further includes: if the third syntax element is a first value, determining that the code stream includes both a code stream corresponding to visual media content in the first expression format and a code stream corresponding to visual media content in the second expression format; if the third syntax element is a second value, determining that the code stream includes the code stream corresponding to visual media content in the first expression format; if the third syntax element is a third value, determining that the code stream includes the code stream corresponding to visual media content in the second expression format.

In some embodiments, decoding the code stream to obtain at least one stitched image includes: determining, according to the third syntax element, that the code stream includes code streams corresponding to visual media content in at least two expression formats, and decoding the code stream to obtain a heterogeneous hybrid stitched image.

In some embodiments, decoding the code stream to obtain at least one stitched image includes: determining, according to the third syntax element, that the code stream includes code streams corresponding to visual media content in at least two expression formats, and decoding the code stream to obtain isomorphic stitched images of at least two expression formats. That is, when the code stream includes code streams corresponding to visual media content in at least two expression formats, each expression format corresponds to one kind of isomorphic stitched image.

In some embodiments, decoding the code stream to obtain at least one stitched image includes: determining, according to the third syntax element, that the code stream includes code streams corresponding to visual media content in at least two expression formats, and decoding the code stream to obtain heterogeneous hybrid stitched images and isomorphic stitched images of at least two expression formats. That is, when the code stream includes code streams corresponding to visual media content in at least two expression formats, the isomorphic blocks of some expression formats form heterogeneous hybrid stitched images, and the isomorphic blocks of the other expression formats form isomorphic stitched images.

In some embodiments, the heterogeneous hybrid stitched image includes at least one of the following: a single-attribute heterogeneous hybrid stitched image and a multi-attribute heterogeneous hybrid stitched image; the isomorphic stitched image includes at least one of the following: a single-attribute isomorphic stitched image and a multi-attribute isomorphic stitched image.

In some embodiments, the code stream includes a video compression sub-stream and a stitched image information sub-stream, and decoding the code stream to obtain at least one stitched image and stitched image information includes: decoding the video compression sub-stream to obtain the at least one stitched image; and decoding the stitched image information sub-stream to obtain the stitched image information of the at least one stitched image. For example, when it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content in at least two expression formats, the video compression sub-stream is decoded to obtain a heterogeneous hybrid stitched image; or the video compression sub-stream is decoded to obtain a heterogeneous hybrid stitched image and an isomorphic stitched image; or the video compression sub-stream is decoded to obtain isomorphic stitched images of at least two expression formats.

Step 1002: When the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, split the stitched image according to its stitched image information to obtain at least two kinds of isomorphic blocks, where the at least two kinds of isomorphic blocks correspond to different visual media content expression formats;

Step 1003: When the stitched image is determined to be an isomorphic stitched image according to the first syntax element, split the stitched image according to its stitched image information to obtain one kind of isomorphic block, where the one kind of isomorphic block corresponds to the same visual media content expression format;

Step 1004: Decode and reconstruct the isomorphic blocks to obtain visual media content in at least one expression format.

In some embodiments, the method further includes: when the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, the stitched image information further includes a second syntax element, and the expression format of the i-th block in the stitched image is determined according to the second syntax element.

For example, determining the expression format of the i-th block in the stitched image according to the second syntax element includes: if the second syntax element is an eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is a ninth preset value, determining that the expression format of the i-th block is the second expression format.

In some embodiments, the second syntax element is located in the stitched image block data unit header of the i-th block of the stitched image.

In some embodiments, decoding and reconstructing the isomorphic blocks to obtain visual media content in at least one expression format includes: if the expression format of the i-th block is the first expression format, determining that the patches in the i-th block are decoded and reconstructed with the decoding method corresponding to the first expression format, to obtain visual media content in the first expression format; if the expression format of the i-th block is the second expression format, determining that the patches in the i-th block are decoded and reconstructed with the decoding method corresponding to the second expression format, to obtain visual media content in the second expression format.

For example, the code stream is decoded to obtain a multi-view video stitched image, a point cloud stitched image and a heterogeneous hybrid stitched image. The heterogeneous hybrid stitched image is split according to its stitched image information, and the reconstructed multi-view video blocks and point cloud blocks are output. The multi-view video stitched image is split according to its corresponding stitched image information, and the reconstructed multi-view video blocks are output. The point cloud stitched image is split according to its corresponding stitched image information, and the reconstructed point cloud blocks are output. All the multi-view video blocks obtained are passed through multi-view video decoding to generate the reconstructed multi-view video, and all the point cloud blocks obtained are passed through point cloud decoding to generate the reconstructed point cloud.
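The example above (three stitched images feeding two format-specific decoders) can be modelled with a short sketch. Everything here is a simplified assumption made for illustration: `Block`, `StitchedImage`, `split_by_format` and `reconstruct` are hypothetical names, block payloads are plain strings, and the two "decoders" simply collect payloads in place of real multi-view video and point cloud reconstruction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass
class Block:
    fmt: str   # "multi_view" or "point_cloud"
    data: str  # placeholder for the reconstructed samples of the block


@dataclass
class StitchedImage:
    blocks: List[Block]


def split_by_format(images: Sequence[StitchedImage]) -> Dict[str, List[Block]]:
    """Split every stitched image and pool its blocks by expression format."""
    pooled: Dict[str, List[Block]] = {}
    for image in images:
        for block in image.blocks:
            pooled.setdefault(block.fmt, []).append(block)
    return pooled


def reconstruct(images: Sequence[StitchedImage],
                decoders: Dict[str, Callable[[List[Block]], object]]) -> Dict[str, object]:
    """Run one format-specific decoder over all pooled blocks of that format."""
    return {fmt: decoders[fmt](blocks)
            for fmt, blocks in split_by_format(images).items()}


# One multi-view stitched image, one point cloud stitched image, one hybrid.
images = [
    StitchedImage([Block("multi_view", "v0"), Block("multi_view", "v1")]),
    StitchedImage([Block("point_cloud", "p0")]),
    StitchedImage([Block("multi_view", "v2"), Block("point_cloud", "p1")]),
]
decoders = {
    "multi_view": lambda blocks: [b.data for b in blocks],
    "point_cloud": lambda blocks: [b.data for b in blocks],
}
result = reconstruct(images, decoders)
```

Pooling the blocks before decoding mirrors the text: blocks from the hybrid stitched image join the blocks from the isomorphic stitched images, and each format is reconstructed once.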

With the above technical solution, for application scenarios that include visual media content in one or more expression formats, isomorphic blocks of different expression formats are stitched into a heterogeneous hybrid stitched image, isomorphic blocks of the same expression format are stitched into an isomorphic stitched image, and the resulting stitched images and stitched image information are written into the code stream. Isomorphic stitched images (for example at least one of a multi-view stitched image, a point cloud stitched image and a mesh stitched image) and heterogeneous hybrid stitched images are present in the code stream at the same time, so the encoding and decoding method applies to scenarios with visual media content in multiple expression formats, which broadens its scope of application. In addition, the stitched image information includes the first syntax element, which indicates the type of the stitched image and thereby improves the efficiency with which the decoding end decodes the stitched image. Furthermore, because isomorphic blocks of different expression formats are stitched into one heterogeneous hybrid stitched image for encoding and decoding, the number of two-dimensional video codecs (such as HEVC, VVC, AVC or AVS) that must be invoked is reduced, which lowers the implementation cost and improves usability.

An embodiment of the present application further provides an encoding device. Figure 11 is a schematic block diagram of an encoding device provided by an embodiment of the present application; the encoding device 110 is applied to an encoder. As shown in Figure 11, the encoding device 110 includes:

a processing unit 1101, configured to process visual media content in at least one expression format to obtain at least one kind of isomorphic block, where different kinds of isomorphic blocks correspond to different visual media content expression formats;

a stitching unit 1102, configured to stitch the at least one kind of isomorphic block to obtain at least one stitched image and stitched image information, where the stitched image information includes a first syntax element, the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first syntax element, the heterogeneous hybrid stitched image includes at least two kinds of isomorphic blocks, and the isomorphic stitched image includes one kind of isomorphic block;

an encoding unit 1103, configured to encode the at least one stitched image and the stitched image information to obtain a code stream.
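What the stitching unit 1102 produces can be illustrated with a toy model. This sketch makes several assumptions that are not stated for the device itself: blocks are `(format, payload)` tuples, the `mix` switch stands in for whatever stitching policy an encoder actually applies, and the first syntax element reuses the values 1, 2 and 3 from the vps_toolset_type example. The names `FORMAT_TO_TYPE`, `MIXED_TYPE` and `stitch` are hypothetical.

```python
from typing import Dict, List, Tuple

# Values borrowed from the vps_toolset_type example: 1 multi-view, 2 point cloud, 3 mixed.
FORMAT_TO_TYPE = {"multi_view": 1, "point_cloud": 2}
MIXED_TYPE = 3

Block = Tuple[str, str]  # (expression format, payload placeholder)


def stitch(blocks: List[Block], mix: bool) -> List[Dict]:
    """Group blocks into stitched images and attach the first syntax element.

    mix=True  -> one heterogeneous hybrid stitched image if two formats are present
    mix=False -> one isomorphic stitched image per expression format
    """
    by_format: Dict[str, List[str]] = {}
    for fmt, payload in blocks:
        if fmt not in FORMAT_TO_TYPE:
            raise ValueError(f"unknown expression format: {fmt}")
        by_format.setdefault(fmt, []).append(payload)

    if mix and len(by_format) > 1:
        return [{"first_syntax_element": MIXED_TYPE, "blocks": list(blocks)}]
    return [{"first_syntax_element": FORMAT_TO_TYPE[fmt],
             "blocks": [(fmt, p) for p in payloads]}
            for fmt, payloads in by_format.items()]
```

With `mix=True` the same input yields a single stitched image to hand to the encoding unit, while `mix=False` yields one stitched image per format.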

在一些實施例中，所述第一語法元素包括：第一子語法元素和第二子語法元素，根據所述第一子語法元素和所述第二子語法元素確定所述拼接圖為異構混合拼接圖或者同構拼接圖；In some embodiments, the first syntax element includes: a first sub-syntax element and a second sub-syntax element, and it is determined that the splicing graph is heterogeneous according to the first sub-syntax element and the second sub-syntax element. Hybrid mosaic or isomorphic mosaic;

所述根據所述第一語法元素確定拼接圖為異構混合拼接圖或者同構拼接圖，包括：所述第一子語法元素為第四預設值，則確定所述拼接圖包括第一表達格式的同構區塊；和/或，所述第二子語法元素為第五預設值，則確定所述拼接圖包括第二表達格式的同構區塊；其中，所述第一表達格式和所述第二表達格式為不同表達格式。Determining that the splicing diagram is a heterogeneous hybrid splicing diagram or a homogeneous splicing diagram based on the first syntax element includes: if the first sub-grammar element is a fourth preset value, then it is determined that the splicing diagram includes the first expression isomorphic blocks of the format; and/or, if the second sub-syntax element is the fifth preset value, it is determined that the splicing diagram includes a isomorphic block of the second expression format; wherein, the first expression format and the second expression format are different expression formats.

In some embodiments, the stitched image sequence parameter set corresponding to the stitched image includes the first syntax element.

In some embodiments, when the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, the stitched image information further includes a second syntax element, and the expression format of the i-th block in the stitched image is determined according to the second syntax element.

In some embodiments, determining the expression format of the i-th block in the stitched image according to the second syntax element includes: if the second syntax element is an eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is a ninth preset value, determining that the expression format of the i-th block is the second expression format.
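The per-block mapping above might be sketched as follows. The numeric values 0 and 1 for the eighth and ninth preset values, and the names `block_expression_format` and `block_formats`, are hypothetical; the passage itself leaves the values open.

```python
# Assumed values: the text only names an "eighth" and a "ninth" preset value.
EIGHTH_PRESET_VALUE = 0  # assumption: i-th block uses the first expression format
NINTH_PRESET_VALUE = 1   # assumption: i-th block uses the second expression format


def block_expression_format(second_syntax_element: int) -> str:
    """Map the second syntax element of one block to its expression format."""
    if second_syntax_element == EIGHTH_PRESET_VALUE:
        return "first_format"
    if second_syntax_element == NINTH_PRESET_VALUE:
        return "second_format"
    raise ValueError(f"unexpected second syntax element: {second_syntax_element}")


def block_formats(second_syntax_elements):
    """Return the expression format of every block, indexed by i."""
    return [block_expression_format(value) for value in second_syntax_elements]
```

A heterogeneous hybrid stitched image with three blocks signalled as 0, 1, 0 would then be read as first, second, first expression format under these assumptions.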

In some embodiments, the encoding unit 1103 is configured to: if the expression format of the i-th block is the first expression format, determine that the sub-blocks in the i-th block are encoded using the encoding method corresponding to the first expression format, to obtain a code stream corresponding to the visual media content of the first expression format; and if the expression format of the i-th block is the second expression format, determine that the sub-blocks in the i-th block are encoded using the encoding method corresponding to the second expression format, to obtain a code stream corresponding to the visual media content of the second expression format.

In some embodiments, the parameter set sub-stream of the code stream includes a third syntax element, and it is determined according to the third syntax element that the code stream includes a code stream corresponding to visual media content in at least one expression format.

In some embodiments, determining, according to the third syntax element, that the code stream includes a code stream corresponding to visual media content in at least one expression format includes: if the third syntax element is a first value, determining that the code stream includes both a code stream corresponding to visual media content in the first expression format and a code stream corresponding to visual media content in the second expression format; if the third syntax element is a second value, determining that the code stream includes a code stream corresponding to visual media content in the first expression format; if the third syntax element is a third value, determining that the code stream includes a code stream corresponding to visual media content in the second expression format.
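A minimal sketch of the three-way signalling above, written as a lookup table. The concrete numbers chosen for the first, second and third values (0, 1, 2) and the function name `formats_in_code_stream` are assumptions for illustration; the passage does not state them.

```python
# Assumed numeric values for the "first", "second" and "third" values in the text.
FIRST_VALUE = 0   # assumption: both expression formats are present
SECOND_VALUE = 1  # assumption: only the first expression format is present
THIRD_VALUE = 2   # assumption: only the second expression format is present

_FORMATS_BY_VALUE = {
    FIRST_VALUE: frozenset({"first_format", "second_format"}),
    SECOND_VALUE: frozenset({"first_format"}),
    THIRD_VALUE: frozenset({"second_format"}),
}


def formats_in_code_stream(third_syntax_element: int) -> frozenset:
    """Return the expression formats whose code streams are carried."""
    try:
        return _FORMATS_BY_VALUE[third_syntax_element]
    except KeyError:
        raise ValueError(f"unexpected third syntax element: {third_syntax_element}")
```

A decoder could use the size of the returned set to decide early whether a heterogeneous hybrid stitched image may be present (two formats) or not (one format).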

In some embodiments, when the at least one stitched image includes a heterogeneous hybrid stitched image, the third syntax element is used to indicate that the code stream includes code streams corresponding to visual media content in at least two expression formats.

In some embodiments, the encoding unit 1103 is configured to: encode the at least one stitched image to obtain a video compression sub-stream; encode the stitched image information of the at least one stitched image to obtain a stitched image information sub-stream; and combine the video compression sub-stream and the stitched image information sub-stream into the code stream.

In some embodiments, the at least one expression format includes at least one of multi-view video, point cloud, and mesh.

An embodiment of the present application further provides a decoding device. FIG. 12 is a schematic block diagram of a decoding device provided by an embodiment of the present application; the decoding device 120 is applied to a decoder. As shown in FIG. 12, the decoding device 120 includes:

The decoding unit 1201 is configured to decode a code stream to obtain a stitched image and stitched image information, wherein the stitched image information includes a first syntax element, and the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first syntax element;

The splitting unit 1202 is configured to, when the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, split the stitched image according to the stitched image information of the stitched image to obtain at least two types of isomorphic blocks, wherein the at least two types of isomorphic blocks correspond to different visual media content expression formats;

The splitting unit 1202 is further configured to, when the stitched image is determined to be an isomorphic stitched image according to the first syntax element, split the stitched image according to the stitched image information of the stitched image to obtain one type of isomorphic block, wherein the one type of isomorphic block corresponds to the same visual media content expression format;

The processing unit 1203 is configured to decode and reconstruct the isomorphic blocks to obtain visual media content in at least one expression format.
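To make the role of the splitting unit more concrete, the sketch below crops rectangular blocks out of a stitched image and groups them by expression format. It is a simplified illustration: the block descriptor fields (`x`, `y`, `w`, `h`, `fmt`), the list-of-rows image representation and the function name `split_stitched_image` are all hypothetical and do not reflect the actual syntax of the stitched image information.

```python
from collections import defaultdict


def split_stitched_image(image, blocks):
    """Split a stitched image into isomorphic blocks grouped by format.

    image:  list of pixel rows (each row is a list).
    blocks: list of dicts with hypothetical keys x, y, w, h and fmt.
    Returns a dict mapping each expression format to its cropped blocks.
    """
    groups = defaultdict(list)
    for block in blocks:
        rows = image[block["y"]:block["y"] + block["h"]]
        crop = [row[block["x"]:block["x"] + block["w"]] for row in rows]
        groups[block["fmt"]].append(crop)
    return dict(groups)


# A 2x4 stitched image holding one multi-view video block and one point cloud block.
example_image = [[1, 2, 3, 4],
                 [5, 6, 7, 8]]
example_blocks = [
    {"x": 0, "y": 0, "w": 2, "h": 2, "fmt": "multi_view_video"},
    {"x": 2, "y": 0, "w": 2, "h": 2, "fmt": "point_cloud"},
]
example_groups = split_stitched_image(example_image, example_blocks)
```

In this sketch a heterogeneous hybrid stitched image yields two or more groups, whereas an isomorphic stitched image yields exactly one.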

In some embodiments, when the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, the stitched image information further includes a second syntax element, and the expression format of the i-th block in the stitched image is determined according to the second syntax element.

In some embodiments, the processing unit 1203 is configured to: if the expression format of the i-th block is the first expression format, determine that the sub-blocks in the i-th block are decoded and reconstructed using the decoding method corresponding to the first expression format, to obtain the visual media content of the first expression format; and if the expression format of the i-th block is the second expression format, determine that the sub-blocks in the i-th block are decoded and reconstructed using the decoding method corresponding to the second expression format, to obtain the visual media content of the second expression format.
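The format-dependent reconstruction described above amounts to a dispatch on the block's expression format. The following sketch shows one way to express that; the decoder callables are stand-ins that only return a label, and the names `reconstruct_block` and `example_decoders` are hypothetical.

```python
def reconstruct_block(block_format, sub_blocks, decoders):
    """Reconstruct one block with the decoder registered for its format."""
    if block_format not in decoders:
        raise ValueError(f"no decoder registered for format {block_format!r}")
    return decoders[block_format](sub_blocks)


# Stand-in decoders (hypothetical): real ones would run the reconstruction
# procedure of the corresponding expression format, e.g. multi-view video
# or point cloud reconstruction.
example_decoders = {
    "first_format": lambda sub_blocks: ("first_format_content", len(sub_blocks)),
    "second_format": lambda sub_blocks: ("second_format_content", len(sub_blocks)),
}
```

Keeping the decoders in a table means a further expression format, such as mesh, could be supported by registering one more entry rather than changing the dispatch logic.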

In some embodiments, the decoding unit 1201 is configured to determine, according to the third syntax element, that the code stream includes code streams corresponding to visual media content in at least two expression formats, and to decode the code stream to obtain a heterogeneous hybrid stitched image.

In some embodiments, the code stream includes a video compression sub-stream and a stitched image information sub-stream, and the decoding unit 1201 is configured to: decode the video compression sub-stream to obtain the at least one stitched image; and decode the stitched image information sub-stream to obtain the stitched image information of the at least one stitched image.
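The relationship between the code stream and its two sub-streams can be illustrated with a toy container. The length-prefixed layout below is purely an assumption for the example and is not the bitstream structure used by the application; `pack_code_stream` and `unpack_code_stream` are hypothetical names.

```python
import struct


def pack_code_stream(video_sub_stream: bytes, info_sub_stream: bytes) -> bytes:
    """Combine two sub-streams, each prefixed by a 4-byte big-endian length."""
    out = b""
    for payload in (video_sub_stream, info_sub_stream):
        out += struct.pack(">I", len(payload)) + payload
    return out


def unpack_code_stream(code_stream: bytes):
    """Recover (video_sub_stream, info_sub_stream) from the toy container."""
    parts = []
    offset = 0
    for _ in range(2):
        (length,) = struct.unpack_from(">I", code_stream, offset)
        offset += 4
        parts.append(code_stream[offset:offset + length])
        offset += length
    return parts[0], parts[1]
```

The encoder side would call `pack_code_stream` after producing both sub-streams, and the decoder side would call `unpack_code_stream` before handing each sub-stream to its own decoding step.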

It should be understood that the device embodiments and the method embodiments may correspond to each other, and similar descriptions may be found in the method embodiments. To avoid repetition, details are not repeated here.

The device and system of the embodiments of the present application are described above from the perspective of functional units with reference to the accompanying drawings. It should be understood that a functional unit may be implemented in hardware, as instructions in software form, or as a combination of hardware and software units. Specifically, each step of the method embodiments of the present application may be completed by integrated logic circuits of hardware in a processor and/or by instructions in software form. The steps of the methods disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or by a combination of hardware and software units in a decoding processor. Optionally, the software unit may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically readable and writable programmable memory, or a register. The storage medium is located in a memory; the processor reads the information in the memory and completes the steps of the above method embodiments in combination with its hardware.

In practical applications, an embodiment of the present application further provides an encoder. FIG. 13 is a schematic block diagram of an encoder provided by an embodiment of the present application. As shown in FIG. 13, the encoder 1310 includes:

A second memory 1320 and a second processor 1330; the second memory 1320 stores a computer program that can run on the second processor 1330, and the second processor 1330 implements the encoding method on the encoder side when executing the program.

In practical applications, an embodiment of the present application further provides a decoder. FIG. 14 is a schematic block diagram of a decoder provided by an embodiment of the present application. As shown in FIG. 14, the decoder 1410 includes:

A first memory 1420 and a first processor 1430; the first memory 1420 stores a computer program that can run on the first processor 1430, and the first processor 1430 implements the decoding method on the decoder side when executing the program.

In some embodiments of the present application, the processor may include, but is not limited to:

A general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic devices, discrete hardware components, and so on.

In some embodiments of the present application, the memory includes, but is not limited to:

Volatile memory and/or non-volatile memory. The non-volatile memory may be Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically EPROM (EEPROM) or flash memory. The volatile memory may be Random Access Memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synch Link DRAM (SLDRAM) and Direct Rambus RAM (DR RAM).

In addition, the functional modules in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.

In yet another embodiment of the present application, refer to FIG. 15, which shows a schematic structural diagram of an encoding and decoding system provided by an embodiment of the present application. As shown in FIG. 15, the encoding and decoding system 150 may include an encoder 1501 and a decoder 1502. The encoder 1501 may be a device integrating the encoding device described in the foregoing embodiments; the decoder 1502 may be a device integrating the decoding device described in the foregoing embodiments.

In the embodiment of the present application, in the encoding and decoding system 150, both the encoder 1501 and the decoder 1502 can use the color component information of adjacent reference pixels and of the pixel to be predicted to calculate the weighting coefficients corresponding to the pixel to be predicted. Different reference pixels may have different weighting coefficients. Applying these weighting coefficients to the chroma prediction of the pixels to be predicted in the current block can improve the accuracy of chroma prediction, save bit rate, and improve encoding and decoding performance.

An embodiment of the present application further provides a chip for implementing the above encoding and decoding methods. Specifically, the chip includes a processor configured to call and run a computer program from a memory, so that an electronic device in which the chip is installed executes the above encoding and decoding methods.

An embodiment of the present application further provides a computer storage medium storing a computer program. When the computer program is executed by the second processor, the encoding method of the encoder is implemented; or, when the computer program is executed by the first processor, the decoding method of the decoder is implemented. In other words, an embodiment of the present application further provides a computer program product containing instructions which, when executed by a computer, cause the computer to execute the methods of the above method embodiments.

The present application further provides a code stream generated according to the above encoding method. Optionally, the code stream includes the above first syntax element, or includes the second syntax element and the third syntax element.

When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, wireless, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (such as a floppy disk, hard disk, or magnetic tape), an optical medium (such as a digital video disc (DVD)), or a semiconductor medium (such as a solid state disk (SSD)).

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed in this application can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.

Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

100: video encoding and decoding system
110: encoding device, encoding apparatus
111: video source
112: video encoder
113: output interface
120: decoding device, decoding apparatus
121: input interface
122: video decoder
123: display device
130: channel
150: encoding and decoding system
200: video encoder
210: prediction unit
211: inter prediction unit
212: intra prediction unit
220: residual unit
230: transform/quantization unit
240: inverse transform/quantization unit
250: reconstruction unit
260: loop filter unit
270: decoded picture buffer
280: entropy encoding unit
300: video decoder
310: entropy decoding unit
320: prediction unit
321: inter prediction unit
322: intra prediction unit
330: inverse quantization/transform unit
340: reconstruction unit
350: loop filter unit
360: decoded picture buffer
601~603: steps
1001~1004: steps
1101: processing unit
1102: stitching unit
1103: encoding unit
1201: decoding unit
1202: splitting unit
1203: processing unit
1310: encoder
1320: second memory
1330: second processor
1410: decoder
1420: first memory
1430: first processor
1501: encoder
1502: decoder

FIG. 1 is a schematic block diagram of a video encoding and decoding system according to an embodiment of the present application;

FIG. 2A is a schematic block diagram of a video encoder according to an embodiment of the present application;

FIG. 2B is a schematic block diagram of a video decoder according to an embodiment of the present application;

FIG. 3A is a diagram of the organization and expression framework of multi-view video data;

FIG. 3B is a schematic diagram of stitched image generation for multi-view video data;

FIG. 3C is a diagram of the organization and expression framework of point cloud data;

FIG. 3D to FIG. 3F are schematic diagrams of different types of point cloud data;

FIG. 4 is a schematic diagram of multi-view video encoding;

FIG. 5 is a schematic diagram of multi-view video decoding;

FIG. 6 is a schematic flowchart of an encoding method provided by an embodiment of the present application;

FIG. 7 is a schematic diagram of a heterogeneous hybrid stitched image provided by an embodiment of the present application;

FIG. 8 is a schematic diagram of an isomorphic stitched image provided by an embodiment of the present application;

FIG. 9 is a schematic diagram of the V3C bitstream structure provided by an embodiment of the present application;

FIG. 10 is a schematic flowchart of a decoding method provided by an embodiment of the present application;

FIG. 11 is a schematic block diagram of an encoding device provided by an embodiment of the present application;

FIG. 12 is a schematic block diagram of a decoding device provided by an embodiment of the present application;

FIG. 13 is a schematic block diagram of an encoder provided by an embodiment of the present application;

FIG. 14 is a schematic block diagram of a decoder provided by an embodiment of the present application;

FIG. 15 is a schematic structural diagram of an encoding and decoding system provided by an embodiment of the present application.


Claims

A decoding method, comprising: decoding a code stream to obtain a stitched image and stitched image information, wherein the stitched image information includes a first syntax element, and the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first syntax element; when the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, splitting the stitched image according to the stitched image information of the stitched image to obtain at least two types of isomorphic blocks, wherein the at least two types of isomorphic blocks correspond to different visual media content expression formats; when the stitched image is determined to be an isomorphic stitched image according to the first syntax element, splitting the stitched image according to the stitched image information of the stitched image to obtain one type of isomorphic block, wherein the one type of isomorphic block corresponds to the same visual media content expression format; and decoding and reconstructing the isomorphic blocks to obtain visual media content in at least one expression format.

The method according to claim 1, wherein determining, according to the first syntax element, that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image includes: if the first syntax element is a first preset value, determining that the stitched image is a heterogeneous hybrid stitched image including isomorphic blocks of a first expression format and of a second expression format, wherein the first expression format and the second expression format are different expression formats; if the first syntax element is a second preset value, determining that the stitched image is an isomorphic stitched image including isomorphic blocks of the first expression format; if the first syntax element is a third preset value, determining that the stitched image is an isomorphic stitched image including isomorphic blocks of the second expression format.

The method according to claim 1, wherein the first syntax element includes a first sub-syntax element and a second sub-syntax element, and the stitched image is determined to be a heterogeneous hybrid stitched image or an isomorphic stitched image according to the first sub-syntax element and the second sub-syntax element; determining, according to the first syntax element, that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image includes: if the first sub-syntax element is a fourth preset value, determining that the stitched image includes isomorphic blocks of a first expression format; if the second sub-syntax element is a fifth preset value, determining that the stitched image includes isomorphic blocks of a second expression format; wherein the first expression format and the second expression format are different expression formats.

The method according to claim 3, wherein determining, according to the first syntax element, that the stitched image is a heterogeneous hybrid stitched image or an isomorphic stitched image further includes: if the first sub-syntax element is a sixth preset value, determining that the stitched image does not include isomorphic blocks of the first expression format; if the second sub-syntax element is a seventh preset value, determining that the stitched image does not include isomorphic blocks of the second expression format.

The method according to any one of claims 1 to 4, wherein the first syntax element is located in a parameter set sub-stream of the code stream.

The method according to any one of claims 1 to 4, wherein the stitched image sequence parameter set corresponding to the stitched image includes the first syntax element.

The method according to claim 1, wherein, when the stitched image is determined to be a heterogeneous hybrid stitched image according to the first syntax element, the stitched image information further includes a second syntax element, and the method further includes: determining the expression format of the i-th block in the stitched image according to the second syntax element.

The method according to claim 7, wherein determining the expression format of the i-th block in the stitched image according to the second syntax element includes: if the second syntax element is an eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is a ninth preset value, determining that the expression format of the i-th block is the second expression format.

The method according to claim 7, wherein the second syntax element is located in the stitched image block data unit header of the i-th block of the stitched image.

The method according to any one of claims 7 to 9, wherein decoding and reconstructing the isomorphic blocks to obtain visual media content in at least one expression format includes: if the expression format of the i-th block is the first expression format, determining that the sub-blocks in the i-th block are decoded and reconstructed using the decoding method corresponding to the first expression format, to obtain the visual media content of the first expression format; if the expression format of the i-th block is the second expression format, determining that the sub-blocks in the i-th block are decoded and reconstructed using the decoding method corresponding to the second expression format, to obtain the visual media content of the second expression format.

The method according to claim 1, wherein the parameter set sub-stream of the code stream includes a third syntax element, and it is determined according to the third syntax element that the code stream includes a code stream corresponding to visual media content in at least one expression format.

The method according to claim 11, wherein determining, according to the third syntax element, that the code stream includes a code stream corresponding to visual media content in at least one expression format includes: if the third syntax element is a first value, determining that the code stream includes both a code stream corresponding to visual media content in the first expression format and a code stream corresponding to visual media content in the second expression format; if the third syntax element is a second value, determining that the code stream includes a code stream corresponding to visual media content in the first expression format; if the third syntax element is a third value, determining that the code stream includes a code stream corresponding to visual media content in the second expression format.

The method according to any one of claims 11 to 12, wherein decoding the code stream to obtain at least one spliced image includes: When it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content in at least two expression formats, the code stream is decoded to obtain a heterogeneous hybrid spliced image.

The method according to claim 1, wherein the code stream includes a video compression sub-stream and a spliced image information sub-stream, and decoding the code stream to obtain at least one spliced image and spliced image information includes: Decoding the video compression sub-stream to obtain the at least one spliced image; Decoding the spliced image information sub-stream to obtain the spliced image information of the at least one spliced image.

The method according to claim 1, wherein the at least one expression format includes: at least one of multi-viewpoint video, point cloud, and grid.

The method according to claim 1, wherein the heterogeneous hybrid mosaic graph is at least one of the following: a single attribute heterogeneous hybrid mosaic graph and a multi-attribute heterogeneous hybrid mosaic graph; The isomorphic splicing diagram includes at least one of the following: a single attribute isomorphic splicing diagram and a multi-attribute isomorphic splicing diagram.

A coding method that includes: Processing the visual media content of at least one expression format to obtain at least one isomorphic block, where different types of isomorphic blocks correspond to different visual media content expression formats; Splicing the at least one isomorphic block to obtain at least one spliced image and spliced image information, wherein the spliced image information includes a first syntax element, the spliced image is determined to be a heterogeneous hybrid spliced image or an isomorphic spliced image according to the first syntax element, the heterogeneous hybrid spliced image includes at least two types of isomorphic blocks, and the isomorphic spliced image includes one type of isomorphic block; Encoding the at least one spliced image and the spliced image information to obtain a code stream.

The method according to claim 17, wherein the determining according to the first syntax element that the spliced graph is a heterogeneous hybrid spliced graph or a homogeneous spliced graph includes: If the first syntax element is a first preset value, it is determined that the mosaic diagram is a heterogeneous hybrid mosaic diagram including isomorphic blocks of the first expression format and the second expression format, wherein the first expression format and the second expression format are different expression formats; If the first syntax element is a second preset value, it is determined that the mosaic graph is an isomorphic mosaic graph including isomorphic blocks of the first expression format; If the first syntax element is a third preset value, it is determined that the mosaic graph is an isomorphic mosaic graph including isomorphic blocks of the second expression format.
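On the encoding side, the value of the first syntax element follows from which block types were spliced into the image. A minimal sketch, assuming hypothetical preset values 0, 1 and 2 and string labels "first" and "second" for the two expression formats (none of which are specified by the claims):

```python
from typing import Iterable

# Hypothetical preset values of the first syntax element (assumptions).
HETEROGENEOUS_HYBRID = 0  # "first preset value"
ISOMORPHIC_FIRST = 1      # "second preset value"
ISOMORPHIC_SECOND = 2     # "third preset value"


def first_syntax_element(block_formats: Iterable[str]) -> int:
    """Derive the first syntax element from the formats of the spliced blocks."""
    present = set(block_formats)
    if present == {"first", "second"}:
        return HETEROGENEOUS_HYBRID
    if present == {"first"}:
        return ISOMORPHIC_FIRST
    if present == {"second"}:
        return ISOMORPHIC_SECOND
    raise ValueError(f"cannot classify spliced image with formats {sorted(present)}")
```

For example, a spliced image holding two blocks of the first format and one of the second would be signalled as heterogeneous hybrid under these assumed values.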

The method according to claim 17, wherein the first syntax element includes a first sub-syntax element and a second sub-syntax element, and whether the spliced graph is a heterogeneous hybrid spliced graph or a homogeneous spliced graph is determined according to the first sub-syntax element and the second sub-syntax element; Determining according to the first syntax element that the spliced graph is a heterogeneous hybrid spliced graph or a homogeneous spliced graph includes: If the first sub-syntax element is a fourth preset value, it is determined that the spliced graph includes isomorphic blocks of the first expression format; If the second sub-syntax element is a fifth preset value, it is determined that the spliced graph includes isomorphic blocks of the second expression format; Wherein, the first expression format and the second expression format are different expression formats.

The method according to claim 19, wherein the determining according to the first syntax element that the spliced graph is a heterogeneous hybrid spliced graph or a homogeneous spliced graph also includes: If the first sub-syntax element is a sixth preset value, it is determined that the spliced image does not include isomorphic blocks of the first expression format; If the second sub-syntax element is a seventh preset value, it is determined that the spliced image does not include isomorphic blocks of the second expression format.
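The two-flag variant can be sketched as follows, assuming each sub-syntax element is a one-bit presence flag where 1 means "blocks of this format are included" and 0 means "not included". These values stand in for the fourth to seventh preset values and are assumptions for illustration only, as is the function name.

```python
def classify_spliced_image(first_sub_element: int, second_sub_element: int) -> str:
    """Classify a spliced image from two presence flags.

    Assumed encoding (not from the claims): 1 = format present, 0 = absent.
    """
    for flag in (first_sub_element, second_sub_element):
        if flag not in (0, 1):
            raise ValueError(f"flag must be 0 or 1, got {flag}")

    has_first = first_sub_element == 1
    has_second = second_sub_element == 1

    if has_first and has_second:
        # Blocks of both expression formats are present.
        return "heterogeneous hybrid"
    if has_first or has_second:
        # Blocks of exactly one expression format are present.
        return "isomorphic"
    raise ValueError("spliced image contains no isomorphic blocks")
```

Compared with a single three-valued element, one flag per expression format extends more naturally if further formats are added, at the cost of one extra flag per format.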

The method according to any one of claims 17 to 20, wherein the first syntax element is located in a parameter set sub-code stream of the code stream.

The method according to any one of claims 17 to 20, wherein the mosaic map sequence parameter set corresponding to the mosaic map includes the first syntax element.

The method according to claim 17, wherein the method further includes: When the mosaic is determined to be a heterogeneous hybrid mosaic based on the first syntax element, the mosaic information also includes a second syntax element, and the expression format of the i-th block in the mosaic is determined based on the second syntax element.

The method according to claim 23, wherein determining the expression format of the i-th block in the splicing diagram according to the second syntax element includes: If the second syntax element is the eighth preset value, then the expression format of the i-th block is determined to be the first expression format; If the second syntax element is a ninth preset value, it is determined that the expression format of the i-th block is the second expression format.

The method according to claim 24, wherein the second syntax element is located in the mosaic block data unit header of the i-th block of the mosaic map.

The method according to any one of claims 23 to 25, wherein said encoding the at least one spliced image and the spliced image information to obtain a code stream includes: If the expression format of the i-th block is the first expression format, it is determined that the sub-blocks in the i-th block are encoded using the encoding method corresponding to the first expression format to obtain the code stream corresponding to the visual media content of the first expression format; If the expression format of the i-th block is the second expression format, it is determined that the sub-blocks in the i-th block are encoded using the encoding method corresponding to the second expression format to obtain the code stream corresponding to the visual media content of the second expression format.

The method according to claim 17, wherein the parameter set sub-code stream of the code stream includes a third syntax element, and it is determined according to the third syntax element that the code stream includes a code stream corresponding to visual media content of at least one expression format.

The method according to claim 27, wherein determining, according to the third syntax element, a code stream corresponding to visual media content including at least one expression format in the code stream includes: The third syntax element is a first value, which determines that the code stream includes both a code stream corresponding to the visual media content in the first expression format and a code stream corresponding to the visual media content in the second expression format; The third syntax element is a second value, which determines that the code stream includes a code stream corresponding to the visual media content of the first expression format; The third syntax element is a third numerical value, which determines that the code stream includes a code stream corresponding to the visual media content of the second expression format.

The method according to any one of claims 27 to 28, wherein when the at least one splicing diagram includes a heterogeneous hybrid splicing diagram, it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content of at least two expression formats.

The method according to claim 17, wherein said encoding the at least one spliced image and the spliced image information to obtain a code stream includes: Encode the at least one spliced image to obtain a video compression sub-stream; Encode the splicing image information of the at least one splicing image to obtain a splicing image information sub-stream; The video compression sub-stream and the splicing image information sub-stream are combined into the code stream.
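The combination of the two sub-streams, and the matching split on the decoding side, can be illustrated with a simple length-prefixed container. The layout below (a 4-byte big-endian length before each sub-stream) is purely an assumption for this sketch and is not the bitstream format of the application; the function names are likewise hypothetical.

```python
import struct
from typing import Tuple

_LENGTH = struct.Struct(">I")  # 4-byte big-endian unsigned length (assumed layout)


def combine_sub_streams(video_sub_stream: bytes, info_sub_stream: bytes) -> bytes:
    """Concatenate the two sub-streams, each preceded by its byte length."""
    return (
        _LENGTH.pack(len(video_sub_stream)) + video_sub_stream
        + _LENGTH.pack(len(info_sub_stream)) + info_sub_stream
    )


def split_code_stream(code_stream: bytes) -> Tuple[bytes, bytes]:
    """Recover (video_sub_stream, info_sub_stream) from the combined stream."""
    parts = []
    offset = 0
    for _ in range(2):
        (length,) = _LENGTH.unpack_from(code_stream, offset)
        offset += _LENGTH.size
        part = code_stream[offset:offset + length]
        if len(part) != length:
            raise ValueError("truncated code stream")
        parts.append(part)
        offset += length
    return parts[0], parts[1]
```

Under this assumed layout, splitting a combined stream returns the two original sub-streams unchanged, so the spliced images and the spliced image information can be decoded independently.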

The method according to claim 17, wherein the at least one expression format includes: at least one of multi-viewpoint video, point cloud and grid.

The method according to claim 17, wherein the heterogeneous hybrid mosaic graph is at least one of the following: a single attribute heterogeneous hybrid mosaic graph and a multi-attribute heterogeneous hybrid mosaic graph; The isomorphic splicing diagram includes at least one of the following: a single attribute isomorphic splicing diagram and a multi-attribute isomorphic splicing diagram.

A decoding device, including: A decoding unit, used to decode the code stream to obtain a spliced image and spliced image information, wherein the spliced image information includes a first syntax element, and the spliced image is determined to be a heterogeneous hybrid spliced image or an isomorphic spliced image according to the first syntax element; A splitting unit, configured to, when it is determined according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image, split the spliced image according to the spliced image information of the spliced image to obtain at least two isomorphic blocks, wherein the at least two isomorphic blocks correspond to different visual media content expression formats; The splitting unit is further configured to, when it is determined according to the first syntax element that the spliced image is an isomorphic spliced image, split the spliced image according to the spliced image information of the spliced image to obtain one isomorphic block, wherein said one isomorphic block corresponds to the same visual media content expression format; A processing unit, configured to decode and reconstruct the isomorphic blocks to obtain visual media content in at least one expression format.

An encoding device, which includes: A processing unit, configured to process visual media content in at least one expression format to obtain at least one isomorphic block, wherein different types of isomorphic blocks correspond to different visual media content expression formats; A splicing unit, configured to splice the at least one isomorphic block to obtain at least one spliced image and spliced image information, wherein the spliced image information includes a first syntax element, the spliced image is determined to be a heterogeneous hybrid spliced image or an isomorphic spliced image according to the first syntax element, the heterogeneous hybrid spliced image includes at least two types of isomorphic blocks, and the isomorphic spliced image includes one type of isomorphic block; An encoding unit, used to encode the at least one spliced image and the spliced image information to obtain a code stream.

A decoder, wherein the decoder includes: first memory and first processor; The first memory stores a computer program that can be run on the first processor. When the first processor executes the program, the decoding method described in any one of claims 1 to 16 is implemented.

An encoder, wherein the encoder includes: second memory and second processor; The second memory stores a computer program that can be run on the second processor. When the second processor executes the program, the encoding method described in any one of claims 17 to 32 is implemented.

A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by the first processor, the decoding method described in any one of claims 1 to 16 is implemented; or, When the computer program is executed by the second processor, the encoding method described in any one of claims 17 to 32 is implemented.

A code stream, wherein the code stream is generated based on the method described in any one of the above claims 17 to 32.