WO2024011386A1 - Encoding method and apparatus, decoding method and apparatus, encoder, decoder and storage medium - Google Patents


Info

Publication number
WO2024011386A1
WO2024011386A1 · PCT/CN2022/105006 · CN2022105006W
Authority
WO
WIPO (PCT)
Prior art keywords
syntax element
expression format
splicing
isomorphic
spliced
Application number
PCT/CN2022/105006
Other languages
English (en)
Chinese (zh)
Inventor
虞露
金峡钶
朱志伟
戴震宇
Original Assignee
浙江大学
Oppo广东移动通信有限公司
Application filed by 浙江大学 and Oppo广东移动通信有限公司
Priority to PCT/CN2022/105006 (published as WO2024011386A1)
Priority to TW112125658A (published as TW202408245A)
Publication of WO2024011386A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N 19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/597: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H04N 19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present application relates to the field of image processing technology, and in particular to an encoding method and apparatus, a decoding method and apparatus, an encoder, a decoder and a storage medium.
  • Visual media objects with different expression formats may appear in the same scene. For example, in the same three-dimensional scene, the scene background and some characters and objects are expressed as video, while other characters are expressed as three-dimensional point clouds or three-dimensional meshes.
  • Current encoding and decoding technology encodes and decodes multi-view video, point clouds and meshes separately, so a large number of codecs need to be invoked during encoding and decoding, making encoding and decoding expensive.
  • Embodiments of the present application provide a coding and decoding method, device, encoder, decoder, and storage medium.
  • In a first aspect, this application provides a decoding method, applied to a decoder, including:
  • decoding a code stream to obtain a splicing picture and splicing picture information, wherein the splicing picture information includes a first syntax element, and it is determined according to the first syntax element whether the splicing picture is a heterogeneous hybrid splicing picture or an isomorphic splicing picture;
  • when it is determined according to the first syntax element that the splicing picture is a heterogeneous hybrid splicing picture, splitting the splicing picture according to its splicing picture information to obtain at least two types of isomorphic blocks, wherein the at least two types of isomorphic blocks correspond to different visual media content expression formats;
  • when it is determined according to the first syntax element that the splicing picture is an isomorphic splicing picture, splitting the splicing picture according to its splicing picture information to obtain one type of isomorphic block, wherein the one type of isomorphic block corresponds to a single visual media content expression format;
  • decoding and reconstructing the isomorphic blocks to obtain visual media content in at least one expression format.
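  • The decoding-side dispatch described above can be sketched as follows. This is a hypothetical illustration, not the patent's actual syntax or any real codec API: the names `split_splicing_picture` and `first_syntax_element`, and the rectangular region layout, are assumptions made for the example.

```python
def split_splicing_picture(picture, info):
    """Split a decoded splicing picture into isomorphic blocks.

    info["first_syntax_element"]: 1 = heterogeneous hybrid splicing picture
    (at least two expression formats), 0 = isomorphic splicing picture
    (a single expression format).
    """
    blocks = []
    for region in info["regions"]:
        # Each region describes one isomorphic block inside the picture.
        x, y, w, h = region["rect"]
        samples = [row[x:x + w] for row in picture[y:y + h]]
        blocks.append({"format": region["format"], "samples": samples})
    formats = {b["format"] for b in blocks}
    if info["first_syntax_element"] == 1:
        assert len(formats) >= 2, "hybrid picture mixes expression formats"
    else:
        assert len(formats) == 1, "isomorphic picture has a single format"
    return blocks
```

  • Each recovered block would then be handed to the reconstruction path matching its expression format (video, point cloud or mesh).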
  • In a second aspect, this application provides an encoding method, applied to an encoder, including:
  • processing visual media content in at least one expression format to obtain at least one isomorphic block, wherein different types of isomorphic blocks correspond to different visual media content expression formats;
  • splicing the at least one isomorphic block to obtain at least one splicing picture and splicing picture information, wherein the splicing picture information includes a first syntax element, and it is determined according to the first syntax element that the splicing picture is a heterogeneous hybrid splicing picture or an isomorphic splicing picture; the heterogeneous hybrid splicing picture includes at least two types of isomorphic blocks, and the isomorphic splicing picture includes one type of isomorphic block;
  • encoding the at least one splicing picture and the splicing picture information to obtain a code stream.
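  • The encoder-side counterpart can be sketched in the same spirit; again, the packing scheme (simple left-to-right tiling) and all names are illustrative assumptions, not the patent's method.

```python
def splice_blocks(blocks):
    """Pack isomorphic blocks side by side into one splicing picture and
    emit the splicing picture information, including the first syntax
    element that distinguishes a heterogeneous hybrid splicing picture
    from an isomorphic splicing picture."""
    formats = [b["format"] for b in blocks]
    first_syntax_element = 1 if len(set(formats)) >= 2 else 0
    regions, x = [], 0
    height = max(len(b["samples"]) for b in blocks)
    rows = [[] for _ in range(height)]
    for b in blocks:
        w = len(b["samples"][0])
        regions.append({"format": b["format"],
                        "rect": (x, 0, w, len(b["samples"]))})
        for i in range(height):
            # Pad shorter blocks with zero samples so rows stay aligned.
            row = b["samples"][i] if i < len(b["samples"]) else [0] * w
            rows[i].extend(row)
        x += w
    info = {"first_syntax_element": first_syntax_element, "regions": regions}
    return rows, info
```

  • The returned picture and information would then be encoded into the code stream as described above.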
  • In a third aspect, this application provides a decoding device, applied to a decoder, including:
  • a decoding unit configured to decode the code stream to obtain a splicing picture and splicing picture information, wherein the splicing picture information includes a first syntax element, and it is determined according to the first syntax element whether the splicing picture is a heterogeneous hybrid splicing picture or an isomorphic splicing picture;
  • a first splitting unit configured to, when it is determined according to the first syntax element that the splicing picture is a heterogeneous hybrid splicing picture, split the splicing picture according to its splicing picture information to obtain at least two types of isomorphic blocks, wherein the at least two types of isomorphic blocks correspond to different visual media content expression formats;
  • a second splitting unit configured to, when it is determined according to the first syntax element that the splicing picture is an isomorphic splicing picture, split the splicing picture according to its splicing picture information to obtain one type of isomorphic block, wherein the one type of isomorphic block corresponds to a single visual media content expression format;
  • a processing unit configured to decode and reconstruct the isomorphic blocks to obtain visual media content in at least one expression format.
  • In a fourth aspect, this application provides an encoding device, applied to an encoder, including:
  • a processing unit configured to process visual media content in at least one expression format to obtain at least one isomorphic block, wherein different types of isomorphic blocks correspond to different visual media content expression formats;
  • a splicing unit configured to splice the at least one isomorphic block to obtain at least one splicing picture and splicing picture information, wherein the splicing picture information includes a first syntax element, and it is determined according to the first syntax element that the splicing picture is a heterogeneous hybrid splicing picture or an isomorphic splicing picture; the heterogeneous hybrid splicing picture includes at least two types of isomorphic blocks, and the isomorphic splicing picture includes one type of isomorphic block;
  • an encoding unit configured to encode the at least one splicing picture and the splicing picture information to obtain a code stream.
  • In a fifth aspect, this application provides a decoder including a first memory and a first processor; the first memory stores a computer program that can be run on the first processor to execute the method in the above first aspect or its respective implementations.
  • In a sixth aspect, this application provides an encoder including a second memory and a second processor; the second memory stores a computer program that can be run on the second processor to execute the method in the above second aspect or its respective implementations.
  • A seventh aspect provides an encoding and decoding system, including an encoder and a decoder.
  • the encoder is configured to perform the method in the above second aspect or its implementations, and the decoder is used to perform the method in the above first aspect or its implementations.
  • An eighth aspect provides a chip for implementing any one of the above-mentioned first to second aspects or the method in each implementation manner thereof.
  • The chip includes: a processor, configured to call and run a computer program from a memory, so that a device installed with the chip executes the method in any one of the above-mentioned first to second aspects or their respective implementations.
  • A ninth aspect provides a computer-readable storage medium for storing a computer program that causes a computer to execute the method in any one of the above-mentioned first to second aspects or their respective implementations.
  • A tenth aspect provides a computer program product including computer program instructions that cause a computer to execute the method in any one of the above-mentioned first to second aspects or their respective implementations.
  • An eleventh aspect provides a computer program that, when run on a computer, causes the computer to execute the method in any one of the above-mentioned first to second aspects or their respective implementations.
  • A twelfth aspect provides a code stream generated by the encoding method of the second aspect.
  • Through the above technical solutions, isomorphic blocks with different expression formats are spliced into a heterogeneous hybrid splicing picture, and isomorphic blocks with the same expression format are spliced into an isomorphic splicing picture.
  • The splicing picture information includes the first syntax element used to indicate the type of the splicing picture, which improves the decoding efficiency of the splicing picture at the decoding end.
  • Figure 1 is a schematic block diagram of a video encoding and decoding system related to an embodiment of the present application
  • Figure 2A is a schematic block diagram of a video encoder involved in an embodiment of the present application.
  • Figure 2B is a schematic block diagram of a video decoder involved in an embodiment of the present application.
  • Figure 3A is a diagram of the organization and expression framework of multi-viewpoint video data
  • Figure 3B is a schematic diagram of splicing image generation of multi-viewpoint video data
  • Figure 3C is a diagram of the organization and expression framework of point cloud data
  • Figures 3D to 3F are schematic diagrams of different types of point cloud data
  • Figure 4 is a schematic diagram of multi-viewpoint video encoding
  • Figure 5 is a schematic diagram of decoding multi-viewpoint video
  • Figure 6 is a schematic flow chart of an encoding method provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of a heterogeneous hybrid splicing picture provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of an isomorphic splicing picture provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the V3C bitstream structure provided by the embodiment of the present application.
  • Figure 10 is a schematic flow chart of a decoding method provided by an embodiment of the present application.
  • Figure 11 is a schematic block diagram of an encoding device provided by an embodiment of the present application.
  • Figure 12 is a schematic block diagram of a decoding device provided by an embodiment of the present application.
  • Figure 13 is a schematic block diagram of an encoder provided by an embodiment of the present application.
  • Figure 14 is a schematic block diagram of a decoder provided by an embodiment of the present application.
  • Figure 15 is a schematic structural diagram of a coding and decoding system provided by an embodiment of the present application.
  • This application can be applied to the fields of image encoding and decoding, video encoding and decoding, hardware video encoding and decoding, dedicated circuit video encoding and decoding, real-time video encoding and decoding, etc.
  • In some embodiments, the solution of this application can be combined with video coding standards, such as the audio video coding standard (AVS for short), the H.264/advanced video coding (AVC for short) standard, the H.265/high efficiency video coding (HEVC) standard and the H.266/versatile video coding (VVC) standard.
  • In some embodiments, the solution of this application can operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multi-view video coding (MVC) extensions.
  • The high-degree-of-freedom immersive coding system can be roughly divided into the following stages along the task pipeline: data collection, data organization and expression, data encoding and compression, data decoding and reconstruction, and data synthesis and rendering, finally presenting the target data to the user.
  • the encoding involved in the embodiment of the present application is mainly video encoding and decoding. To facilitate understanding, the video encoding and decoding system involved in the embodiment of the present application is first introduced with reference to Figure 1 .
  • Figure 1 is a schematic block diagram of a video encoding and decoding system related to an embodiment of the present application. It should be noted that Figure 1 is only an example, and the video encoding and decoding system in the embodiment of the present application includes but is not limited to what is shown in Figure 1 .
  • the video encoding and decoding system 100 includes an encoding device 110 and a decoding device 120 .
  • the encoding device is used to encode the video data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device.
  • the decoding device decodes the code stream generated by the encoding device to obtain decoded video data.
  • The encoding device 110 in the embodiment of the present application can be understood as a device with a video encoding function, and the decoding device 120 can be understood as a device with a video decoding function. That is, the embodiment of the present application covers a wide range of devices for the encoding device 110 and the decoding device 120, including smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, and the like.
  • the encoding device 110 may transmit the encoded video data (eg, code stream) to the decoding device 120 via the channel 130 .
  • Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120 .
  • channel 130 includes one or more communication media that enables encoding device 110 to transmit encoded video data directly to decoding device 120 in real time.
  • encoding device 110 may modulate the encoded video data according to the communication standard and transmit the modulated video data to decoding device 120.
  • the communication media includes wireless communication media, such as radio frequency spectrum.
  • the communication media may also include wired communication media, such as one or more physical transmission lines.
  • channel 130 includes a storage medium that can store video data encoded by encoding device 110 .
  • Storage media include a variety of local access data storage media, such as optical disks, DVDs, flash memories, etc.
  • the decoding device 120 may obtain the encoded video data from the storage medium.
  • channel 130 may include a storage server that may store video data encoded by encoding device 110 .
  • the decoding device 120 may download the stored encoded video data from the storage server.
  • The storage server may store the encoded video data and transmit it to the decoding device 120; examples include a web server (e.g., for a website), a File Transfer Protocol (FTP) server, and the like.
  • the encoding device 110 includes a video encoder 112 and an output interface 113.
  • the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
  • The encoding device 110 may include a video source 111 in addition to the video encoder 112 and the output interface 113.
  • Video source 111 may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface for receiving video data from a video content provider, and a computer graphics system for generating video data.
  • the video encoder 112 encodes the video data from the video source 111 to generate a code stream.
  • Video data may include one or more pictures or a sequence of pictures.
  • the code stream contains the encoding information of an image or image sequence in the form of a bit stream.
  • Encoded information may include encoded image data and associated data.
  • the associated data may include sequence parameter set (SPS), picture parameter set (PPS) and other syntax structures.
  • An SPS can contain parameters that apply to one or more sequences.
  • a PPS can contain parameters that apply to one or more images.
  • a syntax structure refers to a collection of zero or more syntax elements arranged in a specified order in a code stream.
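  • As an illustration of how individual syntax elements are pulled from such a code stream: many SPS and PPS fields in H.264/HEVC are coded as unsigned Exp-Golomb codes, written ue(v). Below is a minimal sketch of such a parser; the bit-string interface is a simplification of the bit readers real decoders use.

```python
def read_ue(bits, pos=0):
    """Decode one unsigned Exp-Golomb (ue(v)) syntax element from a string
    of '0'/'1' characters, starting at index `pos`.

    A ue(v) codeword is: N leading zero bits, a '1', then N info bits;
    the decoded value is 2**N - 1 + info.  Returns (value, next_pos).
    """
    zeros = 0
    while bits[pos + zeros] == "0":  # count the leading-zero prefix
        zeros += 1
    info = bits[pos + zeros + 1 : pos + 2 * zeros + 1]
    value = (1 << zeros) - 1 + (int(info, 2) if info else 0)
    return value, pos + 2 * zeros + 1
```

  • For example, the codewords "1", "010", "011" and "00100" decode to the values 0, 1, 2 and 3 respectively.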
  • the video encoder 112 transmits the encoded video data directly to the decoding device 120 via the output interface 113 .
  • the encoded video data can also be stored on a storage medium or storage server for subsequent reading by the decoding device 120 .
  • decoding device 120 includes input interface 121 and video decoder 122. In some embodiments, in addition to the input interface 121 and the video decoder 122, the decoding device 120 may also include a display device 123.
  • the input interface 121 includes a receiver and/or a modem. Input interface 121 may receive encoded video data over channel 130.
  • the video decoder 122 is used to decode the encoded video data to obtain decoded video data, and transmit the decoded video data to the display device 123.
  • the display device 123 displays the decoded video data.
  • Display device 123 may be integrated with decoding device 120 or external to decoding device 120 .
  • Display device 123 may include a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
  • Figure 1 is only an example, and the technical solution of the embodiment of the present application is not limited to Figure 1.
  • the technology of the present application can also be applied to unilateral video encoding or unilateral video decoding.
  • FIG. 2A is a schematic block diagram of a video encoder related to an embodiment of the present application. It should be understood that the video encoder 200 can be used to perform lossy compression or lossless compression of images.
  • The lossless compression can be visually lossless compression or mathematically lossless compression.
  • the video encoder 200 can be applied to image data in a luminance-chrominance (YCbCr, YUV) format.
  • The YUV ratio can be 4:2:0, 4:2:2 or 4:4:4, where Y represents luminance (Luma), Cb (U) represents blue chrominance, Cr (V) represents red chrominance, and U and V together represent chrominance (Chroma), which describes color and saturation.
  • 4:2:0 means that every 4 pixels have 4 luminance components and 2 chrominance components (YYYYCbCr);
  • 4:2:2 means that every 4 pixels have 4 luminance components and 4 chrominance components (YYYYCbCrCbCr);
  • 4:4:4 means full sampling (YYYYCbCrCbCrCbCrCbCr).
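  • The three sampling ratios determine the chroma plane dimensions relative to the luma plane; the relationship can be captured in a small helper (a plain illustration, not any codec's API):

```python
def chroma_plane_size(width, height, subsampling):
    """Chroma plane dimensions for a luma plane of width x height.

    '4:2:0' halves the chroma resolution horizontally and vertically,
    '4:2:2' halves it horizontally only, and '4:4:4' keeps full resolution.
    """
    if subsampling == "4:2:0":
        return width // 2, height // 2
    if subsampling == "4:2:2":
        return width // 2, height
    if subsampling == "4:4:4":
        return width, height
    raise ValueError("unknown subsampling: " + subsampling)
```

  • For a 1920x1080 luma plane, 4:2:0 therefore yields two 960x540 chroma planes.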
  • the video encoder 200 reads video data, and for each frame of image in the video data, divides one frame of image into several coding tree units (coding tree units, CTU).
  • A CTU may also be called a "tree block", a "largest coding unit" (LCU for short) or a "coding tree block" (CTB for short).
  • Each CTU can be associated with an equal-sized block of pixels within the image.
  • Each pixel can correspond to one luminance (luminance or luma) sample and two chrominance (chrominance or chroma) samples. Therefore, each CTU can be associated with one block of luma samples and two blocks of chroma samples.
  • a CTU size is, for example, 128 ⁇ 128, 64 ⁇ 64, 32 ⁇ 32, etc.
  • a CTU can be further divided into several coding units (Coding Units, CUs) for encoding.
  • CUs can be rectangular blocks or square blocks.
  • A CU can be further divided into prediction units (PU for short) and transform units (TU for short), enabling coding, prediction and transformation to be separated and making processing more flexible.
  • the CTU is divided into CUs in a quad-tree manner, and the CU is divided into TUs and PUs in a quad-tree manner.
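  • The quad-tree division of a CTU into CUs can be sketched as a small recursion. The `should_split` callback stands in for the encoder's rate-distortion decision, which is not specified here; in a real codec the resulting split flags are signalled in the bitstream.

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Recursively split the square region at (x, y) of the given size
    into CUs, returning a list of (x, y, size) leaves.

    A region becomes a leaf CU when it reaches min_size or when the
    split decision callback declines to split it further.
    """
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]
    half = size // 2
    cus = []
    for dy in (0, half):          # four equal quadrants
        for dx in (0, half):
            cus.extend(quadtree_split(x + dx, y + dy, half,
                                      min_size, should_split))
    return cus
```

  • Splitting a 64x64 CTU down to 32x32 everywhere yields 4 CUs; down to 16x16 everywhere, 16 CUs.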
  • Video encoders and video decoders can support various PU sizes. Assuming that the size of a specific CU is 2N×2N, video encoders and decoders can support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PU sizes of 2N×2N, 2N×N, N×2N, N×N or similar for inter prediction; they can also support asymmetric PU sizes of 2N×nU, 2N×nD, nL×2N and nR×2N for inter prediction.
  • The video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform/quantization unit 230, an inverse transform/quantization unit 240, a reconstruction unit 250, a loop filtering unit 260, a decoded image cache 270 and an entropy encoding unit 280. It should be noted that the video encoder 200 may include more, fewer or different functional components.
  • the current block may be called the current coding unit (CU) or the current prediction unit (PU), etc.
  • the prediction block may also be called a predicted image block or an image prediction block
  • the reconstructed image block may also be called a reconstruction block or an image reconstructed image block.
  • prediction unit 210 includes inter prediction unit 211 and intra estimation unit 212. Since there is a strong correlation between adjacent pixels in a video frame, the intra-frame prediction method is used in video encoding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Since there is a strong similarity between adjacent frames in the video, the interframe prediction method is used in video coding and decoding technology to eliminate the temporal redundancy between adjacent frames, thereby improving coding efficiency.
  • the inter-frame prediction unit 211 can be used for inter-frame prediction.
  • Inter-frame prediction can include motion estimation and motion compensation, and can refer to image information of different frames.
  • Inter-frame prediction uses motion information to find a reference block in a reference frame and generates a prediction block based on the reference block to eliminate temporal redundancy. The frames used in inter-frame prediction can be P frames and/or B frames, where P frames are forward-predicted frames and B frames are bidirectionally predicted frames.
  • The motion information includes the reference frame list in which the reference frame is located, the reference frame index and the motion vector.
  • The motion vector can have whole-pixel or sub-pixel precision.
  • The block of whole pixels or sub-pixels found in the reference frame according to the motion vector is called the reference block.
  • Some technologies directly use the reference block as the prediction block, while others further process the reference block to generate the prediction block. Processing the reference block to generate a prediction block can also be understood as taking the reference block as a prediction block and then processing it to generate a new prediction block.
  • the intra-frame estimation unit 212 only refers to the information of the same frame image and predicts the pixel information in the current coded image block to eliminate spatial redundancy.
  • the frames used in intra prediction may be I frames.
  • Intra-frame prediction has multiple prediction modes. Taking the international digital video coding standard H series as an example, the H.264/AVC standard has 8 angular prediction modes and 1 non-angular prediction mode, and H.265/HEVC extends this to 33 angular prediction modes and 2 non-angular prediction modes.
  • the intra-frame prediction modes used by HEVC include planar mode (Planar), DC and 33 angle modes, for a total of 35 prediction modes.
  • the intra-frame modes used by VVC include Planar, DC and 65 angle modes, for a total of 67 prediction modes.
  • Residual unit 220 may generate a residual block of the CU based on the pixel block of the CU and the prediction block of the PU of the CU. For example, residual unit 220 may generate the residual block of the CU such that each sample in the residual block has a value equal to the difference between the corresponding sample in the pixel block of the CU and the corresponding sample in the prediction block of the PU of the CU.
  • Transform/quantization unit 230 may quantize the transform coefficients. Transform/quantization unit 230 may quantize transform coefficients associated with the TU of the CU based on quantization parameter (QP) values associated with the CU. Video encoder 200 may adjust the degree of quantization applied to transform coefficients associated with the CU by adjusting the QP value associated with the CU.
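  • The QP-controlled degree of quantization can be illustrated with the well-known H.264/HEVC relationship in which the quantization step size roughly doubles for every increase of 6 in QP. This is a simplified scalar model that ignores the integer scaling tables and rounding offsets real codecs use.

```python
def quantization_step(qp):
    """Approximate H.264/HEVC quantization step size: Qstep = 1.0 at
    QP = 4 and doubles every 6 QP values."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeff, qp):
    """Scalar quantization of one transform coefficient: divide by the
    step size and round to the nearest integer level."""
    return int(round(coeff / quantization_step(qp)))
```

  • For example, at QP = 22 the step size is 8, so a coefficient of 80 quantizes to level 10; raising QP coarsens the levels and lowers the bitrate at the cost of quality.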
  • Inverse transform/quantization unit 240 may apply inverse quantization and inverse transform to the quantized transform coefficients, respectively, to reconstruct the residual block from the quantized transform coefficients.
  • Reconstruction unit 250 may add samples of the reconstructed residual block to corresponding samples of one or more prediction blocks generated by prediction unit 210 to produce a reconstructed image block associated with the TU. By reconstructing blocks of samples for each TU of a CU in this manner, video encoder 200 can reconstruct blocks of pixels of the CU.
  • The loop filtering unit 260 is used to process the inversely transformed and inversely quantized pixels to compensate for distortion information and provide a better reference for subsequent pixel encoding. For example, a deblocking filtering operation can be performed to reduce the blocking artifacts of pixel blocks associated with the CU.
  • The loop filtering unit 260 includes a deblocking filtering unit and a sample adaptive offset/adaptive loop filtering (SAO/ALF) unit, where the deblocking filtering unit is used to remove blocking effects and the SAO/ALF unit is used to remove ringing effects.
  • Decoded image cache 270 may store reconstructed pixel blocks.
  • Inter prediction unit 211 may perform inter prediction on PUs of other images using reference images containing reconstructed pixel blocks.
  • intra estimation unit 212 may use the reconstructed pixel blocks in decoded image cache 270 to perform intra prediction on other PUs in the same image as the CU.
  • Entropy encoding unit 280 may receive the quantized transform coefficients from transform/quantization unit 230 . Entropy encoding unit 280 may perform one or more entropy encoding operations on the quantized transform coefficients to generate entropy encoded data.
  • FIG. 2B is a schematic block diagram of a video decoder related to an embodiment of the present application.
  • the video decoder 300 includes an entropy decoding unit 310 , a prediction unit 320 , an inverse quantization/transformation unit 330 , a reconstruction unit 340 , a loop filtering unit 350 and a decoded image cache 360 . It should be noted that the video decoder 300 may include more, less, or different functional components.
  • Video decoder 300 can receive the code stream.
  • Entropy decoding unit 310 may parse the codestream to extract syntax elements from the codestream. As part of parsing the code stream, the entropy decoding unit 310 may parse entropy-encoded syntax elements in the code stream.
  • the prediction unit 320, the inverse quantization/transformation unit 330, the reconstruction unit 340 and the loop filtering unit 350 may decode the video data according to the syntax elements extracted from the code stream, that is, generate decoded video data.
  • prediction unit 320 includes inter prediction unit 321 and intra estimation unit 322.
  • Intra estimation unit 322 may perform intra prediction to generate predicted blocks for the PU. Intra estimation unit 322 may use an intra prediction mode to generate predicted blocks for a PU based on pixel blocks of spatially neighboring PUs. Intra estimation unit 322 may also determine the intra prediction mode of the PU based on one or more syntax elements parsed from the codestream.
  • the inter prediction unit 321 may construct a first reference image list (List 0) and a second reference image list (List 1) according to syntax elements parsed from the code stream. Additionally, if the PU uses inter-prediction encoding, entropy decoding unit 310 may parse the motion information of the PU. Inter prediction unit 321 may determine one or more reference blocks for the PU based on the motion information of the PU. Inter prediction unit 321 may generate a predictive block for the PU based on one or more reference blocks of the PU.
  • Inverse quantization/transform unit 330 may inversely quantize (ie, dequantize) transform coefficients associated with a TU. Inverse quantization/transform unit 330 may use the QP value associated with the CU of the TU to determine the degree of quantization.
  • inverse quantization/transform unit 330 may apply one or more inverse transforms to the inverse quantized transform coefficients to produce a residual block associated with the TU.
  • Reconstruction unit 340 uses the residual blocks associated with the TU of the CU and the prediction blocks of the PU of the CU to reconstruct the pixel blocks of the CU. For example, reconstruction unit 340 may add samples of the residual block to corresponding samples of the prediction block to reconstruct the pixel block of the CU to obtain a reconstructed image block.
  • Loop filtering unit 350 may perform deblocking filtering operations to reduce blocking artifacts for blocks of pixels associated with the CU.
  • Video decoder 300 may store the reconstructed image of the CU in decoded image cache 360 .
  • the video decoder 300 may use the reconstructed image in the decoded image cache 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.
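  • The reconstruction performed by reconstruction unit 340 (adding residual samples to prediction samples and clipping to the valid sample range) can be sketched as follows; this is an illustrative Python sketch with a hypothetical function name, not code from any codec specification.

```python
def reconstruct_block(pred_block, residual_block, bit_depth=8):
    """Add residual samples to prediction samples and clip to the valid range."""
    max_val = (1 << bit_depth) - 1
    return [
        [max(0, min(max_val, p + r)) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred_block, residual_block)
    ]

# A 2x2 block: clipping keeps samples inside [0, 255] for 8-bit video.
pred = [[100, 250], [0, 128]]
resid = [[10, 20], [-5, 0]]
recon = reconstruct_block(pred, resid)
```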
  • the basic process of video encoding and decoding is as follows: at the encoding end, an image frame is divided into blocks.
  • the prediction unit 210 uses intra prediction or inter prediction to generate a prediction block of the current block.
  • the residual unit 220 may calculate a residual block based on the prediction block and the original block of the current block, that is, the difference between the prediction block and the original block of the current block.
  • the residual block may also be called residual information.
  • the residual block is transformed and quantized by the transform/quantization unit 230 to remove information to which the human eye is insensitive, thereby eliminating visual redundancy.
  • the residual block before transformation and quantization by the transform/quantization unit 230 may be called a time-domain residual block, and the residual block after transformation and quantization by the transform/quantization unit 230 may be called a frequency residual block or frequency-domain residual block.
  • the entropy encoding unit 280 receives the quantized transform coefficients output from the transform/quantization unit 230, and may perform entropy encoding on the quantized transform coefficients to output a code stream. For example, the entropy encoding unit 280 may eliminate character redundancy according to the target context model and the probability information of the binary code stream.
  • the entropy decoding unit 310 can parse the code stream to obtain the prediction information, quantization coefficient matrix, etc. of the current block.
  • the prediction unit 320 uses intra prediction or inter prediction for the current block based on the prediction information to generate a prediction block of the current block.
  • the inverse quantization/transform unit 330 uses the quantization coefficient matrix obtained from the code stream to perform inverse quantization and inverse transformation on the quantization coefficient matrix to obtain a residual block.
  • the reconstruction unit 340 adds the prediction block and the residual block to obtain a reconstruction block.
  • the reconstructed blocks constitute a reconstructed image, and the loop filtering unit 350 performs loop filtering on the reconstructed image based on the image or based on the blocks to obtain a decoded image.
  • the encoding end also needs to perform operations similar to those of the decoding end to obtain the decoded image.
  • the decoded image may also be called a reconstructed image, and the reconstructed image may be used as a reference frame for inter-frame prediction of subsequent frames.
  • the block division information determined by the encoding end as well as mode information or parameter information such as prediction, transformation, quantization, entropy coding, loop filtering, etc., are carried in the code stream when necessary.
  • the decoding end determines, by parsing the code stream and analyzing existing information, the same block division information as the encoding end, as well as the same mode information or parameter information for prediction, transformation, quantization, entropy coding, loop filtering, and so on, thereby ensuring that the decoded image obtained by the encoding end is identical to the decoded image obtained by the decoding end.
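  • The requirement that encoder and decoder reconstruct identical values can be illustrated with a uniform scalar quantization round trip; the sketch below is hypothetical (a real codec uses QP-dependent scaling and transforms), with `qstep` standing in for the quantization step size, and the round trip is deliberately lossy.

```python
def quantize(coeffs, qstep):
    """Encoder side: map transform coefficients to integer levels."""
    return [round(c / qstep) for c in coeffs]

def dequantize(levels, qstep):
    """Both ends: reconstruct coefficients from the transmitted levels."""
    return [l * qstep for l in levels]

coeffs = [13, -7, 4, 0]
levels = quantize(coeffs, qstep=4)   # what the encoder writes to the stream
recon = dequantize(levels, qstep=4)  # identical at encoder and decoder
```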
  • the current encoding and decoding methods include at least the following two:
  • Method 1: For multi-viewpoint videos, MPEG (Moving Picture Experts Group) immersive video (MIV) technology is used for encoding and decoding; for point clouds, video-based point cloud compression (VPCC) technology is used for encoding and decoding.
  • In order to reduce the transmission pixel rate while retaining as much scene information as possible, so that there is enough information for rendering the target view, the scheme adopted by MPEG-I is shown in Figure 3A.
  • A limited number of viewpoints are selected as base viewpoints, covering the visible range of the scene as much as possible.
  • The base viewpoints are transmitted as complete images, and the redundant pixels between the remaining non-base viewpoints and the base viewpoints are removed; that is, only the effective, non-repeated information is retained. The effective information is then extracted into sub-block images and reorganized together with the base viewpoint images to form a larger rectangular image, which is called a spliced image.
  • Figure 3A and Figure 3B give a schematic process of generating a spliced image.
  • the spliced image is sent to the codec for compression and reconstruction, and the auxiliary data related to the sub-block image splicing information is also sent to the encoder to form a code stream.
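  • The reorganization of sub-block images into one larger rectangle can be sketched with a simple shelf-packing routine; this is only an illustrative stand-in for the real MIV packing process, and the function name and strategy (tallest-first shelves) are assumptions.

```python
def shelf_pack(patch_sizes, atlas_width):
    """Place (w, h) patches left-to-right on shelves; return positions and atlas height."""
    positions, x, y, shelf_h = [], 0, 0, 0
    for w, h in sorted(patch_sizes, key=lambda s: -s[1]):  # tallest first
        if x + w > atlas_width:          # row is full: start a new shelf
            x, y, shelf_h = 0, y + shelf_h, 0
        positions.append((x, y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return positions, y + shelf_h

positions, height = shelf_pack([(4, 2), (3, 3), (5, 2)], atlas_width=8)
```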
  • the encoding method of VPCC is to project point clouds into two-dimensional images or videos, converting three-dimensional information into two-dimensional information for encoding.
  • Figure 3C is the coding block diagram of VPCC.
  • the code stream is roughly divided into four parts.
  • the geometric code stream is the code stream generated by geometric depth map encoding, which is used to represent the geometric information of the point cloud;
  • the attribute code stream is the code stream generated by texture map encoding, used to represent the attribute information of the point cloud;
  • the occupancy code stream is the code stream generated by the occupancy map encoding, which is used to indicate the effective area in the depth map and texture map;
  • These three types of videos all use video encoders for encoding and decoding.
  • the auxiliary information code stream is the code stream generated by encoding the auxiliary information of the sub-block image, which is the part related to the patch data unit in the V3C standard, indicating the position and size of each sub-block image.
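  • The four-way split described above can be modelled as routing coded units into the four sub-streams; the container and type labels below are an illustrative sketch, not the V3C bitstream syntax.

```python
def demux_v3c_units(units):
    """Group (unit_type, payload) pairs into the four sub-streams named in the text."""
    streams = {"geometry": [], "attribute": [], "occupancy": [], "auxiliary": []}
    for unit_type, payload in units:
        streams[unit_type].append(payload)
    return streams

streams = demux_v3c_units([
    ("geometry", b"depth0"), ("attribute", b"tex0"),
    ("occupancy", b"occ0"), ("auxiliary", b"patch-info"), ("geometry", b"depth1"),
])
```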
  • Method 2 Multi-viewpoint videos and point clouds are encoded and decoded using the frame packing technology in Visual Volumetric Video-based Coding (V3C).
  • the encoding end includes the following steps:
  • Step 1 When encoding the acquired multi-view video, perform some pre-processing to generate multi-view video sub-blocks (patch). Then, organize the multi-view video sub-blocks to generate a multi-view video splicing image.
  • multi-viewpoint videos are input into TMIV for packing, and a multi-viewpoint video spliced image is output.
  • TMIV is the reference software for MIV.
  • Packaging in the embodiment of this application can be understood as splicing.
  • the multi-viewpoint video mosaic includes a multi-view video texture mosaic and a multi-view video geometry mosaic, that is, it only contains multi-view video sub-blocks.
  • Step 2 Input the multi-viewpoint video splicing image into the frame packer and output the multi-viewpoint video mixed splicing image.
  • the multi-viewpoint video hybrid splicing image includes a multi-viewpoint video texture blending splicing image, a multi-viewpoint video geometry blending splicing image, and a multi-viewpoint video texture and geometry blending splicing image.
  • the multi-viewpoint video splicing image is frame packed to generate a multi-viewpoint video hybrid splicing image.
  • Each multi-viewpoint video splicing image occupies a region of the multi-viewpoint video hybrid splicing image.
  • a flag pin_region_type_id_minus2 must be transmitted for each region in the code stream. This flag records whether the current region belongs to a multi-viewpoint video texture spliced image or a multi-viewpoint video geometry spliced image; this information is needed at the decoding end.
  • Step 3 Use a video encoder to encode the multi-viewpoint video mixed splicing image to obtain a code stream.
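  • The encoder-side frame packing of Step 2 can be sketched as follows: each spliced image becomes one region of the mixed image, and a per-region type flag is recorded. Following the `_minus2` naming convention, the sketch stores the region type id minus 2; the numeric unit-type ids (V3C_GVD = 3, V3C_AVD = 4) are assumptions taken from our reading of the V3C unit-type table.

```python
V3C_GVD, V3C_AVD = 3, 4  # unit-type ids assumed from the V3C unit-type table

def pack_regions(splice_images):
    """Record, for each region, the flag pin_region_type_id_minus2 = type_id - 2."""
    regions = []
    for type_id, image in splice_images:
        regions.append({"pin_region_type_id_minus2": type_id - 2, "image": image})
    return regions

regions = pack_regions([(V3C_AVD, "texture-splice"), (V3C_GVD, "geometry-splice")])
```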
  • the decoding end includes the following steps:
  • Step 1 During multi-viewpoint video decoding, input the obtained code stream into the video decoder for decoding to obtain a reconstructed multi-viewpoint video mixed splicing image.
  • Step 2 Input the reconstructed multi-viewpoint video mixed splicing image into the frame depacker and output the reconstructed multi-viewpoint video splicing image.
  • the flag pin_region_type_id_minus2 is obtained from the code stream. If it indicates V3C_AVD, the current region is a multi-viewpoint video texture spliced image, and the current region is split off and output as a reconstructed multi-viewpoint video texture spliced image.
  • If pin_region_type_id_minus2 indicates V3C_GVD, the current region is a multi-viewpoint video geometry spliced image, and the current region is split off and output as a reconstructed multi-viewpoint video geometry spliced image.
  • Step 3 Decode the reconstructed multi-viewpoint video splicing image to obtain the reconstructed multi-viewpoint video.
  • the multi-viewpoint video texture splicing image and the multi-viewpoint video geometric splicing image are decoded to obtain the reconstructed multi-viewpoint video.
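  • The decoder-side dispatch in Step 2 can be sketched as the inverse of packing: read each region's flag, add 2 to recover the type id, and route the region to the texture or geometry output. The numeric ids (V3C_GVD = 3, V3C_AVD = 4) are assumptions taken from our reading of the V3C unit-type table.

```python
V3C_GVD, V3C_AVD = 3, 4  # unit-type ids assumed from the V3C unit-type table

def depack_regions(regions):
    """Split regions of the mixed spliced image by pin_region_type_id_minus2."""
    texture, geometry = [], []
    for region in regions:
        type_id = region["pin_region_type_id_minus2"] + 2
        if type_id == V3C_AVD:
            texture.append(region["image"])
        elif type_id == V3C_GVD:
            geometry.append(region["image"])
    return texture, geometry

texture, geometry = depack_regions([
    {"pin_region_type_id_minus2": V3C_AVD - 2, "image": "tex-region"},
    {"pin_region_type_id_minus2": V3C_GVD - 2, "image": "geo-region"},
])
```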
  • the above uses multi-viewpoint video as an example to analyze and introduce frame packing technology.
  • the frame packing encoding and decoding method for point clouds is basically the same as that for the multi-viewpoint video described above, and may be referred to accordingly.
  • In TMC (a VPCC reference software), the point cloud is packed to obtain a point cloud spliced image.
  • The point cloud spliced image is input into the frame packer for frame packing to obtain a point cloud hybrid spliced image.
  • The point cloud hybrid spliced image is encoded to obtain a point cloud code stream. Details are not repeated here.
  • V3C unit header syntax is shown in Table 1:
  • V3C unit header semantics are shown in Table 2:
  • the visual media content with multiple different expression formats will be encoded and decoded separately.
  • the current packing technology compresses the point cloud to form a point cloud compression code stream (i.e., one V3C code stream), and compresses the multi-viewpoint video information to obtain a multi-viewpoint video compression code stream (i.e., another V3C code stream); the system layer then multiplexes the compressed code streams to obtain a multiplexed code stream of the fused three-dimensional scene.
  • At the decoding end, the point cloud compression code stream and the multi-viewpoint video compression code stream are decoded separately. It can be seen that when encoding and decoding visual media content in multiple different expression formats, the existing technology uses many codecs and the encoding and decoding cost is high.
  • In view of this, the embodiments of the present application splice homogeneous blocks of different expression formats into a heterogeneous hybrid spliced image, and splice homogeneous blocks of the same expression format into a homogeneous spliced image.
  • The resulting heterogeneous hybrid spliced images and/or homogeneous spliced images are encoded and written into the code stream.
  • Homogeneous spliced images (such as at least one of multi-viewpoint spliced images, point cloud spliced images, and grid spliced images) can coexist in the code stream with heterogeneous hybrid spliced images, expanding the application scenarios of the encoding and decoding methods.
  • the splicing picture information includes a first syntax element used to indicate the type of the splicing picture, which can improve the decoding efficiency of the splicing picture at the decoding end.
  • Figure 6 is a schematic flow chart of the encoding method provided by the embodiment of the present application. As shown in Figure 6, the encoding method includes:
  • Step 601 Process the visual media content of at least one expression format to obtain at least one isomorphic block, where different types of isomorphic blocks correspond to different visual media content expression formats;
  • Visual media objects with different expression formats may appear in the same scene; for example, in the same three-dimensional scene, the scene background and some characters and objects are expressed as video, while another part of the characters are expressed as three-dimensional point clouds or three-dimensional grids.
  • the visual media content includes visual media content in at least one expression format such as multi-view video, point cloud, and grid.
  • In some embodiments, the multi-viewpoint video may be a single-viewpoint video; that is, the multi-viewpoint video may include videos of multiple viewpoints and/or a single-viewpoint video.
  • one isomorphic block corresponds to one expression format.
  • the expression format corresponding to at least one isomorphic block includes at least one of the following: multi-view video, point cloud, and grid.
  • At least two isomorphic blocks correspond to at least two different expression formats.
  • the at least two isomorphic blocks in the embodiment of the present application include isomorphic blocks of at least two different expression formats, such as multi-view video, point cloud, and grid.
  • Each type of isomorphic block may include at least one isomorphic block with the same expression format.
  • For example, a homogeneous block in point cloud format includes one or more point cloud blocks, a homogeneous block in multi-viewpoint video format includes one or more multi-viewpoint video blocks, and a homogeneous block in grid format consists of one or more grid blocks.
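  • The containment rule above (a homogeneous block holds only content of its own expression format) can be captured by a small, hypothetical data model; the class and format names are illustrative.

```python
class HomogeneousBlock:
    """A block whose patches must all share one expression format."""
    def __init__(self, fmt):
        self.fmt = fmt            # e.g. "point_cloud", "multi_view", "grid"
        self.patches = []

    def add_patch(self, patch_fmt, patch_id):
        # Enforce the invariant: no mixing of expression formats inside one block.
        if patch_fmt != self.fmt:
            raise ValueError("patch format must match the block's format")
        self.patches.append(patch_id)

pc_block = HomogeneousBlock("point_cloud")
pc_block.add_patch("point_cloud", 1)
pc_block.add_patch("point_cloud", 2)
```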
  • step 601 may be: processing visual media content in an expression format to obtain a homogeneous block.
  • step 601 may include: processing visual media content in at least two expression formats to obtain at least two isomorphic blocks, where different visual media content corresponds to different expression formats.
  • the visual media content in the first expression format is processed to obtain isomorphic blocks in the first expression format
  • the visual multimedia content in the second expression format is processed to obtain isomorphic blocks in the second expression format.
  • the first expression format is one of multi-view video, point cloud, and grid
  • the second expression format is one of multi-view video, point cloud, and grid
  • the first expression format and the second expression format are different expression formats.
  • the above-mentioned visual media content includes visual media content in at least one expression format such as multi-viewpoint video, point cloud, grid, etc.
  • the visual media content is processed to obtain isomorphic blocks of an expression format.
  • the visual media content of multiple expression formats is included, the visual media content is processed to obtain isomorphic blocks of multiple expression formats.
  • blocks can also be called tiles; that is, point cloud blocks can also be called point cloud strips, multi-viewpoint video blocks can also be called multi-viewpoint video strips, and grid blocks can also be called grid strips.
  • the block may be a mosaic of a specific shape, for example, a mosaic of a rectangular area with a specific length and/or height.
  • at least one sub-tile can be spliced in an orderly manner, such as from large to small according to the area of the sub-tiles, or from large to small according to the length and/or height of the sub-tiles, to obtain the visual media content corresponding to block.
  • a tile can be mapped exactly to an atlas tile.
  • each sub-tile in a block may have a patch ID (patchID) to distinguish different sub-tiles in the same block.
  • the same block may include sub-patch 1 (patch1), sub-patch 2 (patch2), and sub-patch 3 (patch3).
  • each sub-block in the isomorphic block is a multi-view video sub-block, or is a point cloud sub-block, etc.
  • the expression format of a sub-block is the expression format corresponding to the isomorphic block to which it belongs.
  • homogeneous tiles may have tile identifiers (tileIDs) to distinguish different tiles of the same expression format.
  • the point cloud block may include point cloud block 1 or point cloud block 2.
  • multiple visual media contents include point clouds and multi-viewpoint videos.
  • the point clouds are processed to obtain point cloud blocks.
  • Point cloud block 1 includes point cloud sub-blocks 1 to 3.
  • The multi-viewpoint video is processed to obtain a multi-viewpoint video block, which includes multi-viewpoint video sub-blocks 1 to 4.
  • a homogeneous block of the expression format is obtained.
  • when visual media content in at least two expression formats needs to be processed, isomorphic blocks of at least two expression formats are obtained.
  • embodiments of the present application process the at least two visual media contents, such as packaging (also called splicing) processing, to obtain blocks corresponding to each visual media content in the at least two visual media contents.
  • the block can be obtained by splicing sub-tiles (patches) corresponding to at least two visual media contents. It should be noted that the embodiment of the present application processes at least two visual media contents separately, and the method of obtaining blocks is not limited.
  • the visual media content includes visual media content in two expression formats: multi-view video and point cloud.
  • the visual media content in at least one expression format is processed to obtain at least one isomorphic block, including: after projecting and de-redundancy processing the acquired multi-viewpoint video, connecting non-repeated pixels into video sub-blocks, and splicing the video sub-blocks into multi-viewpoint video blocks; and performing parallel projection on the acquired point cloud, forming point cloud sub-blocks from the connected points in the projection planes, and splicing the point cloud sub-blocks into point cloud blocks.
  • Specifically, a limited number of viewpoints are selected as base viewpoints, expressing the visible range of the scene as much as possible.
  • The base viewpoints are transmitted as complete images, and the redundant pixels between the remaining non-base viewpoints and the base viewpoints are removed; that is, only the effective, non-repeated information is retained. The effective information is then extracted into sub-block images and reorganized together with the base viewpoint images to form a larger strip-shaped image.
  • This strip-shaped image is called a multi-viewpoint video block.
  • the above-mentioned visual media content is media content presented simultaneously in the same three-dimensional space. In some embodiments, the visual media content is media content presented at different times in the same three-dimensional space. In some embodiments, the above-mentioned visual media content may also be media content in different three-dimensional spaces. That is to say, in the embodiments of this application, there are no specific restrictions on the at least two visual media contents mentioned above.
  • Step 602 Splice the at least one isomorphic block to obtain at least one spliced image and spliced image information, wherein the spliced image information includes a first syntax element, and it is determined according to the first syntax element whether the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image.
  • the heterogeneous hybrid splicing diagram includes at least two types of isomorphic blocks, and the isomorphic splicing diagram includes one type of isomorphic block;
  • the splicing of the at least one homogeneous block to obtain at least one splicing diagram and splicing diagram information includes: heterogeneously splicing homogeneous blocks of at least two expression formats to generate Heterogeneous mixed splicing diagrams and splicing diagram information; isomorphic splicing of homogeneous blocks with the same expression format to generate isomorphic splicing diagrams and splicing diagram information.
  • For example, the at least one isomorphic block includes homogeneous blocks of a first expression format and homogeneous blocks of a second expression format.
  • In this case, the method specifically includes: isomorphically splicing the homogeneous blocks of the first expression format to obtain a first homogeneous spliced image and spliced image information, and isomorphically splicing the homogeneous blocks of the second expression format to obtain a second homogeneous spliced image and spliced image information; or, heterogeneously splicing the homogeneous blocks of the first expression format with the homogeneous blocks of the second expression format to obtain a heterogeneous hybrid spliced image and spliced image information; or, isomorphically splicing some homogeneous blocks of the first expression format to obtain a first homogeneous spliced image and spliced image information, and heterogeneously splicing other homogeneous blocks of the first expression format with the homogeneous blocks of the second expression format to obtain a heterogeneous hybrid spliced image and spliced image information.
  • the homogeneous splicing diagram may include one isomorphic block or multiple isomorphic blocks of the same expression format, and the heterogeneous mixed splicing diagram includes at least two isomorphic blocks of at least two expression formats.
  • the first expression format is one of multi-view video, point cloud, and grid
  • the second expression format is one of multi-view video, point cloud, and grid
  • the first expression format and the second expression format are different expression formats. As shown in Figure 7, multi-viewpoint video block 1, multi-viewpoint video block 2 and point cloud block 1 are spliced to obtain a heterogeneous hybrid spliced image.
  • the first expression format is multi-viewpoint video
  • the second expression format is point cloud.
  • the splicing of the at least one homogeneous block to obtain at least one spliced image and spliced image information includes: splicing a part of the multi-viewpoint video block and a part of the point cloud block into a heterogeneous hybrid spliced image; Part of the multi-viewpoint video blocks are spliced into a multi-viewpoint spliced image; another part of the point cloud blocks are spliced into a point cloud spliced image.
  • the mosaic image information includes a first syntax element, and the first syntax element is used to indicate that the mosaic image is a heterogeneous hybrid mosaic image or a homogeneous mosaic image.
  • In some embodiments, determining according to the first syntax element whether the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image includes: if the first syntax element is a first preset value, determining that the spliced image is a heterogeneous hybrid spliced image including homogeneous blocks of a first expression format and a second expression format, where the first expression format and the second expression format are different expression formats; if the first syntax element is a second preset value, determining that the spliced image is a homogeneous spliced image including the homogeneous blocks of the first expression format; and if the first syntax element is a third preset value, determining that the spliced image is a homogeneous spliced image including the homogeneous blocks of the second expression format.
  • That is to say, different values are set for the first syntax element to indicate the spliced image type.
  • the first syntax element can also be set to other values to indicate that the spliced graph is a isomorphic spliced graph that includes isomorphic blocks of other expression formats, or to indicate that the spliced graph includes at least two other expressions. Heterogeneous mosaic graph of homogeneous blocks in format.
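  • The value-to-type mapping described above can be sketched as a lookup table; the concrete values 0, 1, 2 and the format names are illustrative placeholders, since the text only requires that the preset values be distinct.

```python
# Illustrative preset values; the text only requires that they be distinct.
SPLICE_TYPE_BY_VALUE = {
    0: ("heterogeneous", {"multi_view", "point_cloud"}),   # first preset value
    1: ("homogeneous", {"multi_view"}),                    # second preset value
    2: ("homogeneous", {"point_cloud"}),                   # third preset value
}

def splice_type(first_syntax_element):
    """Return (spliced image type, expression formats present) for a parsed value."""
    return SPLICE_TYPE_BY_VALUE[first_syntax_element]
```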
  • the first syntax element includes at least two sub-syntax elements.
  • For example, the first syntax element includes a first sub-syntax element and a second sub-syntax element, and whether the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image is determined according to the first sub-syntax element and the second sub-syntax element.
  • Specifically, determining according to the first syntax element whether the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image includes: if the first sub-syntax element is a fourth preset value, determining that the spliced image includes homogeneous blocks of the first expression format; and if the second sub-syntax element is a fifth preset value, determining that the spliced image includes homogeneous blocks of the second expression format.
  • If it is determined that the spliced image includes only homogeneous blocks of the first expression format, the spliced image is a homogeneous spliced image of the first expression format.
  • If it is determined that the spliced image includes both homogeneous blocks of the first expression format and homogeneous blocks of the second expression format, the spliced image is a heterogeneous hybrid spliced image including homogeneous blocks of the first expression format and the second expression format.
  • In some embodiments, the method further includes: when the first sub-syntax element is a sixth preset value, determining that the spliced image does not include homogeneous blocks of the first expression format; and when the second sub-syntax element is a seventh preset value, determining that the spliced image does not include homogeneous blocks of the second expression format.
  • If the first sub-syntax element is the fourth preset value and the second sub-syntax element is the fifth preset value, the spliced image is a heterogeneous hybrid spliced image including homogeneous blocks of the first expression format and the second expression format.
  • If the first sub-syntax element is the fourth preset value and the second sub-syntax element is the seventh preset value, the spliced image is a homogeneous spliced image including the homogeneous blocks of the first expression format.
  • If the first sub-syntax element is the sixth preset value and the second sub-syntax element is the fifth preset value, the spliced image is a homogeneous spliced image including the homogeneous blocks of the second expression format.
  • the expression format of the homogeneous blocks in the spliced image can also be determined based on the values of the two sub-syntax elements.
  • multiple syntax elements can also be used to indicate the expression formats of the isomorphic blocks in the splicing diagram. For example, when three expression formats are included, three syntax elements are set, and when four expression formats are included, four syntax elements are set. Multiple values can also be set through one syntax element to represent multiple expression formats.
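  • Reading the two sub-syntax elements as presence flags, the decision above reduces to a small truth table; the sketch below uses generic format names and hypothetical function names.

```python
def formats_present(flag_fmt1, flag_fmt2):
    """Map the two presence flags to the set of expression formats in the spliced image."""
    formats = set()
    if flag_fmt1:
        formats.add("format1")
    if flag_fmt2:
        formats.add("format2")
    return formats

def is_heterogeneous(flag_fmt1, flag_fmt2):
    """Heterogeneous hybrid spliced image: at least two expression formats present."""
    return len(formats_present(flag_fmt1, flag_fmt2)) >= 2
```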
  • the first syntax element is located in a parameter set of the code stream.
  • the parameter set of the code stream may be V3C_VPS
  • the first syntax element may be ptl_profile_toolset_idc in V3C_VPS.
  • the mosaic graph sequence parameter set corresponding to the mosaic graph includes the first syntax element.
  • the splicing graph sequence parameter set corresponding to the splicing graph includes the first sub-syntax element and the second sub-syntax element.
  • the first sub-syntax element is asps_vpcc_extension_present_flag in the splicing diagram sequence parameter set
  • the second sub-syntax element is asps_miv_extension_present_flag.
  • the first syntax element can be located in the parameter set of the code stream, and the decoding end can parse the splicing pattern type of each splicing pattern earlier.
  • the first syntax element may also be located in the mosaic sequence parameter set corresponding to each mosaic image, and the decoding end obtains and then determines the mosaic image type when parsing each mosaic image.
  • the heterogeneous hybrid mosaic graph of the embodiment of the present application includes at least one of the following: a single attribute heterogeneous hybrid mosaic graph and a multi-attribute heterogeneous hybrid mosaic graph.
  • the single-attribute heterogeneous hybrid splicing diagram refers to the heterogeneous hybrid splicing diagram in which the attribute information of all homogeneous blocks included is the same.
  • a single attribute heterogeneous hybrid mosaic image only includes homogeneous blocks of attribute information, such as only multi-view video texture blocks and point cloud texture blocks.
  • a single-attribute heterogeneous hybrid mosaic image only includes homogeneous blocks of geometric information, such as only multi-view video geometry blocks and point cloud geometry blocks.
  • A multi-attribute heterogeneous hybrid spliced image refers to a heterogeneous hybrid spliced image that includes homogeneous blocks with at least two different kinds of attribute information.
  • For example, a multi-attribute heterogeneous hybrid spliced image includes both homogeneous blocks of texture attribute information and homogeneous blocks of geometry information.
  • any attribute or blocks under any two attributes of at least two of the point cloud, multi-viewpoint video and grid can be spliced into one image to obtain a heterogeneous hybrid spliced image. This application does not limit this.
  • the single-attribute homogeneous blocks in the first expression format and the single-attribute blocks in the second expression format are spliced to obtain a heterogeneous hybrid spliced image.
  • The first expression format and the second expression format are each any one of multi-view video, point cloud, and grid, and the first expression format and the second expression format are different.
  • The attribute information of the blocks in the first expression format and the second expression format is the same.
  • the single attribute isomorphic block of the multi-view video includes at least one of a multi-view video texture block, a multi-view video geometry block, and the like.
  • the single attribute isomorphic block of the point cloud includes at least one of a point cloud texture block, a point cloud geometry block, a point cloud occupancy block, and the like.
  • the single attribute isomorphic block of the grid includes at least one of a grid texture block and a grid geometry block.
  • At least two of the multi-viewpoint video geometry blocks, point cloud geometry blocks, and grid geometry blocks are spliced into one image to obtain a heterogeneous hybrid spliced image.
  • This heterogeneous mixed mosaic diagram is called a single attribute heterogeneous mixed mosaic diagram.
  • at least two of the multi-viewpoint video texture blocks, point cloud texture blocks, and grid texture blocks are spliced into one image to obtain a heterogeneous hybrid spliced image.
  • This heterogeneous mixed mosaic diagram is called a single attribute heterogeneous mixed mosaic diagram.
  • the multi-attribute isomorphic blocks in the first expression format and the multi-attribute isomorphic blocks in the second expression format are spliced to obtain a heterogeneous hybrid spliced image.
  • the first expression format and the second expression format are any one of multi-view video, point cloud, and grid, and the first expression format and the second expression format are different.
  • the attribute information of the first expression format and the second expression format is not exactly the same.
  • the multi-viewpoint video texture block is spliced into one picture with at least one of the point cloud geometry block and the mesh geometry block to obtain a heterogeneous hybrid spliced picture.
  • a multi-viewpoint video geometry block is spliced into one picture with at least one of a point cloud texture block and a mesh texture block to obtain a heterogeneous hybrid spliced picture.
  • the point cloud texture block and at least one of the multi-viewpoint video geometry block and the mesh geometry block are spliced into one image to obtain a heterogeneous hybrid spliced image.
  • the point cloud geometry block is spliced into one picture with at least one of the multi-viewpoint video texture block and the mesh texture block to obtain a heterogeneous hybrid spliced picture.
  • point cloud geometry blocks, multi-viewpoint video geometry blocks, and multi-viewpoint video texture blocks are spliced into one image to obtain a heterogeneous hybrid spliced image.
  • point cloud geometry blocks, point cloud texture blocks, multi-viewpoint video geometry blocks, and multi-viewpoint video texture blocks are spliced into one image to obtain a heterogeneous hybrid spliced image.
  • the obtained heterogeneous hybrid mosaic graph is called a multi-attribute heterogeneous hybrid mosaic graph.
  • the following takes the first expression format as multi-viewpoint video and the second expression format as point cloud as an example to introduce the splicing method in detail.
  • the multi-view video block includes a multi-view video texture block and a multi-view video geometry block
  • the point cloud block includes a point cloud texture block, a point cloud geometry block and a point cloud occupancy block.
  • Method 1: splice the multi-viewpoint video texture blocks, multi-viewpoint video geometry blocks, point cloud texture blocks, point cloud geometry blocks and point cloud occupancy blocks into one heterogeneous hybrid spliced image.
  • Method 2: according to a preset heterogeneous splicing method, splice the multi-view video texture blocks, multi-view video geometry blocks, point cloud texture blocks, point cloud geometry blocks and point cloud occupancy blocks to obtain M heterogeneous hybrid spliced images, where M is a positive integer greater than or equal to 1.
  • the second method can include at least the following examples. Example 1: splice the multi-view video texture blocks and point cloud texture blocks to obtain a heterogeneous mixed texture spliced image, splice the multi-view video geometry blocks and point cloud geometry blocks to obtain a heterogeneous mixed geometry spliced image, and use the point cloud occupancy blocks alone as a separate spliced image.
  • Example 2: splice the multi-view video texture blocks and point cloud texture blocks to obtain a heterogeneous mixed texture spliced image, and splice the multi-view video geometry blocks, point cloud geometry blocks and point cloud occupancy blocks to obtain a heterogeneous mixed geometry-and-occupancy spliced image.
  • Example 3: splice the multi-view video texture blocks, point cloud texture blocks and point cloud occupancy blocks to obtain one sub-heterogeneous hybrid spliced image, and splice the multi-view video geometry blocks and point cloud geometry blocks to obtain another sub-heterogeneous hybrid spliced image. Further, after the M heterogeneous mixed spliced images are obtained, video coding can be performed on each of the M heterogeneous mixed spliced images to obtain video compression sub-streams.
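The grouping in Method 2 (Example 1) can be sketched as below; the block labels and the `group_blocks` helper are illustrative assumptions, not part of any standard.

```python
# Sketch of Method 2, Example 1: texture blocks of both expression formats
# share one spliced image, geometry blocks share another, and point cloud
# occupancy blocks form a third. Block labels are illustrative assumptions.

def group_blocks(blocks):
    """Map each (format, attribute) block label to one of M spliced images."""
    groups = {"texture": [], "geometry": [], "occupancy": []}
    for fmt, attr in blocks:
        groups[attr].append((fmt, attr))
    # Drop empty groups; the number of non-empty groups is M >= 1.
    return {k: v for k, v in groups.items() if v}

blocks = [
    ("multi_view", "texture"), ("point_cloud", "texture"),
    ("multi_view", "geometry"), ("point_cloud", "geometry"),
    ("point_cloud", "occupancy"),
]
spliced = group_blocks(blocks)
# M = 3 spliced images: a heterogeneous mixed texture map, a heterogeneous
# mixed geometry map, and an occupancy map containing only point cloud blocks.
```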
  • the isomorphic splicing graph of the embodiment of the present application includes at least one of the following: a single attribute isomorphic splicing graph and a multi-attribute isomorphic splicing graph.
  • the first attribute isomorphic blocks of the first expression format are spliced to obtain an isomorphic splicing graph.
  • the first attribute isomorphic block and the second attribute isomorphic block of the first expression format are spliced to obtain an isomorphic splicing diagram.
  • a single-attribute isomorphic spliced image refers to an isomorphic spliced image in which all included isomorphic blocks have the same expression format and the same attribute information.
  • a single-attribute isomorphic mosaic image only includes isomorphic blocks of attribute information in the first expression format.
  • a single-attribute isomorphic mosaic image only includes multi-view video texture blocks, or only point cloud texture blocks.
  • a single-attribute isomorphic mosaic image only includes isomorphic blocks of geometric information, such as only multi-view video geometric blocks, or only point cloud geometric blocks.
  • a multi-attribute isomorphic spliced graph refers to an isomorphic spliced graph that includes at least two isomorphic blocks with the same expression format but different attribute information.
  • a multi-attribute isomorphic spliced image includes both isomorphic blocks with attribute information and isomorphic blocks with geometric information.
  • a multi-attribute isomorphic mosaic image includes multi-viewpoint video texture blocks and multi-viewpoint video geometry blocks.
  • a multi-attribute isomorphic mosaic image includes point cloud geometry blocks and point cloud texture blocks. As shown in Figure 8, a multi-attribute isomorphic mosaic image includes point cloud texture block 1, point cloud geometry block 1 and point cloud geometry block 2.
  • the spliced image information may also include syntax elements according to which the spliced image is determined to be a single-attribute heterogeneous hybrid spliced image, a multi-attribute heterogeneous hybrid spliced image, a single-attribute isomorphic spliced image, or a multi-attribute isomorphic spliced image.
  • Step 603: encode the at least one spliced image and the spliced image information to obtain a code stream.
  • the code stream includes a video compression sub-stream and a splicing image information sub-stream.
  • the encoding of the at least one spliced image and the spliced image information to obtain a code stream includes: encoding the at least one spliced image to obtain a video compression sub-stream; and encoding the spliced image information of the at least one spliced image. Encoding is performed to obtain a splicing image information sub-stream; the video compression sub-stream and the splicing image information sub-stream are synthesized into the code stream.
  • Hybrid splicing can reduce the number of 2D video encoders such as HEVC, VVC, AVC, and AVS that need to be called, reduce implementation costs, and improve ease of use.
  • when it is determined based on the first syntax element that the spliced image is a heterogeneous hybrid spliced image, the spliced image information also includes a second syntax element, and the expression format of each block in the spliced image is determined based on the second syntax element.
  • the encoding end writes the second syntax element into the code stream, which can help improve the decoding accuracy of the decoding end, and at the same time enables the V3C standard to support visual media content in different expression formats, such as multi-view videos and point clouds, in the same compressed code stream.
  • determining the expression format of the i-th block in the spliced image based on the second syntax element includes: if the second syntax element is the eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is the ninth preset value, determining that the expression format of the i-th block is the second expression format.
  • the expression format type corresponding to the i-th block in the spliced image can be indicated by setting the second syntax element to different values.
  • if the i-th block is a point cloud block, the second syntax element is set to the eighth preset value; if the i-th block is a multi-viewpoint video block, the second syntax element is set to the ninth preset value.
  • the embodiments of this application do not limit the specific values of the eighth preset value and the ninth preset value.
  • the eighth preset value is 0.
  • the ninth default value is 1.
  • encoding the at least one spliced image and the spliced image information to obtain a code stream includes: if the expression format of the i-th block is the first expression format, determining that the sub-tiles in the i-th block are encoded using the encoding standard corresponding to the first expression format, to obtain a code stream corresponding to the visual media content of the first expression format; if the expression format of the i-th block is the second expression format, determining that the sub-tiles in the i-th block are encoded using the encoding standard corresponding to the second expression format, to obtain a code stream corresponding to the visual media content of the second expression format.
  • the second syntax element is located in the atlas tile data unit header of the i-th block of the spliced image. In some embodiments, the second syntax element may also be located in a sub-patch data unit (patch_data_unit). For example, given that the second syntax element (ath_toolset_type) is 1, it is determined that the current sub-tile is encoded using the multi-view video coding standard; given that the second syntax element (ath_toolset_type) is 0, it is determined that the current sub-tile is encoded using the point cloud coding standard.
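As a non-normative sketch, the decoder-side branch on the second syntax element described above could look as follows; the function name and returned labels are assumptions, while the 0/1 semantics follow the eighth/ninth preset values given above.

```python
# Minimal sketch of branching on the second syntax element (ath_toolset_type):
# 0 (eighth preset value) selects the point cloud decoding path, and
# 1 (ninth preset value) selects the multi-view video decoding path.
# Function and label names are illustrative, not defined by the standard.

def select_decoding_path(ath_toolset_type: int) -> str:
    if ath_toolset_type == 0:   # eighth preset value -> point cloud block
        return "point_cloud"
    if ath_toolset_type == 1:   # ninth preset value -> multi-viewpoint video block
        return "multi_view_video"
    raise ValueError("ath_toolset_type shall be in the range 0 to 1")
```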
  • encoding the at least one spliced image and the spliced image information to obtain a code stream includes: calling a video encoder to encode the at least one spliced image to obtain a video compression sub-stream.
  • At least two visual media contents are first processed separately (that is, packaged) to obtain multiple isomorphic blocks.
  • at least two homogeneous blocks with different expression formats are spliced into a heterogeneous mixed spliced graph, and at least one homogeneous block with exactly the same expression format is spliced into a homogeneous spliced graph.
  • the isomorphic splicing image is encoded to obtain the video compression sub-stream.
  • the video encoder can be called only once for encoding, thereby reducing the number of 2D video encoders such as HEVC, VVC, AVC, and AVS that need to be called, reducing encoding costs and improving ease of use.
  • the video encoder used to perform video encoding on the heterogeneous hybrid spliced image and the homogeneous spliced image to obtain the video compression sub-stream can be the video encoder shown in Figure 2A above. That is to say, in the embodiment of the present application, the heterogeneous hybrid spliced image or the homogeneous spliced image is used as a frame image: block division is first performed, then intra-frame or inter-frame prediction is used to obtain the predicted value of each coding block, the predicted value of the coding block is subtracted from the original value to obtain the residual value, and the residual value is transformed and quantized to obtain the video compression sub-stream.
  • the mosaic image information corresponding to each mosaic image is generated.
  • the spliced image information is encoded to obtain the spliced image information sub-stream.
  • the splicing diagram information includes a first syntax element used to indicate the type of the splicing diagram, and a second syntax element used to indicate the expression format of each isomorphic block in the splicing diagram.
  • the embodiments of the present application do not limit the method of encoding the spliced image information. For example, conventional data compression encoding methods such as equal-length encoding or variable-length encoding may be used for compression.
  • the video compression sub-stream and the splicing image information sub-stream are written in the same code stream to obtain the final code stream.
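A minimal sketch of writing the two sub-streams into one code stream; the length-prefixed framing is an illustrative assumption (the actual V3C multiplexing uses V3C units, not this format).

```python
# Sketch of synthesizing the final code stream from the video compression
# sub-stream and the spliced-image-information sub-stream. The 4-byte
# big-endian length prefix per sub-stream is an illustrative assumption.
import struct

def compose_code_stream(video_substream: bytes, info_substream: bytes) -> bytes:
    out = b""
    for sub in (video_substream, info_substream):
        out += struct.pack(">I", len(sub)) + sub  # length prefix, then payload
    return out
```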
  • the embodiments of the present application not only support heterogeneous source formats such as video, point cloud, grid, etc., but also support homogeneous source formats in the same compressed code stream.
  • the method further includes: encoding the parameter set of the code stream to obtain a code stream parameter set sub-stream.
  • the encoding end synthesizes the video compression sub-stream, the splicing image information sub-stream and the parameter set sub-stream into a code stream.
  • the parameter set sub-code stream of the code stream includes a third syntax element, and the code stream corresponding to the visual media content including at least one expression format in the code stream is determined according to the third syntax element. That is to say, the encoding end sends the third syntax element to indicate whether the code stream contains visual media content in at least two expression formats at the same time.
  • the encoding end processes the visual media content in one expression format to obtain one kind of homogeneous block, and splices that kind of isomorphic block to obtain an isomorphic spliced image.
  • when the third syntax element indicates that the code stream includes code streams corresponding to visual media content in at least two expression formats, it can be understood that the encoding end obtains at least two kinds of isomorphic blocks for the visual media content in the at least two expression formats, and splices the at least two kinds of homogeneous blocks to obtain a homogeneous spliced image and/or a heterogeneous hybrid spliced image.
  • the method includes: isomorphically splicing the isomorphic blocks of the first expression format to obtain a first isomorphic spliced image, and isomorphically splicing the isomorphic blocks of the second expression format to obtain a second isomorphic spliced image; or, heterogeneously splicing the isomorphic blocks of the first expression format and the isomorphic blocks of the second expression format to obtain a heterogeneous hybrid spliced image; or, isomorphically splicing some isomorphic blocks of the second expression format to obtain a second isomorphic spliced image, and heterogeneously splicing the isomorphic blocks of the first expression format with the other isomorphic blocks of the second expression format to obtain a heterogeneous hybrid spliced image.
  • setting the third syntax element to a different value indicates that the code stream includes a code stream corresponding to the visual media content of at least one expression format. That is to say, certain preset values of the third syntax element can indicate that the code stream includes code streams corresponding to visual media content in one or more expression formats.
  • determining, according to the third syntax element, the code stream corresponding to the visual media content of at least one expression format included in the code stream includes: if the third syntax element is a first value, determining that the code stream simultaneously includes a code stream corresponding to the visual media content in the first expression format and a code stream corresponding to the visual media content in the second expression format; if the third syntax element is a second value, determining that the code stream includes the code stream corresponding to the visual media content in the first expression format; if the third syntax element is a third value, determining that the code stream includes the code stream corresponding to the visual media content in the second expression format.
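The three-way branch on the third syntax element can be sketched as follows; the concrete values `FIRST_VALUE`/`SECOND_VALUE`/`THIRD_VALUE` and the format labels are placeholders, not values defined by the standard (the document later maps them onto ptl_profile_toolset_idc in V3C_VPS).

```python
# Sketch of interpreting the third syntax element: the first value means both
# expression formats are present, the second and third values mean only one.
# The numeric values and labels below are illustrative placeholders.

FIRST_VALUE, SECOND_VALUE, THIRD_VALUE = 0, 1, 2  # illustrative assumptions

def contained_formats(third_syntax_element: int):
    if third_syntax_element == FIRST_VALUE:
        return {"first_format", "second_format"}  # both code streams present
    if third_syntax_element == SECOND_VALUE:
        return {"first_format"}
    if third_syntax_element == THIRD_VALUE:
        return {"second_format"}
    raise ValueError("unknown third syntax element value")
```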
  • the parameter set of the code stream may be V3C_VPS
  • the third syntax element may be ptl_profile_toolset_idc in V3C_VPS.
  • when the third syntax element is set to the first value, the first value is used to indicate that the code stream simultaneously contains a multi-view video code stream and a point cloud code stream.
  • when the third syntax element is set to the second value, the second value is used to indicate that the code stream only contains a point cloud code stream.
  • when the third syntax element is set to the third value, the third value is used to indicate that the code stream only contains a multi-view video code stream.
  • V3C_VPS in the existing V3C standard can be reused, and ptl_profile_toolset_idc is preconfigured with values such as 0/1, 64/65/66, 128/129/130/132/133/134 to indicate the code stream types included in the current code stream.
  • the embodiment of the present application adds values of the third syntax element in the parameter set to indicate which expression formats of visual media content have corresponding code streams contained in the code stream, which can help improve the decoding accuracy of the decoder and also enables the V3C standard to support visual media content in one or more expression formats, such as multi-view videos, point clouds and grids, in the same compressed code stream.
  • Table 3 shows an example of available toolset profile components (Available toolset profile components).
  • Table 3 provides a list of toolset profile components defined for V3C and their corresponding identification syntax element values, such as ptl_profile_toolset_idc and ptc_one_v3c_frame_only_flag. This definition may be used for this document only.
  • the syntax element ptl_profile_toolset_idc provides the main definition of the toolset profile.
  • Additional syntax elements such as ptc_one_v3c_frame_only_flag can specify additional characteristics or restrictions of the defined profile.
  • ptc_one_v3c_frame_only_flag can be used to support only a single V3C frame.
  • the parameter set of the code stream further includes a first syntax element, wherein the first syntax element is used to indicate the type of each spliced image, specifically to indicate whether the spliced image is the heterogeneous hybrid spliced image or the isomorphic spliced image; the first syntax element is written into the parameter set of the code stream.
  • for example, the first syntax element is vps_toolset_type in V3C_VPS. vps_toolset_type is used to determine whether each spliced image and its corresponding V3C unit belongs to a point cloud spliced image, a multi-viewpoint spliced image, or a point cloud + multi-viewpoint heterogeneous hybrid spliced image. At the same time, in order to be compatible with previous standards, the following new syntax and semantics are introduced, as well as constraints on the old semantics.
  • the first syntax element is a first preset value, which determines that the spliced image is a heterogeneous hybrid spliced image including homogeneous blocks of the first expression format and the second expression format, wherein the first expression format and the second expression format are different expression formats;
  • the first syntax element is a second preset value, which determines that the spliced image is an isomorphic spliced image including isomorphic blocks of the first expression format;
  • the first syntax element is a third preset value, which determines that the spliced image is an isomorphic spliced image including isomorphic blocks of the second expression format.
  • the first syntax element is a first preset value, and the first preset value is used to indicate that the spliced image is a heterogeneous hybrid spliced image including point cloud blocks and multi-viewpoint video blocks; the first syntax element is a second preset value, and the second preset value is used to indicate that the spliced image is an isomorphic spliced image including multi-viewpoint video blocks (which may be called a multi-viewpoint video spliced image); the first syntax element is a third preset value, and the third preset value is used to indicate that the spliced image is an isomorphic spliced image including point cloud blocks (which may be called a point cloud spliced image).
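A sketch of mapping vps_toolset_type[j] to the spliced-image type, using the concrete values given later in this document (1: multi-viewpoint only, 2: point cloud only, 3: heterogeneous hybrid); the returned labels are illustrative assumptions.

```python
# Sketch: resolve the first syntax element (vps_toolset_type) to the type of
# the spliced image with index j. Values follow this document's later
# description: 1 -> multi-view only, 2 -> point cloud only, 3 -> both.
# Returned labels are illustrative, not standard-defined strings.

def atlas_type(vps_toolset_type: int) -> str:
    return {
        1: "multi_view_isomorphic",
        2: "point_cloud_isomorphic",
        3: "heterogeneous_hybrid",
    }[vps_toolset_type]  # other values are reserved (KeyError here)
```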
  • Table 4 shows the syntax of the general V3C parameter set (General V3C parameter set syntax).
  • the V3C parameter set has a new syntax element vps_toolset_type.
  • vps_toolset_type[j] can be used to represent the type of splicing diagram with index j.
  • the decoding end can obtain vps_toolset_type from the V3C parameter set. Based on vps_toolset_type, it can quickly determine whether each spliced image and its corresponding V3C unit belongs to point cloud / multi-viewpoint / point cloud + multi-viewpoint, so as to determine which coding method requirements the spliced image should meet.
  • the mosaic graph sequence parameter set corresponding to the mosaic graph includes the first syntax element.
  • the splicing graph sequence parameter set corresponding to the splicing graph includes the first sub-syntax element and the second sub-syntax element.
  • the first sub-syntax element and the second sub-syntax element are used to indicate a splicing diagram type, wherein the splicing diagram is the heterogeneous hybrid splicing diagram or the isomorphic splicing diagram.
  • if the first sub-syntax element is the fourth preset value and the second sub-syntax element is the fifth preset value, it is determined that the spliced image is a heterogeneous hybrid spliced image including isomorphic blocks of the first expression format and isomorphic blocks of the second expression format; if the first sub-syntax element is the fourth preset value and the second sub-syntax element is the seventh preset value, it is determined that the spliced image is an isomorphic spliced image including isomorphic blocks of the first expression format; if the first sub-syntax element is the sixth preset value and the second sub-syntax element is the fifth preset value, it is determined that the spliced image is an isomorphic spliced image including isomorphic blocks of the second expression format.
  • the first sub-syntax element is asps_vpcc_extension_present_flag in the atlas sequence parameter set.
  • the second sub-syntax element is asps_miv_extension_present_flag.
  • the NAL-ASPS of the V3C_AD code stream may contain asps_miv_extension_present_flag and asps_vpcc_extension_present_flag.
  • the first sub-syntax element and the second sub-syntax element are set to specific values to indicate that the spliced image is the heterogeneous hybrid spliced image or the isomorphic spliced image.
  • Table 5 shows the general atlas sequence parameter set RBSP syntax.
  • the splicing map sequence parameter set can be understood as splicing map information.
  • the encoding end uses the syntax elements asps_vpcc_extension_present_flag and asps_miv_extension_present_flag in the atlas sequence parameter set to represent the type of the spliced image.
  • the decoding end can obtain these two syntax elements from the parameter set of the spliced image by parsing the code stream. Based on the values of these two syntax elements, it determines whether the spliced image belongs to point cloud / multi-view / point cloud + multi-view, and thereby which encoding method requirements the spliced image should meet.
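The classification from the two sub-syntax elements can be sketched as below, following the flag combinations described in this document (asps_vpcc_extension_present_flag signals point cloud blocks, asps_miv_extension_present_flag signals multi-viewpoint video blocks); the function name and labels are illustrative.

```python
# Sketch: classify a spliced image from the two sub-syntax elements in the
# atlas sequence parameter set. vpcc_flag corresponds to
# asps_vpcc_extension_present_flag (point cloud blocks present) and miv_flag
# to asps_miv_extension_present_flag (multi-view blocks present).
# Labels are illustrative assumptions.

def classify_atlas(vpcc_flag: int, miv_flag: int) -> str:
    if vpcc_flag and miv_flag:
        return "heterogeneous_hybrid"       # both block kinds in one atlas
    if miv_flag:
        return "multi_view_isomorphic"      # only multi-viewpoint video blocks
    if vpcc_flag:
        return "point_cloud_isomorphic"     # only point cloud blocks
    raise ValueError("at least one extension flag must be set")
```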
  • each isomorphic block is a multi-view mosaic.
  • the mosaic map block data unit header of the i-th block includes a second syntax element.
  • when the spliced image is a heterogeneous hybrid spliced image, the spliced image information also includes a second syntax element, according to which the expression format of the i-th block in the spliced image is determined.
  • the embodiment of the present application sets a second syntax element to indicate the expression format of the i-th block in the heterogeneous hybrid splicing image, which can help improve the decoding accuracy of the decoder.
  • the V3C standard can support visual media content in different expression formats such as multi-view videos and point clouds in the same compressed code stream.
  • the second syntax element may be ath_toolset_type in the mosaic map tile data unit header (atlas_tile_header).
  • if the second syntax element is the eighth preset value, it is determined that the expression format of the i-th block is the first expression format; if the second syntax element is the ninth preset value, it is determined that the expression format of the i-th block is the second expression format.
  • Table 6 shows the Atlas tile header syntax.
  • the encoding end adds a new syntax element ath_toolset_type to the Atlas tile header syntax to indicate the block type.
  • the decoding end can parse the code stream to obtain ath_toolset_type from the atlas tile data unit header syntax of the spliced image, so as to determine whether the current block belongs to multi-view video decoding or point cloud decoding.
  • the second syntax element may also be located in the sub-patch data unit (patch_data_unit).
  • given that the second syntax element (ath_toolset_type) is 1, it is determined that the current sub-tile is encoded using the multi-viewpoint video encoding method.
  • given that the second syntax element (ath_toolset_type) is 0, it is determined that the current sub-tile is encoded using the point cloud encoding method.
  • the sub-patch data unit syntax can be shown in Table 7:
  • a vps_toolset_type[j] value of 1 indicates that the values of the syntax elements of the toolset profile components of the atlas with index j shall comply with the values specified in ISO/IEC 23090-12 Table A-1-1 (i.e. Table 8);
  • a vps_toolset_type[j] value of 2 indicates that the values of the syntax elements of the toolset profile components of the atlas with index j shall comply with the values specified in ISO/IEC 23090-5 Table H-3, except for the values of vps_extension_present_flag, vps_packing_information_present_flag, vps_miv_extension_present_flag, vuh_unit_type and vps_atlas_count_minus1, whose values shall comply with the values specified in ISO/IEC 23090-12 Table A-1-1;
  • a vps_toolset_type[j] value of 3 indicates that the values of the syntax elements of the toolset profile components of the atlas with index j shall comply with the extended ISO/IEC 23090-12 Table A-1-2 (i.e. Table 9-1 and Table 9-2). Table A-1-1 and Table A-1-2 respectively specify the restrictions on toolset-profile-component-related syntax for multi-viewpoint content and for heterogeneous data in an integrated code stream.
  • a vps_toolset_type[j] value of 0, or any value from 4 to 7, is reserved for future use by ISO/IEC and shall not appear in bitstreams conforming to this version of this document. Decoders conforming to this version of this document shall ignore such reserved values.
  • ath_toolset_type indicates that the values of the syntax elements of the toolset profile components of the current tile shall comply with the values specified in Table A-1 of the ISO/IEC 23090-12 extension.
  • the value of ath_toolset_type shall be in the range of 0 to 1, inclusive.
  • FIG. 9 is a schematic diagram of the V3C bitstream structure provided by the embodiment of the present application.
  • the V3C parameter set (V3C_parameter_set()) of V3C_VPS can include ptl_profile_toolset_idc. If ptl_profile_toolset_idc is 128/129/130/132/133/134, it means that the current code stream simultaneously contains a point cloud code stream (such as VPCC Basic or VPCC Extended) and a multi-view video code stream (such as MIV Main, MIV Extended or MIV Geometry Absent).
  • the V3C parameter set (V3C_parameter_set()) of V3C_VPS can include the first syntax element (vps_toolset_type).
  • a vps_toolset_type value of 1 means that only multi-viewpoint video blocks exist in the current spliced image;
  • a value of 2 means that only point cloud blocks exist in the current spliced image
  • a value of 3 means that both multi-viewpoint video blocks and point cloud blocks exist in the current spliced image.
  • the atlas sequence parameter set (Atlas_sequence_parameter_set_rbsp()) in NAL_ASPS in the atlas sub-bitstream (Atlas_sub_bitstream()) of V3C_AD may include asps_vpcc_extension_present_flag and asps_miv_extension_present_flag.
  • when ptl_profile_toolset_idc is 128/129/130/132/133/134, asps_vpcc_extension_present_flag is denoted X and asps_miv_extension_present_flag is denoted Y.
  • when X is 0 and Y is 1, it means that the spliced image only contains multi-viewpoint video blocks; when X is 1 and Y is 0, it means that the spliced image only contains point cloud blocks; when X is 1 and Y is 1, it means that the spliced image contains both multi-viewpoint video blocks and point cloud blocks.
  • the ACL NAL unit type (ACL_NAL_unit_type) in Atlas_sub_bitstream() of V3C_AD includes splicing image information.
  • the atlas tile data unit (atlas_tile_data_unit()) may include ath_toolset_type. If ath_toolset_type is 0, it means that the current block is a point cloud block; if ath_toolset_type is 1, it means that the current block is a multi-viewpoint video block.
  • the sub-patch information data includes a sub-patch data unit (patch_data_unit). If ath_toolset_type is 0, it means that the current sub-tile is coded using the point cloud video coding method; if ath_toolset_type is 1, it means that the current sub-tile is coded using the multi-viewpoint video coding method.
  • for each spliced image, the first syntax element of the spliced image is obtained, and it is determined based on the value of the first syntax element whether the spliced image includes both point cloud blocks and multi-viewpoint video blocks, i.e., whether both point cloud blocks and multi-viewpoint video blocks exist in the spliced image.
  • the encoding method of the present application is introduced above by taking the encoding end as an example.
  • the video decoding method provided by the embodiment of the present application is described below by taking the decoding end as an example.
  • Figure 10 is a schematic flow chart of a decoding method provided by an embodiment of the present application. As shown in Figure 10, the decoding method in this embodiment of the present application includes:
  • Step 1001: decode the code stream to obtain a spliced image and spliced image information, wherein the spliced image information includes a first syntax element, and the spliced image is determined to be a heterogeneous hybrid spliced image or a homogeneous spliced image according to the first syntax element;
  • determining according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image includes: if the first syntax element is the first preset value, determining that the spliced image is a heterogeneous hybrid spliced image including isomorphic blocks of a first expression format and a second expression format, wherein the first expression format and the second expression format are different expression formats; if the first syntax element is the second preset value, determining that the spliced image is an isomorphic spliced image including isomorphic blocks of the first expression format; if the first syntax element is the third preset value, determining that the spliced image is an isomorphic spliced image including isomorphic blocks of the second expression format.
  • the first syntax element includes: a first sub-syntax element and a second sub-syntax element, and it is determined according to the first sub-syntax element and the second sub-syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image.
  • determining according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image further includes: if the first sub-syntax element is a sixth preset value, determining that the spliced image does not include the homogeneous blocks of the first expression format; if the second sub-syntax element is a seventh preset value, determining that the spliced image does not include the homogeneous blocks of the second expression format.
  • determining that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image according to the first syntax element includes: if the first sub-syntax element is a fourth preset value and the second sub-syntax element is a fifth preset value, determining that the spliced image is a heterogeneous hybrid spliced image including homogeneous blocks of the first expression format and homogeneous blocks of the second expression format; if the first sub-syntax element is the fourth preset value and the second sub-syntax element is the seventh preset value, determining that the spliced image is a homogeneous spliced image including homogeneous blocks of the first expression format; if the first sub-syntax element is the sixth preset value and the second sub-syntax element is the fifth preset value, determining that the spliced image is a homogeneous spliced image including homogeneous blocks of the second expression format.
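The two signaling variants above can be sketched as follows (a Python illustration; the concrete preset values 1/2/3 and the per-format flag semantics are assumptions, since the application leaves the actual values open):

```python
HETEROGENEOUS_HYBRID = "heterogeneous hybrid spliced image"
HOMOGENEOUS_FIRST = "homogeneous spliced image (first expression format)"
HOMOGENEOUS_SECOND = "homogeneous spliced image (second expression format)"

def splicing_type_from_value(first_syntax_element):
    # Variant 1: a single first syntax element with three preset values
    # (assumed here to be 1, 2, and 3).
    return {1: HETEROGENEOUS_HYBRID,
            2: HOMOGENEOUS_FIRST,
            3: HOMOGENEOUS_SECOND}[first_syntax_element]

def splicing_type_from_flags(has_first_format, has_second_format):
    # Variant 2: two sub-syntax elements acting as per-format presence
    # flags (the fourth/sixth and fifth/seventh preset values).
    if has_first_format and has_second_format:
        return HETEROGENEOUS_HYBRID
    if has_first_format:
        return HOMOGENEOUS_FIRST
    if has_second_format:
        return HOMOGENEOUS_SECOND
    raise ValueError("a spliced image must contain at least one format")
```

Both variants carry the same decision: whether the decoder should expect one type of homogeneous block or a mixture of two.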
  • the first syntax element is located in a parameter set sub-codestream of the codestream.
  • the spliced image sequence parameter set corresponding to the spliced image includes the first syntax element.
  • the at least one expression format includes: at least one of multi-view video, point cloud, and mesh.
  • the first expression format is one of multi-view video, point cloud, and mesh;
  • the second expression format is one of multi-view video, point cloud, and mesh;
  • the first expression format and the second expression format are different.
  • the code stream further includes a parameter set sub-code stream of the code stream, the parameter set sub-code stream includes a third syntax element, and it is determined according to the third syntax element that the code stream includes a code stream corresponding to visual media content in at least one expression format.
  • the method further includes: decoding the parameter set sub-code stream of the code stream to obtain the parameter set of the code stream, and obtaining the third syntax element from the parameter set of the code stream.
  • the method further includes: if the third syntax element is a first value, determining that the code stream includes both a code stream corresponding to the visual media content in the first expression format and a code stream corresponding to the visual media content in the second expression format;
  • if the third syntax element is a second value, determining that the code stream includes the code stream corresponding to the visual media content of the first expression format;
  • if the third syntax element is a third value, determining that the code stream includes the code stream corresponding to the visual media content of the second expression format.
  • decoding the code stream to obtain at least one spliced image includes: when it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content in at least two expression formats, decoding the code stream to obtain a heterogeneous hybrid spliced image.
  • decoding the code stream to obtain at least one spliced image includes: when it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content in at least two expression formats, decoding the code stream to obtain homogeneous spliced images of at least two expression formats. That is to say, when the code stream includes code streams corresponding to visual media content in at least two expression formats, each expression format corresponds to a homogeneous spliced image.
  • decoding the code stream to obtain at least one spliced image includes: when it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content in at least two expression formats, decoding the code stream to obtain a heterogeneous hybrid spliced image and homogeneous spliced images. That is to say, when the code stream includes code streams corresponding to visual media content in at least two expression formats, the homogeneous blocks of some expression formats construct a heterogeneous hybrid spliced image, and the homogeneous blocks of the other expression formats construct homogeneous spliced images.
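A hedged sketch of the third-syntax-element logic described above (Python; the values 1/2/3 and the format labels are placeholders chosen for illustration, not values fixed by this application):

```python
def formats_in_stream(third_syntax_element):
    # Assumed mapping: 1 -> both formats, 2 -> first only, 3 -> second only.
    mapping = {
        1: ("first expression format", "second expression format"),
        2: ("first expression format",),
        3: ("second expression format",),
    }
    return mapping[third_syntax_element]

def heterogeneous_possible(third_syntax_element):
    # A heterogeneous hybrid spliced image can only occur when the code
    # stream carries visual media content in at least two expression formats.
    return len(formats_in_stream(third_syntax_element)) >= 2
```

The decoder can therefore use the third syntax element, read from the parameter set sub-code stream, to decide in advance which kinds of spliced images to expect.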
  • the heterogeneous hybrid spliced image includes at least one of the following: a single-attribute heterogeneous hybrid spliced image and a multi-attribute heterogeneous hybrid spliced image;
  • the homogeneous spliced image includes at least one of the following: a single-attribute homogeneous spliced image and a multi-attribute homogeneous spliced image.
  • the code stream includes a video compression sub-stream and a spliced image information sub-stream;
  • decoding the code stream to obtain at least one spliced image and spliced image information includes: decoding the video compression sub-stream to obtain the at least one spliced image; and decoding the spliced image information sub-stream to obtain the spliced image information of the at least one spliced image.
  • when it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content in at least two expression formats, the video compression sub-stream is decoded to obtain a heterogeneous hybrid spliced image and a homogeneous spliced image; or, when it is determined according to the third syntax element that the code stream includes code streams corresponding to visual media content of at least two expression formats, the video compression sub-stream is decoded to obtain homogeneous spliced images of at least two expression formats.
  • Step 1002 When it is determined according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image, split the spliced image according to the spliced image information of the spliced image to obtain at least two types of homogeneous blocks, wherein the at least two types of homogeneous blocks correspond to different visual media content expression formats;
  • Step 1003 When it is determined according to the first syntax element that the spliced image is a homogeneous spliced image, split the spliced image according to the spliced image information of the spliced image to obtain one type of homogeneous block, wherein the one type of homogeneous block corresponds to one visual media content expression format;
  • Step 1004 Decode and reconstruct the homogeneous blocks to obtain visual media content in at least one expression format.
  • the method further includes: when it is determined according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image, the spliced image information further includes a second syntax element, and the expression format of the i-th block in the spliced image is determined according to the second syntax element.
  • determining the expression format of the i-th block in the spliced image based on the second syntax element includes: if the second syntax element is an eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is a ninth preset value, determining that the expression format of the i-th block is the second expression format.
  • the second syntax element is located in the block data unit header of the i-th block of the spliced image.
  • decoding and reconstructing the homogeneous blocks to obtain visual media content in at least one expression format includes: if the expression format of the i-th block is the first expression format, determining that the sub-tiles in the i-th block are decoded and reconstructed using the decoding method corresponding to the first expression format to obtain the visual media content of the first expression format; if the expression format of the i-th block is the second expression format, determining that the sub-tiles in the i-th block are decoded and reconstructed using the decoding method corresponding to the second expression format to obtain the visual media content of the second expression format.
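The per-block dispatch described above might look like the following Python sketch; `decode_first_format` and `decode_second_format` are hypothetical stand-ins for the format-specific decoders, and the preset values 0/1 are assumptions:

```python
EIGHTH_PRESET = 0   # assumed value: block uses the first expression format
NINTH_PRESET = 1    # assumed value: block uses the second expression format

def decode_block(block, decode_first_format, decode_second_format):
    """Route one block to the decoder matching its second syntax element."""
    fmt = block["second_syntax_element"]
    if fmt == EIGHTH_PRESET:
        return decode_first_format(block["payload"])
    if fmt == NINTH_PRESET:
        return decode_second_format(block["payload"])
    raise ValueError(f"unknown expression format value: {fmt}")
```

Because the second syntax element sits in each block's data unit header, the decoder can select the reconstruction path block by block without inspecting the payload itself.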
  • the code stream is decoded to obtain a multi-viewpoint video spliced image, a point cloud spliced image, and a heterogeneous hybrid spliced image.
  • the heterogeneous hybrid spliced image is split according to its spliced image information, and the reconstructed multi-viewpoint video blocks and point cloud blocks are output;
  • the multi-viewpoint video spliced image is split according to its spliced image information, and the reconstructed multi-viewpoint video blocks are output; the point cloud spliced image is split according to its spliced image information, and the reconstructed point cloud blocks are output; all acquired multi-viewpoint video blocks are decoded by multi-viewpoint video decoding to generate a reconstructed multi-viewpoint video; all acquired point cloud blocks are decoded to generate a reconstructed point cloud.
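Putting the example together, the splitting-and-routing stage could be sketched as follows (Python; the image/block dictionary layout is an assumption for illustration):

```python
def split_and_route(spliced_images):
    """Collect reconstructed blocks per expression format across all
    spliced images, ready for the multi-viewpoint video decoder and the
    point cloud decoder respectively."""
    multiview_blocks, point_cloud_blocks = [], []
    for image in spliced_images:
        # Each spliced image is split according to its spliced image
        # information; a heterogeneous hybrid image contributes blocks
        # to both groups.
        for block in image["blocks"]:
            if block["format"] == "multiview":
                multiview_blocks.append(block["data"])
            else:
                point_cloud_blocks.append(block["data"])
    return multiview_blocks, point_cloud_blocks
```

Only after all blocks are gathered does each format-specific decoder run once, which is what allows the heterogeneous hybrid spliced image to reduce the number of decoder instances invoked.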
  • homogeneous blocks of different expression formats are spliced into a heterogeneous hybrid spliced image, and homogeneous blocks of the same expression format are spliced into a homogeneous spliced image.
  • the spliced image information includes the first syntax element used to indicate the type of the spliced image, which improves the efficiency of decoding the spliced image at the decoding end.
  • FIG. 11 is a schematic block diagram of an encoding device provided by an embodiment of the present application.
  • the encoding device 110 is applied to an encoder. As shown in Figure 11, the encoding device 110 includes:
  • the processing unit 1101 is configured to process visual media content in at least one expression format to obtain at least one isomorphic block, where different types of isomorphic blocks correspond to different visual media content expression formats;
  • the splicing unit 1102 is used to splice the at least one homogeneous block to obtain at least one spliced image and spliced image information, wherein the spliced image information includes a first syntax element, and it is determined according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image; the heterogeneous hybrid spliced image includes at least two types of homogeneous blocks, and the homogeneous spliced image includes one type of homogeneous block;
  • the encoding unit 1103 is used to encode the at least one splicing picture and the splicing picture information to obtain a code stream.
  • determining according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image includes: if the first syntax element is a first preset value, determining that the spliced image is a heterogeneous hybrid spliced image including homogeneous blocks of a first expression format and homogeneous blocks of a second expression format, where the first expression format and the second expression format are different expression formats; if the first syntax element is a second preset value, determining that the spliced image is a homogeneous spliced image including the homogeneous blocks of the first expression format; if the first syntax element is a third preset value, determining that the spliced image is a homogeneous spliced image including the homogeneous blocks of the second expression format.
  • the first syntax element includes: a first sub-syntax element and a second sub-syntax element, and it is determined according to the first sub-syntax element and the second sub-syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image.
  • determining that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image based on the first syntax element includes: if the first sub-syntax element is a fourth preset value, determining that the spliced image includes homogeneous blocks of the first expression format; and/or, if the second sub-syntax element is a fifth preset value, determining that the spliced image includes homogeneous blocks of the second expression format; wherein the first expression format and the second expression format are different expression formats.
  • determining according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image further includes: if the first sub-syntax element is a sixth preset value, determining that the spliced image does not include the homogeneous blocks of the first expression format; if the second sub-syntax element is a seventh preset value, determining that the spliced image does not include the homogeneous blocks of the second expression format.
  • the first syntax element is located in a parameter set sub-codestream of the codestream.
  • the spliced image sequence parameter set corresponding to the spliced image includes the first syntax element.
  • when it is determined based on the first syntax element that the spliced image is a heterogeneous hybrid spliced image, the spliced image information also includes a second syntax element, and the expression format of the i-th block in the spliced image is determined based on the second syntax element.
  • determining the expression format of the i-th block in the spliced image based on the second syntax element includes: if the second syntax element is an eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is a ninth preset value, determining that the expression format of the i-th block is the second expression format.
  • the second syntax element is located in the block data unit header of the i-th block of the spliced image.
  • the encoding unit 1103 is configured to: if the expression format of the i-th block is the first expression format, determine that the sub-tiles in the i-th block are encoded using the encoding method corresponding to the first expression format to obtain the code stream corresponding to the visual media content of the first expression format; if the expression format of the i-th block is the second expression format, determine that the sub-tiles in the i-th block are encoded using the encoding method corresponding to the second expression format to obtain the code stream corresponding to the visual media content of the second expression format.
  • the parameter set sub-code stream of the code stream includes a third syntax element, and the code stream corresponding to visual media content of at least one expression format included in the code stream is determined according to the third syntax element.
  • determining, according to the third syntax element, the code stream corresponding to visual media content of at least one expression format included in the code stream includes: if the third syntax element is a first value, determining that the code stream includes both a code stream corresponding to the visual media content in the first expression format and a code stream corresponding to the visual media content in the second expression format; if the third syntax element is a second value, determining that the code stream includes the code stream corresponding to the visual media content in the first expression format; if the third syntax element is a third value, determining that the code stream includes the code stream corresponding to the visual media content in the second expression format.
  • the third syntax element is used to indicate that the code stream includes a code stream corresponding to visual media content in at least two expression formats.
  • the encoding unit 1103 is used to encode the at least one spliced image to obtain a video compression sub-stream; encode the spliced image information of the at least one spliced image to obtain a spliced image information sub-stream; and synthesize the video compression sub-stream and the spliced image information sub-stream into the code stream.
  • the at least one expression format includes: at least one of multi-view video, point cloud, and mesh.
  • the heterogeneous hybrid spliced image includes at least one of the following: a single-attribute heterogeneous hybrid spliced image and a multi-attribute heterogeneous hybrid spliced image;
  • the homogeneous spliced image includes at least one of the following: a single-attribute homogeneous spliced image and a multi-attribute homogeneous spliced image.
  • FIG. 12 is a schematic block diagram of a decoding device provided by an embodiment of the present application.
  • the decoding device 120 is applied to a decoder. As shown in Figure 12, the decoding device 120 includes:
  • the decoding unit 1201 is used to decode the code stream to obtain the spliced image and the spliced image information, wherein the spliced image information includes a first syntax element, and it is determined according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image;
  • the splitting unit 1202 is configured to, when it is determined according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image, split the spliced image according to the spliced image information of the spliced image to obtain at least two types of homogeneous blocks, wherein the at least two types of homogeneous blocks correspond to different visual media content expression formats;
  • the splitting unit 1202 is configured to, when it is determined according to the first syntax element that the spliced image is a homogeneous spliced image, split the spliced image according to the spliced image information of the spliced image to obtain one type of homogeneous block;
  • the processing unit 1203 is configured to decode and reconstruct the homogeneous blocks to obtain visual media content in at least one expression format.
  • determining according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image includes: if the first syntax element is a first preset value, determining that the spliced image is a heterogeneous hybrid spliced image including homogeneous blocks of a first expression format and homogeneous blocks of a second expression format, where the first expression format and the second expression format are different expression formats; if the first syntax element is a second preset value, determining that the spliced image is a homogeneous spliced image including the homogeneous blocks of the first expression format; if the first syntax element is a third preset value, determining that the spliced image is a homogeneous spliced image including the homogeneous blocks of the second expression format.
  • the first syntax element includes: a first sub-syntax element and a second sub-syntax element, and it is determined according to the first sub-syntax element and the second sub-syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image.
  • determining that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image based on the first syntax element includes: if the first sub-syntax element is a fourth preset value, determining that the spliced image includes homogeneous blocks of the first expression format; and/or, if the second sub-syntax element is a fifth preset value, determining that the spliced image includes homogeneous blocks of the second expression format; wherein the first expression format and the second expression format are different expression formats.
  • determining according to the first syntax element that the spliced image is a heterogeneous hybrid spliced image or a homogeneous spliced image further includes: if the first sub-syntax element is a sixth preset value, determining that the spliced image does not include the homogeneous blocks of the first expression format; if the second sub-syntax element is a seventh preset value, determining that the spliced image does not include the homogeneous blocks of the second expression format.
  • the first syntax element is located in a parameter set sub-codestream of the codestream.
  • the spliced image sequence parameter set corresponding to the spliced image includes the first syntax element.
  • when it is determined based on the first syntax element that the spliced image is a heterogeneous hybrid spliced image, the spliced image information also includes a second syntax element, and the expression format of the i-th block in the spliced image is determined based on the second syntax element.
  • determining the expression format of the i-th block in the spliced image based on the second syntax element includes: if the second syntax element is an eighth preset value, determining that the expression format of the i-th block is the first expression format; if the second syntax element is a ninth preset value, determining that the expression format of the i-th block is the second expression format.
  • the second syntax element is located in the block data unit header of the i-th block of the spliced image.
  • the processing unit 1203 is configured to: if the expression format of the i-th block is the first expression format, determine that the sub-tiles in the i-th block are decoded and reconstructed using the decoding method corresponding to the first expression format to obtain the visual media content of the first expression format; if the expression format of the i-th block is the second expression format, determine that the sub-tiles in the i-th block are decoded and reconstructed using the decoding method corresponding to the second expression format to obtain the visual media content of the second expression format.
  • the parameter set sub-code stream of the code stream includes a third syntax element, and the code stream corresponding to visual media content of at least one expression format included in the code stream is determined according to the third syntax element.
  • determining, according to the third syntax element, the code stream corresponding to visual media content of at least one expression format included in the code stream includes: if the third syntax element is a first value, determining that the code stream includes both a code stream corresponding to the visual media content in the first expression format and a code stream corresponding to the visual media content in the second expression format; if the third syntax element is a second value, determining that the code stream includes the code stream corresponding to the visual media content in the first expression format; if the third syntax element is a third value, determining that the code stream includes the code stream corresponding to the visual media content in the second expression format.
  • the decoding unit 1201 is configured to determine, according to the third syntax element, that the code stream includes code streams corresponding to visual media content in at least two expression formats, and decode the code stream to obtain a heterogeneous hybrid spliced image.
  • the code stream includes a video compression sub-stream and a spliced image information sub-stream;
  • the decoding unit 1201 is used to decode the video compression sub-stream to obtain the at least one spliced image, and decode the spliced image information sub-stream to obtain the spliced image information of the at least one spliced image.
  • the at least one expression format includes: at least one of multi-view video, point cloud, and mesh.
  • the heterogeneous hybrid spliced image includes at least one of the following: a single-attribute heterogeneous hybrid spliced image and a multi-attribute heterogeneous hybrid spliced image;
  • the homogeneous spliced image includes at least one of the following: a single-attribute homogeneous spliced image and a multi-attribute homogeneous spliced image.
  • the software unit may be located in a mature storage medium in this field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, register, etc.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps in the above method embodiment in combination with its hardware.
  • Figure 13 is a schematic block diagram of the encoder provided by an embodiment of the present application. As shown in Figure 13, the encoder 1310 includes:
  • Figure 14 is a schematic block diagram of a decoder provided by an embodiment of the present application. As shown in Figure 14, the decoder 1410 includes:
  • the processor may include, but is not limited to:
  • DSP Digital Signal Processor
  • ASIC Application Specific Integrated Circuit
  • FPGA Field Programmable Gate Array
  • the memory includes but is not limited to:
  • Non-volatile memory may be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory. Volatile memory may be Random Access Memory (RAM), which is used as an external cache.
  • RAM Random Access Memory
  • SRAM Static Random Access Memory
  • DRAM Dynamic Random Access Memory
  • SDRAM Synchronous Dynamic Random Access Memory
  • DDR SDRAM Double Data Rate Synchronous Dynamic Random Access Memory
  • ESDRAM Enhanced Synchronous Dynamic Random Access Memory
  • SLDRAM Synchlink Dynamic Random Access Memory
  • DR RAM Direct Rambus Random Access Memory
  • each functional module in this embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software function modules.
  • FIG. 15 shows a schematic structural diagram of a coding and decoding system provided by an embodiment of the present application.
  • the encoding and decoding system 150 may include an encoder 1501 and a decoder 1502.
  • the encoder 1501 may be a device integrated with the encoding device described in the previous embodiment;
  • the decoder 1502 may be a device integrated with the decoding device described in the previous embodiment.
  • both the encoder 1501 and the decoder 1502 can splice homogeneous blocks of different expression formats into a heterogeneous hybrid spliced image and splice homogeneous blocks of the same expression format into a homogeneous spliced image, so that the encoding method and the decoding method are applicable to application scenarios containing visual media content in multiple expression formats; moreover, since homogeneous blocks of different expression formats are coded in one heterogeneous hybrid spliced image, the number of decoders invoked can be reduced, which lowers implementation cost and improves usability.
  • An embodiment of the present application also provides a chip for implementing the above encoding and decoding method.
  • the chip includes: a processor, configured to call and run a computer program from a memory, so that the electronic device installed with the chip executes the above encoding and decoding method.
  • Embodiments of the present application also provide a computer storage medium in which a computer program is stored.
  • when the computer program is executed by the first processor, the encoding method of the encoder is implemented; or, when the computer program is executed by the second processor, the decoding method of the decoder is implemented.
  • In other words, embodiments of the present application also provide a computer program product containing instructions which, when executed by a computer, cause the computer to perform the methods of the above method embodiments.
  • This application also provides a code stream, which is generated according to the above encoding method.
  • the code stream includes the above first syntax element, or includes a second syntax element and a third syntax element.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave).
  • the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • the available media may be magnetic media (such as floppy disks, hard disks, magnetic tapes), optical media (such as digital video discs (DVD)), or semiconductor media (such as solid state disks (SSD)), etc.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or may be integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separate.
  • a component shown as a unit may or may not be a physical unit; that is, it may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in various embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
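The first bullets above state that the code stream carries either the first syntax element, or a second and a third syntax element. As a rough sketch of such an either/or stream layout, a toy serializer could look like the following; the element names, the 1-byte mode flag, the fixed 32-bit field widths, and the byte order are all illustrative assumptions, not the application's actual bitstream syntax:

```python
import struct

def write_stream(use_first: bool, values: tuple) -> bytes:
    """Serialize a toy code stream: a 1-byte mode flag, then either one
    syntax element (mode 1) or two (mode 0), as little-endian uint32."""
    if use_first:
        (first_syntax_element,) = values
        return struct.pack("<BI", 1, first_syntax_element)
    second_syntax_element, third_syntax_element = values
    return struct.pack("<BII", 0, second_syntax_element, third_syntax_element)

def read_stream(buf: bytes):
    """Parse the toy stream back into (mode label, element values)."""
    if buf[0] == 1:
        return "first", struct.unpack_from("<I", buf, 1)
    return "second+third", struct.unpack_from("<II", buf, 1)
```

Real codecs signal such elements with entropy-coded bit fields rather than whole bytes; the fixed-width fields here only make the either/or structure of the stream explicit.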

Abstract

The present application relates to an encoding method and apparatus, a decoding method and apparatus, an encoder, a decoder, and a storage medium. For an application scenario involving visual media content in one or more expression formats, isomorphic blocks in different expression formats are spliced into a heterogeneous hybrid spliced picture, isomorphic blocks in the same expression format are spliced into an isomorphic spliced picture, and the resulting spliced pictures and spliced-picture information are written into a code stream. An isomorphic spliced picture (for example, at least one of a multi-view spliced picture, a point cloud spliced picture, and a mesh spliced picture) and a heterogeneous hybrid spliced picture are thus present in the code stream at the same time, so that the encoding and decoding methods are applicable to application scenarios with visual media content in a plurality of expression formats, which extends their scope of application. In addition, the code stream includes a first syntax element, which can improve the efficiency with which the decoding end decodes a spliced picture. Since isomorphic blocks in different expression formats are spliced into a heterogeneous hybrid spliced picture for encoding and decoding, the number of decoders invoked can be reduced, which lowers implementation cost and improves usability.
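The splicing described in the abstract can be sketched as a grouping step: isomorphic blocks whose expression formats are chosen for joint packing go into one heterogeneous hybrid spliced picture, while the remaining blocks are grouped per format into isomorphic spliced pictures. The block data, the names, and the grouping rule below are illustrative assumptions, not the application's actual packing algorithm:

```python
from collections import defaultdict

# Toy isomorphic blocks as (expression_format, block_id) pairs; the three
# format names follow the spliced-picture types listed in the abstract.
blocks = [
    ("multiview", "b0"), ("point_cloud", "b1"),
    ("multiview", "b2"), ("mesh", "b3"),
]

def splice(blocks, hybrid_formats):
    """Blocks whose format is in `hybrid_formats` are packed together into
    one heterogeneous hybrid spliced picture; the rest are grouped per
    format into isomorphic spliced pictures."""
    isomorphic = defaultdict(list)
    hybrid = []
    for fmt, block_id in blocks:
        if fmt in hybrid_formats:
            hybrid.append((fmt, block_id))
        else:
            isomorphic[fmt].append(block_id)
    return dict(isomorphic), hybrid

# One isomorphic multiview picture; point-cloud and mesh blocks share one
# hybrid picture, so a single decoder instance can serve both formats.
iso, hyb = splice(blocks, hybrid_formats={"point_cloud", "mesh"})
```

Packing blocks of different formats into one hybrid picture is what allows a single video decoder instance to serve several expression formats at once, which is the reduction in decoder count and implementation cost that the abstract points to.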
PCT/CN2022/105006 2022-07-11 2022-07-11 Encoding method and apparatus, decoding method and apparatus, encoder, decoder, and storage medium WO2024011386A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/105006 WO2024011386A1 (fr) 2022-07-11 2022-07-11 Encoding method and apparatus, decoding method and apparatus, encoder, decoder, and storage medium
TW112125658A TW202408245A (zh) 2023-07-10 Encoding/decoding method and apparatus, encoder, decoder, storage medium, and code stream

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/105006 WO2024011386A1 (fr) 2022-07-11 2022-07-11 Encoding method and apparatus, decoding method and apparatus, encoder, decoder, and storage medium

Publications (1)

Publication Number Publication Date
WO2024011386A1 true WO2024011386A1 (fr) 2024-01-18

Family

ID=89535246

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/105006 WO2024011386A1 (fr) 2022-07-11 2022-07-11 Encoding method and apparatus, decoding method and apparatus, encoder, decoder, and storage medium

Country Status (2)

Country Link
TW (1) TW202408245A (fr)
WO (1) WO2024011386A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112188180A (zh) * 2019-07-05 2021-01-05 浙江大学 Method and apparatus for processing sub-block images
US20210209347A1 (en) * 2020-01-02 2021-07-08 Sony Corporation Texture map generation using multi-viewpoint color images
CN114071116A (zh) * 2020-07-31 2022-02-18 阿里巴巴集团控股有限公司 Video processing method and apparatus, electronic device, and storage medium
CN114189697A (zh) * 2021-12-03 2022-03-15 腾讯科技(深圳)有限公司 Video data processing method and apparatus, and readable storage medium

Also Published As

Publication number Publication date
TW202408245A (zh) 2024-02-16

Similar Documents

Publication Publication Date Title
US11151742B2 (en) Point cloud data transmission apparatus, point cloud data transmission method, point cloud data reception apparatus, and point cloud data reception method
US11170556B2 (en) Apparatus for transmitting point cloud data, a method for transmitting point cloud data, an apparatus for receiving point cloud data and a method for receiving point cloud data
US20220159261A1 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
TWI523492B (zh) 在視訊寫碼中之非巢套式補充增強資訊訊息
JP2022500931A (ja) ポイントクラウドコーディングにおける改善された属性レイヤとシグナリング
TW201830965A (zh) 用於時間延展性支持之修改適應性迴路濾波器時間預測
US11968393B2 (en) Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device, and point cloud data receiving method
TW201729592A (zh) 視訊寫碼中具有非正方形預測單元之線性模型預測
KR20180051594A (ko) 개선된 컬러 재맵핑 정보 보충 강화 정보 메시지 프로세싱
BR112018007529B1 (pt) Alinhamento de grupo de amostra de ponto de operação em formato de arquivo de fluxos de bits de multicamada
KR102659806B1 (ko) V-pcc용 스케일링 파라미터
US10574959B2 (en) Color remapping for non-4:4:4 format video content
TWI713354B (zh) 用於顯示器調適之色彩重映射資訊sei信息發信號
CN115443652A (zh) 点云数据发送设备、点云数据发送方法、点云数据接收设备和点云数据接收方法
WO2023142127A1 (fr) Procédés et appareils de codage et de décodage, dispositif et support de stockage
WO2022166462A1 (fr) Procédé de codage/décodage et dispositif associé
CN113273193A (zh) 用于分块配置指示的编码器,解码器及对应方法
WO2023071557A1 (fr) Procédé et appareil d'encapsulation de fichier multimédia, dispositif et support de stockage
WO2024011386A1 (fr) Procédé et appareil de codage, procédé et appareil de décodage, et codeur, décodeur et support de stockage
WO2023044868A1 (fr) Procédé de codage vidéo, procédé de décodage vidéo, dispositif, système et support de stockage
WO2023201504A1 (fr) Procédé et appareil de codage, procédé et appareil de décodage, dispositif et support de stockage
WO2024077806A1 (fr) Procédé et appareil de codage, procédé et appareil de décodage, et codeur, décodeur et support de stockage
CN114846789A (zh) 用于指示条带的图像分割信息的解码器及对应方法
WO2024077616A1 (fr) Procédé de codage et de décodage et appareil de codage et de décodage, dispositif et support de stockage
WO2024093305A1 (fr) Procédé et appareil de codage d'images, et procédé et appareil de décodage d'images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22950517

Country of ref document: EP

Kind code of ref document: A1