WO2023050396A1 - Methods and devices for generating, processing, encoding and decoding multi-plane images - Google Patents

Methods and devices for generating, processing, encoding and decoding multi-plane images

Info

Publication number
WO2023050396A1
WO2023050396A1 · PCT/CN2021/122390 · CN2021122390W
Authority
WO
WIPO (PCT)
Prior art keywords
pmpi
data
smpi
scene
frame
Prior art date
Application number
PCT/CN2021/122390
Other languages
English (en)
French (fr)
Inventor
杨铀
蒋小广
刘琼
Original Assignee
Guangdong OPPO Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong OPPO Mobile Telecommunications Corp., Ltd.
Priority to KR1020247014022A priority Critical patent/KR20240089119A/ko
Priority to PCT/CN2021/122390 priority patent/WO2023050396A1/zh
Priority to CN202180102720.7A priority patent/CN117999582A/zh
Publication of WO2023050396A1 publication Critical patent/WO2023050396A1/zh
Priority to US18/609,944 priority patent/US20240223767A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 … characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/134 … characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/169 … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 … the unit being an image region, e.g. an object
    • H04N19/172 … the region being a picture, frame or field
    • H04N19/186 … the unit being a colour or a chrominance component
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/463 … by compressing encoding parameters before transmission
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering

Definitions

  • Embodiments of the present disclosure relate to, but are not limited to, image processing technologies, and more specifically to methods and devices for generating, processing, encoding, and decoding multi-plane images.
  • A multiplane image (MPI) is a non-redundant scene representation method.
  • MPI decomposes the scene into a series of layers, which are planar layers or spherical layers.
  • The depth range [dmin, dmax] of an MPI needs to be set in advance according to the depth-of-field data of the scene.
  • dmin is the minimum depth, that is, the distance from the layer closest to the reference viewpoint to that viewpoint.
  • dmax is the maximum depth, that is, the distance from the layer farthest from the reference viewpoint to that viewpoint.
  • Each layer in an MPI is divided into two parts: a color map (Color frame) and a transparency map (Transparency frame).
  • The color map and transparency map of a layer contain, respectively, the texture information and transparency information of the scene at the position of that layer.
  • MPI can be used for immersive video, but the effect needs to be improved.
  • An embodiment of the present disclosure provides a method for generating a multi-plane image, including:
  • The PMPI includes a plurality of sub-multi-plane images (sMPI) respectively representing a plurality of the scene regions, where the starting depth of each sMPI is determined at least according to the depth information of the scene region represented by that sMPI.
  • An embodiment of the present disclosure also provides a data processing method for a multi-plane image, including:
  • the PMPI includes a plurality of sub-multi-plane images sMPI to respectively represent a plurality of scene areas into which the three-dimensional scene is divided;
  • The original storage data of the PMPI is converted into encapsulated and compressed storage (PCS) data, and the PCS data is used to determine the depths of the effective layers of each pixel in the PMPI and the color and transparency of the pixel on those effective layers.
  • An embodiment of the present disclosure also provides a method for encoding a multi-plane image, including:
  • the PMPI includes a plurality of sub-multi-plane images sMPI that respectively represent a plurality of scene areas into which the three-dimensional scene is divided;
  • the PCS data includes image parameters and the data in the texture attribute part and the transparency attribute part;
  • the PMPI is encoded based on the PCS data to obtain encoded image parameters and atlas data.
  • An embodiment of the present disclosure also provides a decoding method for a multi-plane image, including:
  • the PMPI includes a plurality of sub-multi-plane images sMPI that respectively represent multiple scene areas into which the three-dimensional scene is divided;
  • the encoded code stream includes image parameters and atlas data of the PMPI.
  • An embodiment of the present disclosure also provides a code stream, wherein the code stream is generated by encoding a patch multi-plane image PMPI and includes image parameters and atlas data of the PMPI; the PMPI includes a plurality of sub-multiplanar images sMPI that respectively represent a plurality of scene regions into which the three-dimensional scene is divided.
  • An embodiment of the present disclosure also provides a device for generating a multi-plane image, including a processor and a memory storing a computer program, wherein the processor, when executing the computer program, implements the method for generating a multi-plane image described in any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a multi-plane image data processing device, including a processor and a memory storing a computer program, wherein the processor, when executing the computer program, implements the data processing method for a multi-plane image described in any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a multi-plane image encoding device, including a processor and a memory storing a computer program, wherein the processor, when executing the computer program, implements the encoding method for multi-plane images described in any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a multi-plane image decoding device, including a processor and a memory storing a computer program, wherein the processor, when executing the computer program, implements the decoding method for multi-plane images described in any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method described in any embodiment of the present disclosure.
  • FIG. 1 is a schematic structural diagram of an exemplary MPI composed of four plane layers
  • FIGS. 2A to 2F are schematic diagrams of six consecutive plane layers in an exemplary MPI, showing a color map and a transparency map of each plane layer;
  • Fig. 3 is a schematic diagram of representing a three-dimensional scene using common MPI
  • FIG. 4 is a schematic diagram of using PMPI to characterize a three-dimensional scene according to an embodiment of the present disclosure
  • Fig. 5 is the flowchart of the PMPI generation method of an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of determining the initial depth of sMPI through pooling according to an embodiment of the present disclosure
  • FIG. 7 is a schematic diagram of an exemplary PMPI generated using an embodiment of the present disclosure.
  • Fig. 8 is a schematic diagram of a video compression process
  • Fig. 9 is a schematic diagram of a kind of PCS data converted from MPI original storage data
  • Fig. 10 is a flowchart of a data processing method of PMPI according to an embodiment of the present disclosure
  • Fig. 11 is a schematic diagram of a kind of PCS data converted from PMPI original storage data according to an embodiment of the present disclosure
  • FIG. 12 is a schematic diagram of another PCS data converted from PMPI original storage data according to an embodiment of the present disclosure.
  • FIG. 13 is a schematic diagram of another PCS data converted from PMPI original storage data according to an embodiment of the present disclosure.
  • FIG. 14 is a schematic structural diagram of a PMPI encoding device according to an embodiment of the present disclosure.
  • FIG. 15 is a flow chart of a PMPI encoding method according to an embodiment of the present disclosure.
  • Fig. 16 is a flowchart of a PMPI decoding method according to an embodiment of the present disclosure
  • FIG. 17 is a schematic diagram of an apparatus for generating PMPI according to an embodiment of the present disclosure.
  • Words such as “exemplary” or “for example” are used to indicate an example, instance, or illustration. Any embodiment described in this disclosure as “exemplary” or “for example” should not be construed as preferred or advantageous over other embodiments.
  • “And/or” herein describes an association relationship between objects and indicates that three relationships are possible; for example, “A and/or B” can mean: A alone, both A and B, or B alone.
  • “A plurality” means two or more than two.
  • Words such as “first” and “second” are used to distinguish between identical or similar items having substantially the same function and effect. Those skilled in the art will understand that such words do not limit the number or execution order, nor do they necessarily imply that the items are different.
  • Multiplanar Imagery is a hierarchical representation of 3D scenes without redundancy.
  • A 3D scene is decomposed into a set of planar or spherical layers sampled at different depths from a given reference viewpoint. Each layer is obtained by projecting the part of the 3D scene contained around the layer's position onto the same reference camera.
  • This reference camera is located at the given reference viewpoint.
  • When using planar layers, the reference camera is a perspective camera; when using spherical layers, the reference camera is a spherical (usually equirectangular) camera.
  • MPI decomposes the scene into a series of plane layers or spherical layers. Take as an example an MPI composed of planar layers that are fronto-parallel and located at different depths with respect to the reference viewpoint.
  • the depth range [dmin, dmax] of the plane layer needs to be set in advance according to the depth range of the actual scene.
  • If the MPI includes S plane layers and the size of each plane layer is W × H, the size of the MPI can be expressed as W × H × S.
  • W is the number of pixels in the width direction of the MPI
  • H is the number of pixels in the height direction of the MPI
  • the MPI contains W ⁇ H pixels
  • the planar image resolution is W ⁇ H.
  • The exemplary MPI shown in Figure 1 includes 4 layers, but the number of plane layers or spherical layers that an MPI includes, that is, the number of layers, can also be 2, 3, 5, or more than 5, such as 100, 200, and so on.
  • Each layer of MPI includes a color map and a transparency map, which are used to record the color and transparency of pixels on this layer. A pixel can have different colors and transparency on different layers.
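As an illustrative sketch only (the class and field names below are our own, not from the disclosure), an MPI frame of size W × H × S can be modeled as S layers, each holding a W × H color map and a W × H transparency map:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Layer:
    # color[y][x] is an (R, G, B) tuple; transparency[y][x] is in [0, 1].
    color: List[List[Tuple[int, int, int]]]
    transparency: List[List[float]]

@dataclass
class MPI:
    d_min: float          # depth of the layer closest to the reference viewpoint
    d_max: float          # depth of the layer farthest from it
    layers: List[Layer]   # S layers; the MPI "size" is W x H x S

def make_empty_mpi(width: int, height: int, num_layers: int,
                   d_min: float, d_max: float) -> MPI:
    """Build a fully transparent MPI (every sampling point invalid)."""
    layers = [Layer(color=[[(0, 0, 0)] * width for _ in range(height)],
                    transparency=[[0.0] * width for _ in range(height)])
              for _ in range(num_layers)]
    return MPI(d_min, d_max, layers)
```

A pixel (x, y) then simply has one color and one transparency entry per layer, matching the statement that a pixel can differ across layers.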
  • MPI is a hierarchical representation of a 3D scene, that is, a sampling of the 3D scene.
  • The points on the MPI plane layers are sampling points. From the examples in Figure 2A to Figure 2F, it can be seen that most of the sampling points in this MPI are located at invalid positions of the 3D scene: these positions have no visible surface, and their transparency is 0. Only a small number of sampling points are located in the valid areas of the 3D scene; at the positions of these valid areas there are visible surfaces, and the transparency is not 0.
  • MPI can be used for immersive video. From the perspective of the immersive experience, it is the effective areas of the 3D scene that play the decisive role. However, most of the sampling points of MPI are wasted, resulting in low sampling efficiency and a lower resolution of the final immersive video.
  • the depth range [dmin,dmax] of MPI is set according to the global depth of the scene.
  • the depth range is enough to cover most of the effective information of the scene.
  • the depth of the layer closest to the reference viewpoint is called the initial depth dmin of MPI.
  • the depth of the layer farthest from the reference viewpoint may be referred to as the termination depth dmax of the MPI.
  • the parallel lines in the figure are used to indicate the depth position of each layer in the MPI in the three-dimensional scene.
  • Since one geometry in this scene is far away from the others, in order to represent the main information of the scene (four geometries), the MPI must use a larger depth range, and the resulting plane layers (four in the figure as an example) are relatively sparse. For the three geometries located in the foreground region, valid information only appears on the two deeper plane layers of the MPI. The MPI is therefore less efficient at sampling.
  • an embodiment of the present disclosure proposes an MPI with depth adaptive change characteristics.
  • In contrast to the common MPI shown in FIG. 1 and FIG. 3, that is, the MPI representing the entire 3D scene, the MPI proposed by the embodiments of the present disclosure is called a patch multiplane image (PMPI: Patch multiplane image).
  • The PMPI of the embodiments of the present disclosure is an MPI that includes a plurality of sub-multiplane images (sMPI: sub multiplane image) that respectively represent a plurality of scene areas into which the three-dimensional scene is divided; each sMPI includes multiple layers obtained by sampling the scene area it represents at different depths, and the starting depth of each sMPI is determined at least according to the depth information of the scene area represented by that sMPI.
  • PMPI can be regarded as an extension of ordinary MPI.
  • The basic unit of ordinary MPI is multiple layers of the same size, which are used to represent a complete 3D scene, while PMPI uses multiple sMPIs to represent the multiple scene areas into which a 3D scene is divided.
  • Each scene area can be regarded as a block of the 3D scene, and each sMPI can also be regarded as a block of the PMPI, so the PMPI is a hierarchical block representation of the 3D scene.
  • A single sMPI is also a kind of MPI, but it represents a scene area divided from the 3D scene; the scene area can itself be regarded as a 3D scene, though its size and shape differ from the original 3D scene.
  • the way sMPI characterizes the scene area can still use the way ordinary MPI characterizes the three-dimensional scene.
  • An sMPI includes multiple layers sampled at different depths in the scene area. The multiple layers have the same size and shape, each layer includes a color map and a transparency map, and the multiple layers can be distributed according to set rules (such as equal depth spacing or equal disparity spacing), and so on. Similar to MPI, the depth range of an sMPI is also set according to the principle of including most of the valid information in the scene area.
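As a minimal sketch of the two distribution rules just mentioned (function and parameter names are illustrative; "equal disparity" here is taken to mean uniform spacing in 1/depth, which places more layers near the viewpoint):

```python
def layer_depths(d_min, d_max, num_layers, rule="equal_depth"):
    """Depths of num_layers layers between d_min and d_max (inclusive).

    rule: "equal_depth"     - layers uniformly spaced in depth;
          "equal_disparity" - layers uniformly spaced in 1/depth.
    """
    if num_layers == 1:
        return [d_min]
    if rule == "equal_depth":
        step = (d_max - d_min) / (num_layers - 1)
        return [d_min + k * step for k in range(num_layers)]
    if rule == "equal_disparity":
        # Uniform steps in disparity (inverse depth), from 1/d_min down to 1/d_max.
        inv_step = (1.0 / d_min - 1.0 / d_max) / (num_layers - 1)
        return [1.0 / (1.0 / d_min - k * inv_step) for k in range(num_layers)]
    raise ValueError(f"unknown rule: {rule}")
```

Given an sMPI's start depth, end depth, layer count, and rule, this yields the per-layer depths at which the scene area is sampled.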
  • The end depths of the multiple sMPIs in a PMPI can be set to be the same, while the start depths are set differently according to the depth information of the represented scene areas, so as to introduce the depth information of the scene, increase the adaptive ability with respect to scene depth, and place more sampling points at valid locations in the scene.
  • the 3D scene and reference viewpoint shown in FIG. 4 are the same as those in FIG. 3 .
  • the number of scene areas is set to 2
  • A vertical plane is used to divide the 3D scene into two scene areas. One of them includes the two geometries located on the left side of the reference viewpoint; this scene area is hereinafter referred to as the first scene area. One geometry in the first scene area is closest to the reference viewpoint, and the other geometry is farther away from it.
  • Another scene area includes two geometries located on the right side of the reference viewpoint.
  • this scene area is called the second scene area.
  • the two geometries in the second scene area are far away from the reference viewpoint.
  • The above-mentioned first scene area and second scene area are each represented by one sMPI, and each sMPI contains 4 plane layers. Because the depth difference between the two geometries in the first scene area is relatively large, the scene depth range of the sMPI used to represent the first scene area needs to be set larger.
  • The depth difference between the two geometries in the second scene area is small, and both are close to the rear, so the depth range of the sMPI used to characterize the second scene area can be set relatively small.
  • the start depth of the sMPI used to characterize the first scene area is smaller, and the start depth of the sMPI used to characterize the second scene area is larger.
  • The PMPI representation of the resulting 3D scene is shown in Fig. 4. It can be seen from the figure that the 4 layers of the sMPI used to represent the second scene area become dense and are all located near the geometries, so compared with the MPI in Figure 3, the PMPI has more sampling points located at valid positions. That is to say, when the layered, patch-based PMPI is used to represent the scene, the sampling efficiency is higher than that of the ordinary layered MPI representation. It should be noted that the division of the three-dimensional scene in the example shown in FIG. 4 is only schematic, intended to illustrate the difference between PMPI and common MPI with a simple example.
  • An embodiment of the present disclosure provides a method for generating a multi-plane image, as shown in FIG. 5 , including:
  • Step 310: dividing the 3D scene into multiple scene areas;
  • Step 320: generating a patch multi-plane image PMPI, where the PMPI includes a plurality of sub-multi-plane images sMPI respectively representing a plurality of the scene regions, and the starting depth of each sMPI is determined at least according to the depth information of the scene region represented by that sMPI.
  • Determining the starting depth of the sMPI at least according to the depth information of the scene region represented by the sMPI includes: determining the starting depth of the sMPI according to a minimum depth of a first region, where the first region is the scene area represented by the sMPI, or the first region is an area jointly formed by the scene area represented by the sMPI and adjacent areas of that scene area.
  • The adjacent area of the scene area represented by the sMPI may include one or more scene areas located around that scene area in the 3D scene; alternatively, the adjacent area may be an area composed of multiple rows of pixels and/or multiple columns of pixels surrounding the scene area represented by the sMPI, which is not required to be a complete scene area.
  • When the starting depth of the sMPI is determined according to the minimum depth of the first region, the starting depth of the sMPI may be set to the minimum depth of the first region.
  • Alternatively, a depth slightly smaller than the minimum depth of the first region may be chosen as the starting depth of the sMPI, for example by subtracting a set value from the minimum depth, or by subtracting from the minimum depth the product of the minimum depth and a set ratio, so that the determined starting depth of the sMPI has a certain margin.
  • The value of the above minimum depth may be the minimum depth value, in the depth map of the 3D scene, of the scene area represented by the sMPI, or may be a value obtained by rounding, normalizing, etc. that minimum depth value for coding.
  • Depth values in 3D scenes can be represented by grayscale values.
  • the first area is the scene area represented by the sMPI
  • the starting depth of the sMPI is determined according to the minimum depth of the scene area represented by the sMPI.
  • Because the depth values in the 3D scene depth map may have deviations, when the starting depth of an sMPI is determined according to the minimum depth of the region jointly formed with adjacent areas, the determined start depth is always less than or equal to the start depth determined only according to the minimum depth of the scene region represented by the sMPI. The impact of the above deviations can thus be avoided as much as possible, so that the sMPI can completely sample the effective areas in the scene area it represents.
  • In actual operation, the initial depth of each sMPI may be calculated by pooling.
  • Assume a 3D scene is divided into 36 scene areas by a 6 × 6 grid.
  • The solid-line area on the left side of Figure 6 represents the original depth map of the 3D scene, which is likewise divided into 36 scene areas; each grid cell in the figure represents one scene area.
  • The minimum depth of each scene area can be determined from the depth information of that area in the depth map.
  • The pooling size is 5 × 5 and the pooling step is 1. So that the number of grid cells in the pooled depth map remains 6 × 6, the original depth map is expanded around its center.
  • The expanded depth map includes 10 × 10 grid cells; the expanded cells are drawn with dotted lines, and the minimum depth of each expanded cell is copied from the minimum depth of the closest cell in the original depth map (the distance between two cells can be taken as the length of the line connecting their centers).
  • After pooling, the minimum depth of each cell in the depth map equals the minimum depth over the corresponding 5 × 5 neighborhood, and this value is set as the starting depth of the sMPI characterizing that cell (i.e. that scene area). This realizes the operation of determining the starting depth of an sMPI according to the minimum depth of the area jointly formed by the scene area represented by the sMPI and its adjacent areas.
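The min-pooling described above can be sketched as follows (a pure-Python sketch with our own names; clamping out-of-range indices to the nearest in-range cell stands in for the dotted-line expansion, since each expanded cell copies the minimum depth of the closest original cell):

```python
def starting_depths(min_depth, pool=5):
    """min_depth: 2-D list (e.g. 6x6) with the minimum depth of each scene
    area.  Returns a same-sized grid where each entry is the minimum over a
    pool x pool neighbourhood (stride 1); out-of-range neighbours are
    clamped to the nearest in-range cell, mimicking the border expansion.
    Each output entry is then the starting depth of the corresponding sMPI."""
    rows, cols = len(min_depth), len(min_depth[0])
    r = pool // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = []
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii = min(max(i + di, 0), rows - 1)  # clamp = copy nearest cell
                    jj = min(max(j + dj, 0), cols - 1)
                    vals.append(min_depth[ii][jj])
            out[i][j] = min(vals)
    return out
```

With pool=5 on a 6 × 6 grid this reproduces the 10 × 10 expansion plus 5 × 5 min-pooling described in the text; the small 3 × 3 / pool=3 case is easy to check by hand.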
  • the sMPI includes multiple layers sampled at different depths of the represented scene region, and each of the multiple layers includes a color map and a transparency map.
  • Step 1: determining the start depths and end depths of the plurality of sMPIs, wherein the start depths of the plurality of sMPIs are respectively determined at least according to the depth information of the scene areas they represent, and the termination depths of the plurality of sMPIs are set to be the same;
  • For the termination depth, the PMPI can be regarded as an ordinary MPI, and a common termination depth is set for the multiple sMPIs in the PMPI in the same way as the termination depth of an ordinary MPI is set.
  • Step 2: for each of the plurality of sMPIs, determining the depth of each layer included in the sMPI according to the start depth and end depth of the sMPI, the number of layers of the sMPI, and the distribution rule of the layers, and sampling the scene area represented by the sMPI at the depth of each layer to obtain the color map and transparency map of each layer included in the sMPI.
  • The multiple sMPIs correspond one-to-one to the multiple scene regions, and the number of layers and the layer distribution rule of the multiple sMPIs are set to be the same, so as to simplify processing and improve coding efficiency.
  • The distribution rule of the layers may be, for example, equal depth spacing or equal disparity spacing, but the present disclosure is not limited thereto.
  • the number of layers and layer distribution rules of the multiple sMPIs in the PMPI may also be different. In this case, some coding complexity will be added, but the 3D scene can be represented more flexibly.
  • the multiple layers may be planar layers or spherical layers.
  • Dividing the 3D scene into multiple scene areas includes: dividing the 3D scene into multiple scene areas according to a preset scene division rule, wherein the scene division rule may determine one or any combination of the following information about the divided scene areas: the number of scene areas, the shape of the scene areas, the size of the scene areas, and the position of the scene areas; wherein the sizes of the multiple scene areas are the same, and the shapes of the multiple scene areas are regular shapes, irregular shapes, or a combination thereof, the regular shapes including one or any combination of triangles, rectangles, pentagons, and hexagons.
  • Dividing the 3D scene into a plurality of scene areas according to a preset scene division rule includes: dividing the 3D scene into M × N scene areas using an M × N grid, where M and N are positive integers and M × N ≥ 2.
  • the divided M ⁇ N scene areas are rectangular areas with the same size.
  • This division method makes it easy to determine the scene area in which each pixel of the PMPI is located from the pixel's coordinates (for example, by looking up a table or by a simple formula), with no need to additionally identify the scene area or the sMPI to which the pixel belongs.
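For an M × N grid of equal-sized rectangles, the "simple formula" can look like the sketch below (the row-major numbering and the n_cols/n_rows convention are our illustrative assumptions; the disclosure does not fix them):

```python
def scene_area_index(x: int, y: int, width: int, height: int,
                     n_cols: int, n_rows: int) -> int:
    """Index of the scene area (and hence the sMPI) containing pixel (x, y)
    of a width x height PMPI divided by an n_cols x n_rows grid of
    equal-sized rectangular areas.  Areas are numbered row-major from 0."""
    col = x * n_cols // width    # integer division picks the grid column
    row = y * n_rows // height   # and the grid row
    return row * n_cols + col
```

For a 60 × 60 image on a 6 × 6 grid, pixel (0, 0) falls in area 0 and pixel (59, 59) in area 35, so no per-pixel area identifier needs to be stored.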
  • the present disclosure is not limited to this division method.
  • the generated PMPI original storage data includes frame parameters and frame data of a PMPI frame
  • the frame parameters of the PMPI frame in the original storage data include one or any combination of the following parameters:
  • the distribution rule of the layer uniformly set for sMPI in the PMPI frame
  • the termination depth uniformly set for sMPI in PMPI frames
  • the frame data of each PMPI frame in the original storage data includes: the color map data and the transparency map data of each layer in each sMPI included in the PMPI frame.
  • Compared with ordinary MPI, the starting depth of the PMPI generated by the embodiments of the present disclosure is more flexible and can change adaptively as the depth of field varies across different regions of the scene.
  • As a result, the sampling points of the PMPI gather on the visible surfaces of the scene, and the sampling efficiency is improved.
  • the distribution of plane layers of PMPI is denser on the whole, which is equivalent to ordinary MPI with more layers, but the number of sampling points does not increase.
  • A denser depth layering makes the final immersive video generated from the PMPI retain more detail and achieve better quality.
  • MPI can be displayed as immersive video after video compression.
  • Figure 8 shows the corresponding video processing.
  • the three-dimensional scene images (such as images captured by a 3D camera) collected by the video acquisition device are preprocessed to obtain MPI, and the MPI is compressed and encoded and then transmitted as a code stream.
  • the code stream is decoded and post-processed, and displayed and played in the form of immersive video.
  • MPEG Moving Picture Experts Group
  • RV Moving Picture Experts Group
  • PCS Encapsulated and compressed storage
  • Image data and reference viewpoint camera parameters can be used as the input of the immersive video test model (TMIV: Test model of immersive video) in MPEG.
  • TMIV Test model of immersive video
  • Taking an MPI with an image resolution of W × H and S layers (that is, an MPI frame of size W × H × S) as an example, it can be converted into PCS data.
  • The PCS data records the relevant parameters of each pixel in the MPI, including:
  • N_{i,j}: the number of effective layers of pixel (i, j);
  • C_{i,j,k}: the color data (e.g., the color value) at the k-th effective layer of pixel (i, j);
  • D_{i,j,k}: the index of the k-th effective layer of pixel (i, j), D_{i,j,k} ∈ [1, S];
  • T_{i,j,k}: the transparency value at the k-th effective layer of pixel (i, j).
  • Pixel (i, j) is contained in all S layers of the MPI; a layer in which the transparency value of pixel (i, j) is nonzero is an effective layer of pixel (i, j).
  • FIG. 9 shows an example of PCS data of MPI encapsulated according to the above parameters.
  • the start depth and end depth of multiple MPI frames within a set time are the same, and the distribution rules of multiple layers in MPI are known.
  • the depth of each layer in the MPI can be calculated according to the starting depth and the ending depth.
  • The start depth and end depth of the MPI frame can be recorded in the frame parameters of the MPI frame and do not need to be written into the PCS data of a single MPI frame, so the PCS data of a single MPI does not need to additionally record the depth information of the effective layers of the pixels.
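The per-pixel PCS record described above can be sketched as follows. This is an illustrative encoding, not the normative file layout of Figure 9: the function and dictionary key names are our own, and color values are treated as opaque scalars for simplicity.

```python
def mpi_to_pcs(color, alpha):
    """Pack one MPI frame into per-pixel PCS-style records.

    color[s][i][j] -> color value of pixel (i, j) on layer s,
    alpha[s][i][j] -> transparency value of pixel (i, j) on layer s.
    """
    S = len(alpha)
    H, W = len(alpha[0]), len(alpha[0][0])
    records = []
    for i in range(H):
        for j in range(W):
            # Effective layers: layers where the pixel's transparency is nonzero.
            eff = [s for s in range(S) if alpha[s][i][j] != 0]
            records.append({
                "N": len(eff),                        # number of effective layers
                "C": [color[s][i][j] for s in eff],   # color per effective layer
                "D": [s + 1 for s in eff],            # 1-based layer index
                "T": [alpha[s][i][j] for s in eff],   # transparency per effective layer
            })
    return records
```

Because only effective layers are stored, pixels lying in large transparent regions contribute almost no data, which is the point of the PCS conversion.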
  • PMPI can also be compressed and encoded as a video frame.
  • PMPI can be directly generated according to the image of the three-dimensional scene, or it can be generated on the basis of ordinary MPI.
  • Before encoding a PMPI, it is also necessary to convert the original storage data of the PMPI into PCS data.
  • PMPI includes multiple sMPIs.
  • In an sMPI, the depth of the layer closest to the reference viewpoint is the starting depth of the sMPI, and the depth of the layer farthest from the reference viewpoint is the termination depth of the sMPI; the depths of the other layers lie between the starting depth and the termination depth, and the layers may be distributed according to a set rule, such as equal depth spacing or equal disparity spacing. Therefore, once the starting depth and termination depth of an sMPI are known, the depth of each layer of the sMPI can be calculated.
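The layer-depth calculation above can be sketched as follows. This is a minimal illustration under assumed distribution rules: the rule names "equal_depth" and "equal_disparity" are our own labels, not terms from the standard.

```python
def smpi_layer_depths(d_start, d_end, num_layers, rule="equal_depth"):
    """Recover per-layer depths of an sMPI from its starting depth,
    termination depth, layer count, and distribution rule."""
    if num_layers == 1:
        return [d_start]
    if rule == "equal_depth":           # layers evenly spaced in depth
        step = (d_end - d_start) / (num_layers - 1)
        return [d_start + k * step for k in range(num_layers)]
    if rule == "equal_disparity":       # layers evenly spaced in 1/depth
        inv_step = (1 / d_start - 1 / d_end) / (num_layers - 1)
        return [1 / (1 / d_start - k * inv_step) for k in range(num_layers)]
    raise ValueError(rule)
```

Equal-disparity spacing places more layers near the viewpoint, which matches how perceived depth resolution falls off with distance.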
  • In an embodiment of the present disclosure, the termination depths of the different sMPIs in a PMPI are set to be the same, while the starting depth of each sMPI in the PMPI is related to the depth information of the scene area it represents and is not preset and fixed. Therefore, when converting the original storage data of the PMPI of the embodiment of the present disclosure into PCS data, the starting depth information needs to be provided so that the decoding end can accurately calculate the depths of the effective layers of the pixels.
  • an embodiment of the present disclosure provides a data processing method for a multi-plane image, as shown in FIG. 10 , including:
  • Step 410 acquire the original storage data of the block multi-plane image PMPI, the PMPI includes a plurality of sub-multi-plane images sMPI to respectively represent a plurality of scene areas divided into three-dimensional scenes;
  • Step 420 converting the original storage data of the PMPI into encapsulated and compressed storage PCS data, the PCS data is used to determine the depth of the effective layer of the pixel in the PMPI and the color and transparency of the pixel on the effective layer.
  • The PMPI is generated using the generation method described in any embodiment of the present disclosure; each pixel in the PMPI is included in one sMPI, and each of the multiple layers included in that sMPI records the color value and transparency value of the pixel.
  • The pixels in an ordinary MPI are included in all layers of the ordinary MPI, whereas the PMPI is divided into blocks, so each pixel in the PMPI is included in all layers of one sMPI.
  • the sMPI containing the pixel is called the sMPI where the pixel is located.
  • the color value and transparency value of the pixel are recorded in all layers of the sMPI where the pixel is located, but only some of these layers may be valid layers for the pixel.
  • the effective layer of a pixel in MPI may be a layer in MPI whose transparency of the pixel is greater than a set threshold (eg, 0).
  • the effective layer of the pixel in the PMPI in the embodiment of the present disclosure may follow the provisions in the above standards.
  • An effective layer of a pixel in the PMPI refers to a layer in the sMPI containing that pixel in which the transparency of the pixel is greater than a set threshold (such as 0).
  • An effective layer of a pixel can have one or more layers, depending on the actual scene.
  • the PCS data includes frame data and frame parameters of a PMPI frame.
  • the frame data of a PMPI frame in the PCS data includes:
  • the following data of each pixel in the PMPI frame: the color data and transparency data of the pixel on each effective layer, and the layer index of the effective layer in the sMPI where the pixel is located.
  • the frame data of a PMPI frame in the PCS data includes:
  • the following data of each pixel in the PMPI frame: the index of the sMPI where the pixel is located, the color data and transparency data of the pixel on each effective layer, and the layer index of the effective layer in the sMPI where the pixel is located.
  • The frame data of a PMPI frame in the PCS data includes the following data of each pixel in the PMPI frame: the starting depth of the sMPI where the pixel is located; and the color data, transparency data, and layer index (in the sMPI where the pixel is located) of each effective layer of the pixel.
  • A parameter may be added to the frame data of a PMPI frame in the PCS data, namely the number of effective layers of each pixel in the PMPI frame. Adding this parameter helps improve the efficiency of data encoding and parsing.
  • the PCS data also includes frame parameters of the PMPI frame, and the frame parameters of the PMPI frame in the PCS data include one or any combination of the following parameters:
  • the distribution rule of the layer uniformly set for sMPI in the PMPI frame
  • the termination depth uniformly set for sMPI in PMPI frames
  • the frame parameters of the PMPI frame in this example can be applied to the embodiments shown in FIG. 11 , FIG. 12 and FIG. 13 , and will not be repeated below.
  • the first PCS data format applicable to the PMPI of the embodiment of the present disclosure is proposed.
  • the frame data of a PMPI in the PCS data includes:
  • the image resolution of the PMPI is W ⁇ H
  • M ⁇ N grid division is adopted
  • the number of sMPIs included in the PMPI is M ⁇ N
  • the number of layers of each sMPI is S.
  • the PCS data format of the PMPI is as shown in Figure 11, and the frame data of a PMPI frame in the PMPI includes:
  • DP_{x,y}: characterizes the starting depth of the sMPI of the scene area represented by grid (x, y), x ∈ [1, M], y ∈ [1, N];
  • N_{i,j}: the number of effective layers of pixel (i, j), i ∈ [1, H], j ∈ [1, W];
  • C_{i,j,k}: the color data (e.g., the color value) at the k-th effective layer of pixel (i, j);
  • D_{i,j,k}: the index of the k-th effective layer of pixel (i, j), D_{i,j,k} ∈ [1, S];
  • T_{i,j,k}: the transparency data (e.g., the transparency value) at the k-th effective layer of pixel (i, j).
  • The starting depth of each sMPI in the PMPI frame is written into the frame data of the PMPI frame. The sMPI where pixel (i, j) is located can be determined from i, j and the division rule; combined with the termination depth, number of layers, and layer distribution rule of the sMPIs in the frame parameters, the depth of each layer of the sMPI where pixel (i, j) is located can be calculated, and the depth of each effective layer of pixel (i, j) can then be determined from the indexes of all its effective layers, for use in subsequent encoding processing.
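The decoder-side depth recovery for this first PCS format can be sketched as follows. This is an illustrative sketch under two assumptions not fixed by the text: pixels map to grid cells by even division of the image, and layers are distributed with equal depth spacing.

```python
def effective_layer_depths(i, j, D_idx, DP, W, H, M, N, d_end, S):
    """Depths of the effective layers of pixel (i, j) under the first PCS format.

    D_idx: 1-based indexes of the pixel's effective layers.
    DP:    M x N table of per-cell sMPI starting depths (0-based here).
    d_end, S: common termination depth and layer count from the frame parameters.
    """
    # Locate the grid cell containing pixel (i, j) by even division.
    x = i * M // H          # grid row (0-based)
    y = j * N // W          # grid column (0-based)
    d_start = DP[x][y]
    step = (d_end - d_start) / (S - 1)   # equal depth spacing assumed
    # Map each 1-based effective-layer index to its depth.
    return [d_start + (k - 1) * step for k in D_idx]
```

The same structure carries over to the second and third formats, with the starting depth looked up via the per-pixel sMPI index I_{i,j} or read directly as E_{i,j}.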
  • a second PCS data format suitable for the PMPI of the embodiment of the present disclosure is proposed.
  • the frame data of a PMPI in the PCS data includes:
  • the image resolution of the PMPI is W ⁇ H
  • the number of divided sMPIs is M
  • the number of layers of each sMPI is S.
  • the PCS data format of the PMPI is as shown in Figure 12, and the frame data of a PMPI frame in the PMPI includes:
  • N_{i,j}: the number of effective layers of pixel (i, j), i ∈ [1, H], j ∈ [1, W];
  • I_{i,j}: the index of the sMPI where pixel (i, j) is located;
  • C_{i,j,k}: the color data (e.g., the color value) at the k-th effective layer of pixel (i, j);
  • D_{i,j,k}: the index of the k-th effective layer of pixel (i, j), D_{i,j,k} ∈ [1, S];
  • T_{i,j,k}: the transparency data (e.g., the transparency value) at the k-th effective layer of pixel (i, j).
  • Compared with the previous embodiment, in this embodiment, in addition to the starting depth of each sMPI in the PMPI frame, the index of the sMPI where pixel (i, j) is located is also written, so this format can be applied not only to three-dimensional scenes divided by a grid but also to three-dimensional scenes divided in a non-grid manner.
  • A third PCS data format suitable for the PMPI of the embodiment of the present disclosure is proposed.
  • the frame data of a PMPI frame in the PCS data includes the following parameters of each pixel in the PMPI:
  • the image resolution of the PMPI is W ⁇ H
  • the number of sMPIs included in the PMPI is M
  • the number of layers of each sMPI is S.
  • the PCS data format of the PMPI is as shown in Figure 13, and the frame data of a PMPI frame in the PMPI includes the following parameters:
  • N_{i,j}: the number of effective layers of pixel (i, j), i ∈ [1, H], j ∈ [1, W];
  • E_{i,j}: the starting depth of the sMPI where pixel (i, j) is located;
  • C_{i,j,k}: the color data (e.g., the color value) at the k-th effective layer of pixel (i, j);
  • D_{i,j,k}: the index of the k-th effective layer of pixel (i, j), D_{i,j,k} ∈ [1, S];
  • T_{i,j,k}: the transparency value at the k-th effective layer of pixel (i, j).
  • The starting-depth data of the plurality of sMPIs is expressed as the starting depth of the sMPI where each pixel in the PMPI is located; that is, the starting depth of the sMPI containing each pixel is written directly into the frame data of the PMPI frame. This makes it convenient to determine the depths of the effective layers of a pixel, but may affect coding efficiency.
  • the PCS data of PMPI is used to determine the depth of the effective layer of the pixel in the PMPI and the parameters of the color and transparency of the pixel on the effective layer.
  • other data formats can also be used; the present disclosure does not limit this.
  • the above PCS data format applicable to PMPI adds information about the initial depth of sMPI to the PCS data, so that the decoder can calculate the depth of the effective layer of pixels based on the initial depth of sMPI, thereby accurately recovering the image of PMPI.
  • FIG 14 is a structural diagram of an MPI encoding device that can be used in an embodiment of the present disclosure.
  • The input data of the MPI encoding device 10 is the PCS data of a source MPI (such as a PMPI). The PCS data includes, but is not limited to, image parameters (View parameters, also called view parameters, such as reference viewpoint camera parameters), data of the texture attribute component, and data of the transparency attribute component.
  • the MPI encoding device 10 includes:
  • the MPI mask generation (Create mask from MPI) unit 101 is configured to generate an MPI mask according to input data.
  • The pixels (also referred to as sampling points) in the MPI layers may be screened according to a transparency threshold to obtain a mask for each layer. This distinguishes, on each layer, positions (pixels) with high transparency from positions with low transparency, and masks out positions with high transparency to reduce the amount of data.
  • the MPI mask generation unit 101 performs the above-mentioned operations on all MPI frames within an intra-period.
  • the MPI frame size is W ⁇ H ⁇ S
  • the number of frames included in an intra-period is M.
  • the MPI mask aggregation (Aggregate MPI masks) unit 103 is configured to take a union of multiple masks located on the same layer among the M W ⁇ H ⁇ S masks to obtain a W ⁇ H ⁇ S mask.
  • Effective pixel clustering (Cluster Active pixels) unit 105 is configured to cluster the regions (effective information regions) whose transparency is greater than a threshold value in the mask of each layer into a series of clusters (cluster);
  • the cluster segmentation (Split Clusters) unit 107 is configured to divide the cluster obtained by the clustering of the effective pixel clustering unit 105, and obtain the cluster after the segmentation process;
  • the block packaging (Pack patches) unit 109 is configured to recombine the texture map and transparency map corresponding to each patch (patch, such as a rectangular area containing clusters) into a picture, and encode it as atlas data for transmission.
  • The video data generation (Generate video data) unit 111 is configured to generate and transmit video data according to the atlas data output by the block packaging unit 109; the video data includes texture attribute video data (raw), transparency attribute video data (raw), and the like.
  • the parameter encoding unit 113 is configured to encode according to the source MPI data to obtain encoded image parameters, and the encoded image parameters may include an image parameter list (View parameters list), a parameter set (Parameter set) and the like.
  • the sampling points in the MPI are first screened according to a transparency threshold to obtain a mask of each plane layer.
  • the size of MPI is W ⁇ H ⁇ S
  • the number of frames contained in a set period of time (intra-period) is M
  • the above operation is performed on all MPI frames within that period of time to obtain M masks of size W × H × S.
  • the masks on the same plane layer are combined to obtain a W ⁇ H ⁇ S mask.
  • The regions whose transparency is greater than the threshold value (effective information regions) in the mask of each layer are clustered into a series of clusters, and the clusters are turned into small patches through steps such as fusion and splitting.
  • the texture map (ie color map) and transparency map corresponding to each block patch are recombined into a picture, encoded as atlas data for transmission.
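The first two steps of the pipeline above (per-frame mask generation and cross-frame aggregation) can be sketched as follows. This is a minimal illustration using nested lists rather than real image buffers; the function names are our own.

```python
def layer_masks(alpha_frame, threshold=0):
    """Threshold one frame's transparency maps into a per-layer boolean mask.

    alpha_frame[s][i][j]: transparency of pixel (i, j) on layer s.
    True marks positions carrying valid (visible) information.
    """
    return [[[a > threshold for a in row] for row in layer] for layer in alpha_frame]

def aggregate_masks(frame_masks):
    """Union the masks of M frames layer by layer into one W x H x S mask."""
    agg = frame_masks[0]
    for m in frame_masks[1:]:
        agg = [[[p or q for p, q in zip(r1, r2)] for r1, r2 in zip(l1, l2)]
               for l1, l2 in zip(agg, m)]
    return agg
```

The aggregated mask is what the clustering step then segments into patches, so a position kept by any frame in the intra-period survives aggregation.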
  • An embodiment of the present disclosure provides a multi-plane image coding method, which can be used for PMPI coding. As shown in FIG. 15, the coding method includes:
  • Step 510 receiving PCS data of PMPI, said PMPI includes a plurality of sMPIs to respectively represent a plurality of scene areas divided into three-dimensional scenes, said PCS data includes image parameters, and data of texture attribute part and transparency attribute part;
  • Step 520 Encode the PMPI based on the PCS data to obtain encoded image parameters and atlas data.
  • The PCS data includes the starting depth information of the sMPIs, which may be encapsulated in the image parameters and/or the data of the texture attribute part and the transparency attribute part. During encoding, the starting depth information of the sMPIs is written into the code stream and may be encapsulated in the encoded image parameters and/or the atlas data.
  • The encoding of the PMPI based on the PCS data includes: performing encoding processing on the plurality of sMPIs included in the PMPI, where each sMPI can be encoded in the same manner as an ordinary MPI (that is, an MPI representing the entire 3D scene), and the encoding manner of an ordinary MPI may follow the provisions of the relevant standards.
  • the PCS data of the PMPI is converted from the original stored data of the PMPI according to the data processing method described in any embodiment of the present disclosure.
  • The image parameters in the PCS data and the encoded image parameters each include at least one of the following data: part or all of the frame parameters of the PMPI frame in the PCS data, and the starting depth of each sMPI in the PMPI frame;
  • the data of the texture attribute part and the transparency attribute part in the PCS data include part or all of the frame data of the PMPI frame in the PCS data, and the data of the texture attribute part include color data;
  • The atlas data includes data and parameters of blocks (patches) determined during encoding; the data includes color data and transparency data, and the parameters include one or any combination of the following: the identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, the identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and the identification information of the PMPI to which the data belongs.
  • the starting depth of the sMPI in the PMPI may be written into encoded image parameters and/or atlas data.
  • An embodiment of the present disclosure provides a multi-plane image decoding method, as shown in FIG. 16 , including:
  • Step 610 receiving the coded code stream of the block multi-plane image PMPI, the coded code stream includes image parameters and atlas data of the PMPI;
  • the image parameters and/or atlas data of the PMPI in the encoded code stream include the initial depth information of the sMPI.
  • Step 620 decoding the encoded code stream to obtain the image parameters of the PMPI, and the data of the texture attribute part and the transparency attribute part;
  • the PMPI includes a plurality of sub multiplane images sMPI to respectively represent a plurality of scene regions into which the three-dimensional scene is divided.
  • the image parameters of the PMPI include one or any combination of the following parameters:
  • the distribution rule of the layer uniformly set for sMPI in the PMPI frame
  • the termination depth uniformly set for sMPI in PMPI frames
  • The atlas data includes block data and parameters determined during encoding; the data includes color data and transparency data, and the parameters include one or any combination of the following: the identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, the identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and the identification information of the PMPI to which the data belongs.
  • An embodiment of the present disclosure also provides a code stream, wherein the code stream is generated by encoding a block multi-plane image PMPI, and the code stream includes image parameters and atlas data of the PMPI; the PMPI includes The multiple sub-multiplanar images sMPI are used to respectively represent the multiple scene regions into which the 3D scene is divided.
  • the image parameters and/or atlas data of the PMPI in the code stream include the initial depth information of the sMPI.
  • the image parameters of the PMPI include one or any combination of the following parameters:
  • the distribution rule of the layer uniformly set for sMPI in the PMPI frame
  • the termination depth uniformly set for sMPI in PMPI frames
  • The atlas data includes block data and parameters determined during encoding; the data includes color data and transparency data, and the parameters include one or any combination of the following: the identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, the identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and the identification information of the PMPI to which the data belongs.
  • An embodiment of the present disclosure also provides a device for generating a multi-plane image, as shown in FIG. 17 , including a processor 5 and a memory 6 storing a computer program operable on the processor 5, wherein the When the processor 5 executes the computer program, the method for generating a multi-plane image according to any embodiment of the present disclosure is realized.
  • An embodiment of the present disclosure also provides a multi-plane image data processing device, which can also be referred to FIG. 17, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the data processing method for a multi-plane image described in any embodiment of the present disclosure is implemented.
  • An embodiment of the present disclosure also provides a multi-plane image encoding device, which can also be referred to FIG. 17 , including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the implementation of the present disclosure A method for encoding multi-plane images described in any embodiment.
  • An embodiment of the present disclosure also provides a multi-plane image decoding device, which can also be referred to FIG. 17, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the decoding method for a multi-plane image described in any embodiment of the present disclosure is implemented.
  • An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the generation method, data processing method, encoding method, or decoding method for a multi-plane image described in any embodiment of the present disclosure is implemented.
  • the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit.
  • Computer-readable media may include computer-readable storage media that correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, eg, according to a communication protocol. In this manner, a computer-readable medium may generally correspond to a non-transitory tangible computer-readable storage medium or a communication medium such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure.
  • a computer program product may comprise a computer readable medium.
  • Such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk or other magnetic storage, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • Any connection may also properly be termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies are included in the definition of medium.
  • Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • The techniques may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuits.
  • DSPs digital signal processors
  • ASICs application specific integrated circuits
  • FPGAs field programmable logic arrays
  • processors may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein.
  • the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec.
  • the techniques may be fully implemented in one or more circuits or logic elements.
  • The technical solutions of the embodiments of the present disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
  • IC integrated circuit
  • Various components, modules, or units are described in the disclosed embodiments to emphasize functional aspects of devices configured to perform the described techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (comprising one or more processors as described above) in combination with suitable software and/or firmware.


Abstract

The present disclosure provides generation, data processing, encoding and decoding methods and apparatuses for multi-plane images. The methods use a patch multiplane image (PMPI) to represent a three-dimensional scene. The PMPI includes a plurality of sub multiplane images (sMPIs) that respectively represent a plurality of scene areas into which the three-dimensional scene is divided; the starting depths of the sMPIs are determined according to the depth information of the three-dimensional scene areas they represent, and each sMPI includes a plurality of layers obtained by sampling the represented scene area at different depths. The methods and apparatuses of the present disclosure can improve sampling efficiency and video resolution.

Description

Generation, data processing, encoding and decoding methods and apparatuses for multi-plane images

Technical Field
The embodiments of the present disclosure relate to, but are not limited to, image processing technology, and more specifically, to generation, data processing, encoding and decoding methods and apparatuses for multi-plane images.
Background
A multiplane image (MPI: Multiplane image) is a redundancy-free way of representing a scene. In a spatial coordinate system whose origin is a given reference viewpoint, an MPI decomposes the scene into a series of layers, which are planar layers or spherical layers. Taking an MPI composed of planar layers as an example, as shown in Figure 1, multiple planar layers are fronto-parallel to the reference viewpoint and located at different depths. The depth range [dmin, dmax] of the MPI needs to be set in advance according to the depth-of-field data of the scene, where dmin is the minimum depth, i.e., the distance from the layer closest to the reference viewpoint to the reference viewpoint, and dmax is the maximum depth, i.e., the distance from the layer farthest from the reference viewpoint to the reference viewpoint. Each layer of the MPI is divided into two parts: a color frame and a transparency frame. The color frame and transparency frame of a layer contain, respectively, the texture information and transparency information of the scene at the position of that planar layer. MPI can be used for immersive video, but the effect still needs improvement.
Summary of the Invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the protection scope of the claims.
An embodiment of the present disclosure provides a method for generating a multi-plane image, including:
dividing a three-dimensional scene into a plurality of scene areas; and
generating a patch multiplane image (PMPI), where the PMPI includes a plurality of sub multiplane images (sMPIs) used to respectively represent the plurality of scene areas, and the starting depth of an sMPI is determined at least according to the depth information of the scene area represented by the sMPI.
An embodiment of the present disclosure further provides a data processing method for a multi-plane image, including:
acquiring original storage data of a patch multiplane image (PMPI), where the PMPI includes a plurality of sub multiplane images (sMPIs) that respectively represent a plurality of scene areas into which a three-dimensional scene is divided; and
converting the original storage data of the PMPI into encapsulated compressed storage (PCS) data, where the PCS data is used to determine the depths of the effective layers of the pixels in the PMPI and the colors and transparencies of the pixels on the effective layers.
An embodiment of the present disclosure further provides an encoding method for a multi-plane image, including:
receiving encapsulated compressed storage (PCS) data of a patch multiplane image (PMPI), where the PMPI includes a plurality of sub multiplane images (sMPIs) that respectively represent a plurality of scene areas into which a three-dimensional scene is divided, and the PCS data includes image parameters and data of a texture attribute part and a transparency attribute part; and
encoding the PMPI based on the PCS data to obtain encoded image parameters and atlas data.
An embodiment of the present disclosure further provides a decoding method for a multi-plane image, including:
decoding an encoded code stream of a patch multiplane image (PMPI) to acquire image parameters of the PMPI and data of a texture attribute part and a transparency attribute part;
where the PMPI includes a plurality of sub multiplane images (sMPIs) that respectively represent a plurality of scene areas into which a three-dimensional scene is divided, and the encoded code stream includes the image parameters and atlas data of the PMPI.
An embodiment of the present disclosure further provides a code stream, where the code stream is generated by encoding a patch multiplane image (PMPI), and the code stream includes image parameters and atlas data of the PMPI; the PMPI includes a plurality of sub multiplane images (sMPIs) that respectively represent a plurality of scene areas into which a three-dimensional scene is divided.
An embodiment of the present disclosure further provides an apparatus for generating a multi-plane image, including a processor and a memory storing a computer program, where, when the processor executes the computer program, the method for generating a multi-plane image described in any embodiment of the present disclosure is implemented.
An embodiment of the present disclosure further provides a data processing apparatus for a multi-plane image, including a processor and a memory storing a computer program, where, when the processor executes the computer program, the data processing method for a multi-plane image described in any embodiment of the present disclosure is implemented.
An embodiment of the present disclosure further provides an encoding apparatus for a multi-plane image, including a processor and a memory storing a computer program, where, when the processor executes the computer program, the encoding method for a multi-plane image described in any embodiment of the present disclosure is implemented.
An embodiment of the present disclosure further provides a decoding apparatus for a multi-plane image, including a processor and a memory storing a computer program, where, when the processor executes the computer program, the decoding method for a multi-plane image described in any embodiment of the present disclosure is implemented.
An embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing a computer program, where, when the computer program is executed by a processor, the generation method, data processing method, decoding method, or encoding method for a multi-plane image described in any embodiment of the present disclosure is implemented.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
The drawings are provided for an understanding of the embodiments of the present disclosure, constitute a part of the specification, serve to explain the technical solutions of the present disclosure together with the embodiments, and do not constitute a limitation on the technical solutions of the present disclosure.
Figure 1 is a schematic structural diagram of an exemplary MPI composed of 4 planar layers;
Figures 2A to 2F are schematic diagrams of 6 consecutive planar layers in an exemplary MPI, showing the color map and transparency map of each planar layer;
Figure 3 is a schematic diagram of representing a three-dimensional scene with an ordinary MPI;
Figure 4 is a schematic diagram of representing a three-dimensional scene with a PMPI according to an embodiment of the present disclosure;
Figure 5 is a flowchart of a PMPI generation method according to an embodiment of the present disclosure;
Figure 6 is a schematic diagram of determining the starting depth of an sMPI by pooling according to an embodiment of the present disclosure;
Figure 7 is a schematic diagram of an exemplary PMPI generated using an embodiment of the present disclosure;
Figure 8 is a schematic diagram of a video compression processing procedure;
Figure 9 is a schematic diagram of PCS data converted from MPI original storage data;
Figure 10 is a flowchart of a PMPI data processing method according to an embodiment of the present disclosure;
Figure 11 is a schematic diagram of one type of PCS data converted from PMPI original storage data according to an embodiment of the present disclosure;
Figure 12 is a schematic diagram of another type of PCS data converted from PMPI original storage data according to an embodiment of the present disclosure;
Figure 13 is a schematic diagram of yet another type of PCS data converted from PMPI original storage data according to an embodiment of the present disclosure;
Figure 14 is a schematic structural diagram of a PMPI encoding apparatus according to an embodiment of the present disclosure;
Figure 15 is a flowchart of a PMPI encoding method according to an embodiment of the present disclosure;
Figure 16 is a flowchart of a PMPI decoding method according to an embodiment of the present disclosure;
Figure 17 is a schematic diagram of a PMPI generation apparatus according to an embodiment of the present disclosure.
Detailed Description
The present disclosure describes a number of embodiments, but the description is exemplary rather than limiting, and it will be apparent to those of ordinary skill in the art that there can be more embodiments and implementations within the scope covered by the embodiments described in the present disclosure.
In the description of the present disclosure, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment described as "exemplary" or "for example" in the present disclosure should not be construed as preferable or advantageous over other embodiments. "And/or" herein describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A alone, both A and B, and B alone. "A plurality of" means two or more. In addition, in order to clearly describe the technical solutions of the embodiments of the present disclosure, words such as "first" and "second" are used to distinguish identical or similar items with substantially the same functions and effects. Those skilled in the art can understand that words such as "first" and "second" do not limit the quantity or execution order and do not necessarily indicate a difference.
In describing representative exemplary embodiments, the specification may have presented a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not depend on the particular order of the steps described herein, the method or process should not be limited to the particular sequence of steps described. As one of ordinary skill in the art will understand, other step orders are possible. Therefore, the particular order of the steps set forth in the specification should not be construed as a limitation on the claims. In addition, claims directed to the method and/or process should not be limited to performing their steps in the order written; those skilled in the art can readily understand that these orders may vary while still remaining within the spirit and scope of the embodiments of the present disclosure.
A multiplane image (MPI) is a redundancy-free layered representation of a three-dimensional scene. The three-dimensional scene is decomposed into a set of planar or spherical layers, sampled at different depths from a given reference point. Each layer is obtained by projecting the part of the 3D scene contained around the layer's position onto the same reference camera. This reference camera is located at the given reference viewpoint. When planar layers are used, the reference camera is a perspective camera; when spherical layers are used, the reference camera is a spherical (usually equirectangular) camera.
Referring to Figure 1, in a spatial coordinate system whose origin is a given reference viewpoint (e.g., a reference camera), the MPI decomposes the scene into a series of planar or spherical layers. Taking an MPI composed of planar layers as an example, the planar layers are fronto-parallel to the reference viewpoint and located at different depths. The depth range [dmin, dmax] of the planar layers needs to be set in advance according to the depth range of the actual scene. Assuming the MPI includes S planar layers and each planar layer has size W × H, the size of the MPI can be expressed as W × H × S, where W is the number of pixels in the width direction of the MPI and H is the number of pixels in the height direction; the MPI contains W × H pixels, and the planar image resolution is W × H. The exemplary MPI shown in Figure 1 includes 4 layers, but the number of planar or spherical layers in an MPI may also be 2, 3, 5 or more, such as 100 or 200. Each layer of the MPI includes a color map and a transparency map, which record the color and transparency of each pixel on that layer; a pixel may have different colors and transparencies on different layers.
In a real scene, most spatial regions usually contain no visible surface, i.e., they are invalid regions. In the MPI this is directly reflected in the fact that most regions of the color maps and transparency maps of the MPI layers hold invalid values, i.e., contain no visible information. Figures 2A to 2F show six consecutive planar layers of an MPI, from the 40th planar layer to the 45th planar layer, where Figure 2A-1 shows the color map of the 40th planar layer, Figure 2A-2 shows the transparency map of the 40th planar layer, Figure 2B-1 shows the color map of the 41st planar layer, Figure 2B-2 shows the transparency map of the 41st planar layer, and so on for the other figures. The black portions of the transparency maps are invalid regions.
An MPI is a layered representation of a three-dimensional scene, that is, a sampling of the three-dimensional scene, and the points on the MPI planar layers are sampling points. As can be seen from the examples of Figures 2A to 2F, most of the sampling points of this MPI are located at invalid positions of the three-dimensional scene, where there is no visible surface and the transparency is 0; only a small portion of the sampling points are located in valid regions of the three-dimensional scene, where visible surfaces exist and the transparency is nonzero. MPI can be used for immersive video. From the perspective of the immersive experience, it is the valid regions of the three-dimensional scene that play the decisive role, while most of the sampling points of the MPI are wasted, resulting in low sampling efficiency and a low resolution of the finally presented immersive video.
MPI的深度范围[dmin,dmax]根据场景的全局深度而设定,深度范围足以囊括场景大部分有效信息即可,其中,离参考视点最近的层的深度称为MPI的起始深度dmin,离参考视点最远的层的深度可以称为MPI的终止深度dmax。以图3所示的简单场景为例,图中的平行线用于指示MPI中各层在三维场景内的深度位置。由于该场景中有1个几何体离其他几何体较远,MPI为了表征该场景的主要信息(四个几何体),必须采用较大的深度范围,由此得到的平面层(图中以4个为例)较为稀疏。对于位于远景区域的三个几何体而言,有效信息只会出现在MPI中深度较大的两个平面层上。MPI的采样效率较低。
为了解决MPI采样效率低的问题,本公开实施例提出一种具有深度自适应变化特性的MPI,为了区别于图1、图3所示的MPI,文中将图1、图3所示的MPI称为普通MPI,也即表征整个三维场景的MPI,将本公开实施例提出的MPI称为分块多平面图像(PMPI:Patch multiplane image)。本公开实施例的PMPI是这样的一种MPI:包括多个子多平面图像(sMPI:sub multiplane image)以分别表征三维场景划分成的多个场景区域,每一sMPI包括在该sMPI所表征场景区域的不同深度采样得到的多个层,每一sMPI的起始深度至少根据该sMPI所表征场景区域的深度信息确定。PMPI可以视为对普通MPI的扩展,普通MPI的基本单元是大小相同的多个层,用于表征一个完整的三维场景,而PMPI用多个sMPI分别表征对一个三维场景划分成的多个场景区域,每一个场景区域可以视为三维场景的一个块,而每一sMPI也可以视为PMPI的一个块,因此PMPI是三维场景的一种分层分块表征方式。
单个sMPI也是一种MPI,只是其表征的是三维场景划分成的场景区域,而场景区域也可视为一种三维场景,只是大小和形状与原始的三维场景不同。sMPI表征场景区域的方式仍可以采用普通MPI表征三维场景的方式,如一个sMPI包括在场景区域的不同深度采样的多个层,多个层的大小和形状相同,每一层包括一个颜色图和一个透明度图,多个层之间可以按设定规则(如等间距或等视距)分布,等等。与MPI类似,sMPI的深度范围也遵循包括场景区域中大部分有效信息的原则而设置。
PMPI中多个sMPI的终止深度可以设置为相同,而起始深度可以根据所表征场景区域的深度信息设置为不同,从而引入场景的深度信息,增加针对场景深度的自适应能力,使更多的采样点放置在场景的有效位置上。
图4所示的三维场景和参考视点与图3相同。采用本公开实施例的PMPI表征该三维场景时,假定将场景区域的个数设置为2,用一竖向平面将三维场景划分为两个场景区域,其中一个场景区域包括位于参考视点左侧的两个几何体,以下将该场景区域称为第一场景区域,第一场景区域中的一个几何体离参考视点最近,另一个几何体离参考视点较远。另一个场景区域包括位于参考视点右侧的两个几何体,以下将该场景区域称为第二场景区域,第二场景区域中的两个几何体均离参考视点较远。
在PMPI中,上述第一场景区域和第二场景区域分别用一个sMPI表征,每一sMPI包含4个平面层,因为第一场景区域中两个几何体的深度差异较大,用于表征该第一场景区域的sMPI的场景深度范围需要设置得较大。第二场景区域中两个几何体的深度差异较小且靠近后方,用于表征该第二场景区域的sMPI的深度范围可以设置得较小。在不同sMPI的终止深度设置为相同时,用于表征第一场景区域的sMPI的起始深度较小,而用于表征第二场景区域的sMPI的起始深度较大。由此得到的三维场景的PMPI表征如图4所示。从该图可以看出,用于表征第二场景区域的sMPI的4层之间变得密集且均位于几何体附近,因此与图3的MPI相比,PMPI有更多的采样点位于有效位置。也就是说,用分层分块的PMPI表征该场景,相对普通MPI的分层表征方式,采样效率更高。需要说明的是,图4所示的示例对三维场景的划分方式仅仅是示意性的,是为了用一个简单示例说明PMPI与普通MPI的不同。
本公开一实施例提供了一种多平面图像的生成方法,如图5所示,包括:
步骤310,将三维场景划分成多个场景区域;
步骤320,生成分块多平面图像PMPI,所述PMPI包括用于分别表征多个所述场景区域的多个子多平面图像sMPI,所述sMPI的起始深度至少根据所述sMPI所表征场景区域的深度信息确定。
在本公开一示例性的实施例中,所述sMPI的起始深度至少根据所述sMPI所表征场景区域的深度信息确定,包括:所述sMPI的起始深度根据第一区域的最小深度确定,所述第一区域为所述sMPI所表征场景区域,或者所述第一区域为所述sMPI所表征场景区域和所述sMPI所表征场景区域的相邻区域所共同组成的区域。其中,所述sMPI所表征场景区域的相邻区域可以包括所述三维场景中位于所述sMPI所表征场景区域周边的一个或多个场景区域,或者,相邻区域包括所述三维场景中位于所述sMPI所表征场景区域周边的多行像素和/或多列像素组成的区域,不要求是完整的场景区域。
本实施例的一个示例中,根据第一区域的最小深度确定所述sMPI的起始深度时,将所述sMPI的起始深度设置为所述第一区域的最小深度。在本实施例的另一示例中,根据第一区域的最小深度确定所述sMPI的起始深度时,也可以在第一区域的最小深度的基础上,取一个比该最小深度略小的深度作为所述sMPI的起始深度,例如用该最小深度的值减去一个设定的值,或者用该最小深度的值减去该值乘以一设定比例得到的值,使得确定的所述sMPI的起始深度具有一定的裕量。上述最小深度的取值可以是三维场景的深度图中所述sMPI所表征场景区域的最小深度值,也可以对深度图中的该最小深度值进行四舍五入、归一化等变换以便于编码。三维场景中的深度值可以用灰度值表示。
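上述留裕量的计算可以用如下示意代码表达(其中margin为假设的设定值、ratio为假设的设定比例,函数名亦为示意):

```python
def start_depth(min_depth, margin=0.0, ratio=0.0):
    """在第一区域最小深度的基础上预留裕量,得到sMPI的起始深度(示意)。
    margin:减去的设定值;ratio:按最小深度乘以设定比例再减去。"""
    return min_depth - margin - min_depth * ratio

# 两种留裕量方式的示例
print(start_depth(10.0, margin=0.5))  # 9.5
print(start_depth(10.0, ratio=0.1))   # 9.0
```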
本实施例的一个示例中,第一区域为所述sMPI所表征场景区域,所述sMPI的起始深度是根据所述sMPI所表征场景区域的最小深度确定。考虑到三维场景深度图中的深度值可能会存在偏差,所以在本实施例的另一个示例中,根据所述sMPI所表征场景区域和所述sMPI所表征场景区域的相邻区域所共同组成的区域的最小深度确定所述sMPI的起始深度,此时确定的起始深度总是小于或等于仅仅根据所述sMPI所表征场景区域的最小深度确定的起始深度,能够尽量避免上述偏差带来的影响,使得sMPI能够完整采样到所表征场景区域中的有效区域。
在该另一示例中,实际运算时可以采用池化的方式来计算出每一个sMPI的起始深度。如图6所示,假定将一个三维场景用6×6的网格划分为36个场景区域,图6左侧的实线区域代表三维场景的原始深度图,该深度图也被划分成所述36个场景区域,图中每一个网格代表一个场景区域。每一个场景区域的最小深度均可以根据深度图中该场景区域的深度信息确定。本示例池化时的池化尺寸为5×5,池化步长为1。为了使池化后的深度图的网格数仍然是6×6,以原始深度图为中心进行扩展。如图6所示,扩展后的深度图包括10×10个网格,扩展出的网格用虚线表示,扩展出的每一网格的最小深度复制为距离该网格最近的原始深度图中的网格的最小深度(两个网格之间的距离可取为两个网格中心的连线的长度)。对网格的最小深度执行最小池化操作,得到图6右侧的池化后的深度图,该深度图中每一网格的最小深度等于原始深度图中以该网格为中心的5×5个网格的最小深度,将该最小深度设置为表征该网格(即该场景区域)的sMPI的起始深度。即实现了根据所述sMPI所表征场景区域和所述sMPI所表征场景区域的相邻区域所共同组成的区域的最小深度确定所述sMPI的起始深度的运算。
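上述最小池化运算可以用如下Python代码示意(池化尺寸5×5、步长1,边缘按复制最近网格的最小深度的方式扩展;实现细节为示意,并非标准规定):

```python
import numpy as np

def smpi_start_depths(min_depth_grid, pool=5):
    """对每个网格的最小深度做最小池化,得到各sMPI的起始深度(示意)。
    min_depth_grid: M×N数组,每个元素是对应场景区域的最小深度。"""
    pad = pool // 2
    # 以复制边缘网格最小深度的方式向外扩展(对应图6中虚线网格)
    padded = np.pad(min_depth_grid, pad, mode="edge")
    M, N = min_depth_grid.shape
    out = np.empty_like(min_depth_grid)
    for x in range(M):
        for y in range(N):
            # 以网格(x,y)为中心的pool×pool窗口的最小值
            out[x, y] = padded[x:x + pool, y:y + pool].min()
    return out

grid = np.arange(36, dtype=float).reshape(6, 6)  # 假设的6×6最小深度图
starts = smpi_start_depths(grid)
print(starts[0, 0])  # 0.0
```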
在本公开一示例性的实施例中,所述sMPI包括在所表征场景区域的不同深度采样得到的多个层,所述多个层均包括颜色图和透明度图。
本实施例中,采用以下步骤来生成PMPI:
步骤一,确定划分成的多个所述sMPI的起始深度和终止深度,其中,多个所述sMPI的起始深度至少根据各自所表征场景区域的深度信息分别确定,多个所述sMPI的终止深度设置为相同;
可以将PMPI视为普通MPI,按照普通MPI的终止深度的设置方式为PMPI中的多个sMPI共同设置一个终止深度。
步骤二,对多个所述sMPI中的每一sMPI,根据该sMPI的起始深度和终止深度,以及该sMPI的层数和层的分布规则确定该sMPI包括的每一层的深度,在该sMPI所表征场景区域的所述每一层的深度处采样,得到该sMPI包括的每一层的颜色图和透明度图。
本实施例中,多个所述sMPI与多个所述场景区域一一对应,多个所述sMPI的层数和层的分布规则设置为相同,以简化处理,提高编码效率。所述层的分布规则如可以是等间距分布或等视距分布。但本公开不局限于此,在其他实施例中,PMPI中多个sMPI的层数和层的分布规则也可以不同,此时会增加一些编码复杂度,但可以更为灵活地表征三维场景。
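对于等间距与等视距(此处按视差即深度倒数均匀分布理解,属假设)两种分布规则,由起始深度、终止深度和层数计算各层深度的过程可示意如下:

```python
def layer_depths(d_start, d_end, num_layers, rule="等间距"):
    """根据sMPI的起始深度、终止深度、层数和分布规则计算每一层的深度(示意)。"""
    if rule == "等间距":  # 在深度上均匀分布
        step = (d_end - d_start) / (num_layers - 1)
        return [d_start + k * step for k in range(num_layers)]
    elif rule == "等视距":  # 假设为在视差(1/深度)上均匀分布
        inv = [1 / d_start + k * (1 / d_end - 1 / d_start) / (num_layers - 1)
               for k in range(num_layers)]
        return [1 / v for v in inv]
    raise ValueError("未知的分布规则")

print(layer_depths(1.0, 4.0, 4, "等间距"))  # [1.0, 2.0, 3.0, 4.0]
```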
本实施例中,所述多个层可以为平面层或球面层。
在本公开一示例性的实施例中,所述将三维场景划分成多个场景区域,包括:根据预设的场景划分规则将三维场景划分成多个场景区域,其中,根据所述场景划分规则可以确定划分成的多个场景区域的以下一种或任意组合信息:场景区域的个数、场景区域的形状、场景区域的大小、场景区域的位置;其中,多个所述场景区域的大小相同或不同,多个所述场景区域的形状为规则形状或不规则形状中的一种或者组合,所述规则形状包括三角形、矩形、五边形、六边形中的一种或任意组合。
在本实施例的一个示例中,所述根据预设的场景划分规则将所述三维场景划分成多个场景区域,包括:使用M×N的网格将所述三维场景划分为M×N个场景区域,M,N为正整数,且M×N≥2。该示例中,划分出的M×N个场景区域为大小相同的矩形区域。这种划分方式根据像素的坐标即容易确定PMPI中每一像素所在的场景区域(如查表或者通过简单的公式计算),可以不需要对像素所在的场景区域或者所属的sMPI加以额外标识。但本公开并不局限于此种划分方式。
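采用M×N网格划分时,根据像素坐标确定其所在场景区域(即所在sMPI)的简单公式计算可示意如下(行列与M、N两个维度的对应方式为假设):

```python
def smpi_of_pixel(i, j, H, W, M, N):
    """返回像素(i,j)所在网格的坐标(x,y),从0计(映射方式为假设)。
    H×W为图像分辨率,M×N为网格划分数。"""
    x = i * M // H  # 网格行号,取值0..M-1
    y = j * N // W  # 网格列号,取值0..N-1
    return x, y

print(smpi_of_pixel(0, 0, 1080, 1920, 6, 6))        # (0, 0)
print(smpi_of_pixel(1079, 1919, 1080, 1920, 6, 6))  # (5, 5)
```

由此无需对像素额外标识其所属sMPI,查表或上述整数运算即可得到。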
在本公开一示例性的实施例中,所述生成的PMPI的原始存储数据包括PMPI帧的帧参数和帧数据;
其中,所述原始存储数据中的PMPI帧的帧参数包括以下参数中的一种或任意组合:
PMPI帧的分辨率;
PMPI帧中sMPI的个数;
为PMPI帧中sMPI统一设置的层数;
为PMPI帧中sMPI统一设置的层的分布规则;
为PMPI帧中sMPI统一设置的终止深度;
使用M×N网格划分三维场景时M,N的取值;
其中,所述原始存储数据中的每一PMPI帧的帧数据包括:该PMPI帧包括的每一sMPI中每一层的颜色图数据和透明度图数据。
相比于普通MPI,本公开实施例生成的PMPI的起始深度更加灵活多变,可以在场景不同区域的景深变化时自适应地变化。由此产生的结果是PMPI的采样点聚集于场景的可见表面,采样效率得到提升。在PMPI中sMPI的层数与普通MPI的层数相同的情况下,PMPI的平面层分布在总体上更为密集,效果上相当于提供了更多层数的普通MPI,但采样点数没有增加。可参见图7,更密集的深度层使得根据PMPI生成的最终的沉浸视频的细节保留更多,质量更好。
MPI经过视频压缩后,可以展示为沉浸视频。图8展示了相应的视频处理过程。在编码端,视频采集装置采集的三维场景图像(如3D相机拍摄的图像)经过预处理得到MPI,MPI经压缩编码后作为码流传输。在解码端,对码流解码和后处理,以沉浸视频形式显示和播放。
在动态图像专家组(MPEG:Moving Picture Experts Group)关于沉浸视频的标准MPEG-I中,MPI的颜色图和透明度图等的原始存储(Raw storage)数据经过压缩之后得到的封装压缩存储(PCS)数据以及参考视点相机参数等图像数据可以作为MPEG中的沉浸式视频测试模型(TMIV:Test model of immersive video)的输入。MPI在输入TMIV之前,需要对其进行预处理,将其转换为PCS数据。
以图像分辨率为W×H,层数为S的MPI,即尺寸为W×H×S的MPI帧为例,可以转化为PCS数据形式,PCS数据中记录了MPI中每一像素的相关参数,包括:
N_{i,j}:像素(i,j)的有效层的个数;
C_{i,j,k}:像素(i,j)的第k个有效层位置处的颜色数据,如颜色值;
D_{i,j,k}:像素(i,j)的第k个有效层的索引(index),D_{i,j,k}∈[1,S];
T_{i,j,k}:像素(i,j)的第k个有效层位置处的透明度值。
对于普通MPI而言,像素(i,j)包含在MPI的S个层中,像素(i,j)的透明度值不为0的层为像素(i,j)的有效层。
图9所示是按照上述参数封装的MPI的PCS数据的一个示例。
可以看出,尺寸为W×H×S的MPI的原始存储数据并不会完全保留到PCS数据中。在实际情况中,一个像素(pixel)在一些平面层的值是无效的(即该像素是完全透明的,没有有效信息)。因而对于像素(i,j),只需要将像素(i,j)在S个平面层的N i,j个有效平面层的信息保留下来即可。值得注意的是,每一个像素对应的有效平面层数是不确定的。显然,压缩后的PCS数据减少了MPI占用的存储空间。除此之外,PCS数据的存储方式减少了后续解码过程中的存储器访问次数。已知MPI每个平面的尺寸W×H,只需要两次存储器访问操作即可将整个MPI帧读取到内存中。
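将原始存储数据转换为仅保留有效层信息的PCS数据的过程,可以用如下示意代码表达(数据组织方式为示意,透明度阈值此处取0,层索引按从1开始计):

```python
def mpi_to_pcs(color, alpha, threshold=0.0):
    """将尺寸为S×H×W的MPI原始数据转为逐像素的有效层列表(示意PCS结构)。
    color/alpha为嵌套列表,alpha[k][i][j]是像素(i,j)在第k层的透明度。"""
    S, H, W = len(alpha), len(alpha[0]), len(alpha[0][0])
    pcs = []
    for i in range(H):
        for j in range(W):
            # 只保留透明度大于阈值的有效层:(层索引D∈[1,S], 颜色C, 透明度T)
            entries = [(k + 1, color[k][i][j], alpha[k][i][j])
                       for k in range(S) if alpha[k][i][j] > threshold]
            pcs.append({"N": len(entries), "layers": entries})
    return pcs

# 1×1图像、2层的小示例:第1层完全透明,仅第2层有效
alpha_demo = [[[0.0]], [[0.5]]]
color_demo = [[[10]], [[20]]]
pcs = mpi_to_pcs(color_demo, alpha_demo)
print(pcs[0])  # {'N': 1, 'layers': [(2, 20, 0.5)]}
```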
对于普通MPI而言,在设定时间内的多个MPI帧的起始深度和终止深度是相同的,且MPI中多个层的分布规则是已知的。根据起始深度和终止深度就可以计算得到MPI中每一层的深度。MPI帧的起始深度和终止深度可以记录在MPI帧的帧参数中,不需要写入到单个MPI帧的PCS数据中,因此单个MPI的PCS数据中并不需要额外地记录像素的有效层的深度信息。
PMPI与普通MPI一样,也可以作为视频帧进行压缩编码,PMPI可以根据三维场景的图像直接生成,也可以在普通MPI的基础上生成。对PMPI编码之前,也需要将PMPI的原始存储数据转换为PCS数据。
如上文可知,PMPI包括多个sMPI,每一sMPI包括的多个层中,离参考视点最近的一个层的深度为该sMPI的起始深度,离参考视点最远的一个层的深度为该sMPI的终止深度,其他层的深度介于该sMPI的起始深度和终止深度之间,多个层也可以按照设定的规则例如等间距或等视距分布。因此在获知sMPI的起始深度和终止深度后,可以计算出sMPI每一层的深度。PMPI中不同sMPI的终止深度设置为相同,但不同sMPI的起始深度与其各自表征的场景区域的深度信息有关,并不是预先设定和固定不变的。因此将本公开实施例PMPI的原始存储数据转化为PCS数据时,需要提供起始深度信息以使解码端准确计算出像素的有效层的深度。
为此,本公开一实施例提供了一种多平面图像的数据处理方法,如图10所示,包括:
步骤410,获取分块多平面图像PMPI的原始存储数据,所述PMPI包括多个子多平面图像sMPI以分别表征三维场景划分成的多个场景区域;
步骤420,将所述PMPI的原始存储数据转换为封装压缩存储PCS数据,所述PCS数据用于确定所述PMPI中像素的有效层的深度及像素在有效层上的颜色和透明度。
本公开一示例性的实施例中,所述PMPI采用如本公开任一实施例所述的生成方法生成,所述PMPI中的每一像素包含在一个sMPI中,且该sMPI包括的多个层均记录有该像素的颜色值和透明度值。普通MPI中的像素包含在普通MPI的所有层中,而PMPI是分块的,因此PMPI中的像素包含在一个sMPI的所有层中,文中将包含该像素的sMPI称为该像素所在的sMPI。在像素所在sMPI的所有层中均记录有该像素的颜色值和透明度值,但这些层中可能只有部分是该像素的有效层。在MPI的相关视频标准(如MPEG的沉浸式视频的相关标准)中,MPI中像素的有效层可以是MPI中该像素的透明度大于设定阈值(如0)的层。本公开实施例PMPI中像素的有效层可以遵循上述标准中的规定,例如,PMPI中像素的有效层指PMPI中包含该像素的sMPI中该像素的透明度大于设定阈值(如0)的层。一个像素的有效层可以有一层或多层,根据实际场景而定。
本公开一示例性的实施例中,所述PCS数据包括PMPI帧的帧数据和帧参数。
在本实施例的一个示例中,所述PCS数据中一个PMPI帧的帧数据包括:
该PMPI帧中每一sMPI的起始深度;及
该PMPI帧中每一像素的以下数据:该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引。
在本实施例的另一示例中,所述PCS数据中一个PMPI帧的帧数据包括:
该PMPI帧中每一sMPI的起始深度;及
该PMPI帧中每一像素的以下数据:该像素所在sMPI的索引,该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引。
在本实施例的又一示例中,所述PCS数据中一个PMPI帧的帧数据包括该PMPI帧中每一像素的以下数据:该像素所在sMPI的起始深度,以及该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引。
在上述三个示例中,所述PCS数据中一个PMPI帧的帧数据均可以增加一个参数,即该PMPI帧中每一像素的有效层的个数。增加该参数有利于提高数据编码和解析的效率。
在本实施例的一个示例中,所述PCS数据还包括PMPI帧的帧参数,所述PCS数据中的PMPI帧的帧参数包括以下参数中的一种或任意组合:
PMPI帧的分辨率;
PMPI帧中sMPI的个数;
为PMPI帧中sMPI统一设置的层数;
为PMPI帧中sMPI统一设置的层的分布规则;
为PMPI帧中sMPI统一设置的终止深度;
使用M×N网格划分三维场景时M,N的取值。
本示例PMPI帧的帧参数可以适用于图11、图12和图13所示的实施例,以下不再重复说明。
本公开一示例性的实施例中,提出了第一种适用于本公开实施例PMPI的PCS数据格式,所述PCS数据中一个PMPI的帧数据包括:
所述PMPI中每一sMPI的起始深度;及
所述PMPI中每一像素的以下参数:
该像素的有效层的个数;
该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引。
在本实施例中,假定PMPI的图像分辨率为W×H,采用M×N网格划分,PMPI包括的sMPI的个数为M×N,每一sMPI的层数均为S。则该PMPI的PCS数据格式如图11所示,一个PMPI帧的帧数据包括:
DP_{x,y}:用于表征网格(x,y)所代表场景区域的sMPI的起始深度,x∈[1,M],y∈[1,N];
N_{i,j}:像素(i,j)的有效层的个数,i∈[1,H],j∈[1,W];
C_{i,j,k}:像素(i,j)的第k个有效层位置处的颜色数据,如颜色值;
D_{i,j,k}:像素(i,j)的第k个有效层的索引(index),D_{i,j,k}∈[1,S];
T_{i,j,k}:像素(i,j)的第k个有效层位置处的透明度数据,如透明度值。
本实施例在PMPI帧的帧数据中写入了PMPI帧中每一个sMPI的起始深度,而像素(i,j)所在的sMPI可以根据i,j和划分规则确定,结合帧参数中的sMPI的终止深度、层数和层的分布规则,可以计算出像素(i,j)所在sMPI的每一层的深度,再根据像素(i,j)的所有有效层的索引,就可以确定出像素(i,j)的每一个有效层的深度,用于后续的编码处理。
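解码端由上述帧数据恢复像素有效层深度的计算可示意如下(假设层在深度上等间距分布、像素按行优先的整数运算映射到网格,均为示意性假设):

```python
def pixel_layer_depth(i, j, D_ijk, DP, d_end, S, H, W, M, N):
    """按图11格式的数据恢复像素(i,j)第k个有效层的深度(示意)。
    D_ijk: 该有效层在所在sMPI中的索引(∈[1,S]);
    DP: M×N嵌套列表,DP[x][y]为网格(x,y)对应sMPI的起始深度;
    d_end: 统一设置的终止深度;S: 统一设置的层数。"""
    x, y = i * M // H, j * N // W  # 像素所在网格(映射方式为假设)
    dp = DP[x][y]                  # 该网格对应sMPI的起始深度
    # 等间距分布时第D_ijk层的深度
    return dp + (D_ijk - 1) * (d_end - dp) / (S - 1)

print(pixel_layer_depth(0, 0, 3, [[2.0]], 10.0, 5, 4, 4, 1, 1))  # 6.0
```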
本公开一示例性的实施例中,提出了第二种适用于本公开实施例PMPI的PCS数据格式,所述PCS数据中一个PMPI的帧数据包括:
所述PMPI中每一sMPI的起始深度;及
所述PMPI中每一像素的以下参数:
该像素的有效层的个数;
该像素所在sMPI的索引;
该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引。
在本实施例中,假定PMPI的图像分辨率为W×H,划分成的sMPI的个数为M,每一sMPI的层数均为S。则该PMPI的PCS数据格式如图12所示,一个PMPI帧的帧数据包括:
DP_m:第m个sMPI的起始深度,m∈[1,M];
N_{i,j}:像素(i,j)的有效层的个数,i∈[1,H],j∈[1,W];
I_{i,j}:像素(i,j)所在的sMPI的索引;
C_{i,j,k}:像素(i,j)的第k个有效层位置处的颜色数据,如颜色值;
D_{i,j,k}:像素(i,j)的第k个有效层的索引(index),D_{i,j,k}∈[1,S];
T_{i,j,k}:像素(i,j)的第k个有效层位置处的透明度数据,如透明度值。
本实施例与上一实施例相比,在PMPI帧的帧数据中既写入了PMPI帧中每一个sMPI的起始深度,也写入了像素(i,j)所在的sMPI的索引,因而既可以适用于对三维场景用网格划分的情况,也可以适用于对三维场景非网格划分的情况。
本公开一示例性的实施例中,提出了第三种适用于本公开实施例PMPI的PCS数据格式,所述PCS数据中一个PMPI帧的帧数据包括该PMPI中每一像素的以下参数:
该像素的有效层的个数;
该像素所在sMPI的起始深度;
该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引。
在本实施例中,假定PMPI的图像分辨率为W×H,PMPI包括的sMPI的个数为M,每一sMPI的层数均为S。则该PMPI的PCS数据格式如图13所示,一个PMPI帧的帧数据包括以下参数:
N_{i,j}:像素(i,j)的有效层的个数,i∈[1,H],j∈[1,W];
E_{i,j}:像素(i,j)所在sMPI的起始深度;
C_{i,j,k}:像素(i,j)的第k个有效层位置处的颜色数据,如颜色值;
D_{i,j,k}:像素(i,j)的第k个有效层的索引(index),D_{i,j,k}∈[1,S];
T_{i,j,k}:像素(i,j)的第k个有效层位置处的透明度值。
在本实施例中,所述多个sMPI的起始深度的数据表现为PMPI中每一像素所在sMPI的起始深度,即像素所在sMPI的起始深度直接写入到PMPI帧的帧数据中,其便于确定像素的有效层的深度,但可能会影响编码的效率。
PMPI的PCS数据包含用于确定所述PMPI中像素的有效层的深度及像素在有效层上的颜色和透明度的参数,除了上述实施例提供的数据格式外,也可以采用其他的数据格式,本公开对此不作限定。
以上适用于PMPI的PCS数据格式通过在PCS数据中添加sMPI的起始深度的相关信息,使得解码端可以结合sMPI的起始深度计算出像素的有效层的深度,从而准确恢复出PMPI的图像。
图14所示是一个可以用于本公开实施例的MPI编码装置的架构图,MPI编码装置10的输入数据是源MPI(如PMPI)的PCS数据,PCS数据包括但不限于图像参数(View parameters)(也可以称为视图参数,如参考视点相机参数等)、纹理属性部分(Texture Attribute component)的数据和透明度属性部分(Transparency Attribute component)的数据等。
如图14所示,MPI编码装置10包括:
MPI掩膜生成(Create mask from MPI)单元101设置为根据输入数据生成MPI掩膜。在一个示例中,可以根据透明度的阈值对MPI层中的像素点(也可称为采样点)进行筛选,得到每一层的掩膜(mask)。这是为了将每一层上透明度大的位置(也可称为像素)和透明度小的位置(也可称为像素)进行区分,将透明度小的位置屏蔽掉,以减少数据量。MPI掩膜生成单元101对一段时间(intra-period)内的所有MPI帧实施上述操作。假设MPI帧尺寸为W×H×S,一段时间(intra-period)内包含的帧数为M。则经过MPI掩膜生成单元101处理之后,得到M个W×H×S的掩膜(mask)。
MPI掩膜聚合(Aggregate MPI masks)单元103设置为对M个W×H×S的掩膜中位于相同层上的多个掩膜取并集,得到一个W×H×S的掩膜。
有效像素聚类(Cluster Active pixels)单元105设置为将每一层的掩膜中透明度大于阈值的区域(有效信息区域)聚类为一系列的簇(cluster);
簇分割(Split Clusters)单元107设置为将有效像素聚类单元105聚类得到的簇进行分割,得到经过分割处理后的簇;
块封装(Pack patches)单元109设置为将每一个块(patch,如包含簇的矩形区域)对应的纹理图和透明度图重新组合成一张图,编码为图集(atlas)数据进行传输。
视频数据生成(Generate video data)单元111设置为根据块封装单元109输出的atlas数据生成视频数据进行传输,所述视频数据包括纹理属性视频数据(Texture attribute video data(raw))、透明度属性视频数据(Transparency attribute video data(raw))等。
参数编码单元113设置为根据源MPI数据编码,得到编码后图像参数,所述编码后图像参数可以包括图像参数列表(View parameters list)、参数集(Parameter set)等。
基于上述编码装置架构对MPI编码时,先将MPI中的采样点根据透明度的阈值进行筛选,得到每个平面层的掩膜(mask)。假设MPI的尺寸为W×H×S,设定的一段时间(intra-period)内包含的帧数为M,对所述一段时间内的所有MPI帧实施上述操作,得到M个W×H×S的掩膜(mask)。接着将相同平面层上的掩膜取并集,即得到一个W×H×S的掩膜。再将每一层的掩膜中透明度大于阈值的区域(有效信息区域)通过聚类和分割得到一系列簇(cluster)。簇(cluster)经过融合、分解等步骤得到小的块(patch)。然后将每一个块(patch)对应的纹理图(即颜色图)和透明度图分别重新组合成一张图,编码为atlas数据进行传输。
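其中掩膜生成与帧间聚合两步可以用如下示意代码表达(透明度阈值与数组维度排布均为假设):

```python
import numpy as np

def make_masks(alphas, threshold=0.0):
    """alphas: 形状(M, S, H, W)的透明度数组,M帧、每帧S层。
    返回每帧的掩膜及在相同层上取并集后的聚合掩膜(示意)。"""
    masks = alphas > threshold      # M个S×H×W的掩膜
    aggregated = masks.any(axis=0)  # 相同层上帧间取并集 → 一个S×H×W掩膜
    return masks, aggregated

# 小示例:M=2帧、S=1层、H=1、W=2
alphas = np.array([[[[0.0, 0.2]]], [[[0.3, 0.0]]]])
_, agg = make_masks(alphas)
print(agg.tolist())  # [[[True, True]]]
```

聚合掩膜覆盖了该段时间内任一帧中有效的位置,后续聚类与块封装均在其上进行。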
本公开一实施例提供了一种多平面图像的编码方法,可用于PMPI的编码,如图15所示,所述编码方法包括:
步骤510,接收PMPI的PCS数据,所述PMPI包括多个sMPI以分别表征三维场景划分成的多个场景区域,所述PCS数据包括图像参数,及纹理属性部分和透明度属性部分的数据;
步骤520,基于所述PCS数据对所述PMPI进行编码,得到编码后的图像参数和atlas数据。
本实施例中,所述PCS数据包含sMPI的起始深度信息,可以封装在图像参数和/或纹理属性部分和透明度属性部分的数据中,编码时将sMPI的起始深度信息写入码流,可以封装在编码后的图像参数和/或atlas数据中。
在本公开一示例性的实施例中,所述基于所述PCS数据对所述PMPI进行编码,包括:对所述PMPI包括的多个sMPI分别进行编码处理,其中,每一sMPI的编码可以按照与普通MPI(即表征整个三维场景)相同的编码方式进行编码处理,普通MPI的编码方式可以遵循相关标准中的规定。
在本公开一示例性的实施例中,所述PMPI的PCS数据按照本公开任一实施例所述的数据处理方法、从所述PMPI的原始存储数据转换得到。
在本公开一示例性的实施例中,
所述PCS数据中的图像参数和所述编码后的图像参数均包括以下数据中的至少一种:所述PCS数据中PMPI帧的部分或全部帧参数、PMPI帧中每一sMPI的起始深度;
所述PCS数据中的纹理属性部分和透明度属性部分的数据,包括所述PCS数据中PMPI 帧的部分或全部帧数据,所述纹理属性部分的数据包括颜色数据;
所述atlas数据包括编码时确定的块(patch)的数据和参数,所述数据包括颜色数据和透明度数据,所述参数包括以下一种或任意组合:数据所属层的标识信息、数据所属层的起始深度、数据所属sMPI的标识信息、数据所属sMPI的起始深度和数据所属PMPI的标识信息。
在本公开实施例中,PMPI中的sMPI的起始深度可以写入到编码后的图像参数和/或atlas数据中。
本公开一实施例提供了一种多平面图像的解码方法,如图16所示,包括:
步骤610,接收分块多平面图像PMPI的编码码流,所述编码码流中包括PMPI的图像参数和atlas数据;
本实施例中,编码码流中的所述PMPI的图像参数和/或图集数据包含sMPI的起始深度信息。
步骤620,对所述编码码流进行解码,获取所述PMPI的图像参数,及纹理属性部分和透明度属性部分的数据;
其中,所述PMPI包括多个子多平面图像sMPI以分别表征三维场景划分成的多个场景区域。
在本公开一示例性的实施例中,所述PMPI的图像参数包括以下参数中的一种或任意组合:
PMPI帧的分辨率;
PMPI帧中sMPI的个数;
为PMPI帧中sMPI统一设置的层数;
为PMPI帧中sMPI统一设置的层的分布规则;
为PMPI帧中sMPI统一设置的终止深度;
使用M×N网格划分三维场景时M,N的取值;
PMPI帧中每一sMPI的起始深度。
在本公开一示例性的实施例中,所述atlas数据包括编码时确定的块的数据和参数,所述数据包括颜色数据和透明度数据,所述参数包括以下一种或任意组合:数据所属层的标识信息、数据所属层的起始深度、数据所属sMPI的标识信息、数据所属sMPI的起始深度和数据所属PMPI的标识信息。
本公开一实施例还提供了一种码流,其中,所述码流通过对分块多平面图像PMPI编码生成,所述码流中包括所述PMPI的图像参数和atlas数据;所述PMPI包括多个子多平面图像sMPI以分别表征三维场景划分成的多个场景区域。本实施例中,码流中所述PMPI的图像参数和/或图集数据包含sMPI的起始深度信息。
在本公开一示例性的实施例中,所述PMPI的图像参数包括以下参数中的一种或任意组合:
PMPI帧的分辨率;
PMPI帧中sMPI的个数;
为PMPI帧中sMPI统一设置的层数;
为PMPI帧中sMPI统一设置的层的分布规则;
为PMPI帧中sMPI统一设置的终止深度;
使用M×N网格划分三维场景时M,N的取值;
PMPI帧中每一sMPI的起始深度。
在本公开一示例性的实施例中,所述atlas数据包括编码时确定的块的数据和参数,所述数据包括颜色数据和透明度数据,所述参数包括以下一种或任意组合:数据所属层的标识信息、数据所属层的起始深度、数据所属sMPI的标识信息、数据所属sMPI的起始深度和数据所属PMPI的标识信息。
本公开一实施例还提供了一种多平面图像的生成装置,如图17所示,包括处理器5以及存储有可在所述处理器5上运行的计算机程序的存储器6,其中,所述处理器5执行所述计算机程序时实现如本公开任一实施例所述的多平面图像的生成方法。
本公开一实施例还提供了一种多平面图像数据处理装置,也可参见图17,包括处理器以及存储有计算机程序的存储器,其中,所述处理器执行所述计算机程序时实现如本公开任一实施例所述的多平面图像的数据处理方法。
本公开一实施例还提供了一种多平面图像的编码装置,也可参见图17,包括处理器以及存储有计算机程序的存储器,其中,所述处理器执行所述计算机程序时实现如本公开任一实施例所述的多平面图像的编码方法。
本公开一实施例还提供了一种多平面图像的解码装置,也可参见图17,包括处理器以及存储有计算机程序的存储器,其中,所述处理器执行所述计算机程序时实现如本公开任一实施例所述的多平面图像的解码方法。
本公开一实施例还提供了一种非瞬态计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,其中,所述计算机程序被处理器执行时实现如本公开任一实施例所述的多平面图像的生成方法、数据处理方法、编码方法或解码方法。
在一个或多个示例性实施例中,所描述的功能可以硬件、软件、固件或其任一组合来实施。如果以软件实施,那么功能可作为一个或多个指令或代码存储在计算机可读介质上或经由计算机可读介质传输,且由基于硬件的处理单元执行。计算机可读介质可包含对应于例如数据存储介质等有形介质的计算机可读存储介质,或包含促进计算机程序例如根据通信协议从一处传送到另一处的任何介质的通信介质。以此方式,计算机可读介质通常可对应于非暂时性的有形计算机可读存储介质或例如信号或载波等通信介质。数据存储介质可为可由一个或多个计算机或者一个或多个处理器存取以检索用于实施本公开中描述的技术的指令、代码和/或数据结构的任何可用介质。计算机程序产品可包含计算机可读介质。
举例来说且并非限制,此类计算机可读存储介质可包括RAM、ROM、EEPROM、CD-ROM或其它光盘存储装置、磁盘存储装置或其它磁性存储装置、快闪存储器或可用来以指令或数据结构的形式存储所要程序代码且可由计算机存取的任何其它介质。而且,还可以将任何连接称作计算机可读介质。举例来说,如果使用同轴电缆、光纤电缆、双绞线、数字订户线(DSL)或例如红外线、无线电及微波等无线技术从网站、服务器或其它远程源传输指令,则同轴电缆、光纤电缆、双绞线、DSL或例如红外线、无线电及微波等无线技术包含于介质的定义中。然而应了解,计算机可读存储介质和数据存储介质不包含连接、载波、信号或其它瞬时(瞬态)介质,而是针对非瞬时有形存储介质。如本文中所使用,磁盘及光盘包含压缩光盘(CD)、激光光盘、光学光盘、数字多功能光盘(DVD)、软磁盘或蓝光光盘等,其中磁盘通常以磁性方式再生数据,而光盘使用激光以光学方式再生数据。上文的组合也应包含在计算机可读介质的范围内。
可由例如一个或多个数字信号处理器(DSP)、通用微处理器、专用集成电路(ASIC)、现场可编程逻辑阵列(FPGA)或其它等效集成或离散逻辑电路等一个或多个处理器来执行指令。因此,如本文中所使用的术语“处理器”可指上述结构或适合于实施本文中所描述的技术的任一其它结构中的任一者。另外,在一些方面中,本文描述的功能性可提供于经配置以用于编码和解码的专用硬件和/或软件模块内,或并入在组合式编解码器中。并且,可将所述技术完全实施于一个或多个电路或逻辑元件中。
本公开实施例的技术方案可在广泛多种装置或设备中实施,包含无线手机、集成电路(IC)或一组IC(例如,芯片组)。本公开实施例中描述各种组件、模块或单元以强调经配置以执行所描述的技术的装置的功能方面,但不一定需要通过不同硬件单元来实现。而是,如上所述,各种单元可在编解码器硬件单元中组合或由互操作硬件单元(包含如上所述的一个或多个处理器)的集合结合合适软件和/或固件来提供。

Claims (30)

  1. 一种多平面图像的生成方法,包括:
    将三维场景划分成多个场景区域;及
    生成分块多平面图像PMPI,所述PMPI包括用于分别表征多个所述场景区域的多个子多平面图像sMPI,所述sMPI的起始深度至少根据所述sMPI所表征场景区域的深度信息确定。
  2. 根据权利要求1所述的生成方法,其中:
    所述sMPI的起始深度至少根据所述sMPI所表征场景区域的深度信息确定,包括:
    所述sMPI的起始深度根据第一区域的最小深度确定,所述第一区域为所述sMPI所表征场景区域,或者所述第一区域为所述sMPI所表征场景区域和所述sMPI所表征场景区域的相邻区域所共同组成的区域。
  3. 根据权利要求2所述的生成方法,其中:
    所述sMPI的起始深度根据第一区域的最小深度确定,包括:所述sMPI的起始深度设置为所述第一区域的最小深度。
  4. 根据权利要求2或3所述的生成方法,其中:
    所述sMPI所表征场景区域的相邻区域包括所述三维场景中位于所述sMPI所表征场景区域周边的一个或多个场景区域。
  5. 根据权利要求1所述的生成方法,其中:
    所述sMPI包括在所表征场景区域的不同深度采样得到的多个层,所述多个层均包括颜色图和透明度图。
  6. 根据权利要求5所述的生成方法,其中:
    所述生成PMPI,包括:
    确定多个所述sMPI的起始深度和终止深度,其中,多个所述sMPI的起始深度至少根据各自所表征场景区域的深度信息分别确定,多个所述sMPI的终止深度设置为相同;
    对多个所述sMPI中的每一sMPI,根据该sMPI的起始深度和终止深度,以及该sMPI的层数和层的分布规则确定该sMPI包括的每一层的深度,在该sMPI所表征场景区域的所述每一层的深度处采样,得到该sMPI包括的每一层的颜色图和透明度图。
  7. 根据权利要求6所述的生成方法,其中:
    多个所述sMPI与多个所述场景区域一一对应,多个所述sMPI的层数和层的分布规则设置为相同。
  8. 根据权利要求1所述的生成方法,其中:
    所述将三维场景划分成多个场景区域,包括:根据预设的场景划分规则将三维场景划分成多个场景区域,其中,根据所述场景划分规则可以确定划分成的多个场景区域的以下一种或任意组合信息:场景区域的个数、场景区域的形状、场景区域的大小、场景区域的 位置;
    其中,多个所述场景区域的大小相同或不同,多个所述场景区域的形状为规则形状或不规则形状中的一种或者组合,所述规则形状包括三角形、矩形、五边形、六边形中的一种或任意组合。
  9. 根据权利要求1或8所述的生成方法,其中:
    所述根据预设的场景划分规则将所述三维场景划分成多个场景区域,包括:使用M×N的网格将所述三维场景划分为M×N个场景区域,M,N为正整数,且M×N≥2。
  10. 根据权利要求1所述的生成方法,其中:
    所述生成的PMPI的原始存储数据包括PMPI帧的帧参数和帧数据;
    其中,所述原始存储数据中的PMPI帧的帧参数包括以下参数中的一种或任意组合:
    PMPI帧的分辨率;
    PMPI帧中sMPI的个数;
    为PMPI帧中sMPI统一设置的层数;
    为PMPI帧中sMPI统一设置的层的分布规则;
    为PMPI帧中sMPI统一设置的终止深度;
    使用M×N网格划分三维场景时M,N的取值;
    其中,所述原始存储数据中的每一PMPI帧的帧数据包括:该PMPI帧包括的每一sMPI中每一层的颜色图数据和透明度图数据。
  11. 一种多平面图像的数据处理方法,包括:
    获取分块多平面图像PMPI的原始存储数据,所述PMPI包括多个子多平面图像sMPI以分别表征三维场景划分成的多个场景区域;
    将所述PMPI的原始存储数据转换为封装压缩存储PCS数据,所述PCS数据用于确定所述PMPI中像素的有效层的深度及像素在有效层上的颜色和透明度。
  12. 根据权利要求11所述的数据处理方法,其中:
    所述PMPI采用如权利要求1至10中任一所述的生成方法生成。
  13. 根据权利要求11所述的数据处理方法,其中:
    所述PCS数据包括PMPI帧的帧数据,所述PCS数据中一个PMPI帧的帧数据包括:
    该PMPI帧中每一sMPI的起始深度;及该PMPI帧中每一像素的以下数据:该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引;或者
    该PMPI帧中每一sMPI的起始深度;及该PMPI帧中每一像素的以下数据:该像素所在sMPI的索引,该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引;或者
    该PMPI帧中每一像素的以下数据:该像素所在sMPI的起始深度,以及该像素在每一有效层上的颜色数据、透明度数据和该有效层在该像素所在sMPI中的层索引。
  14. 根据权利要求13所述的数据处理方法,其中:
    所述PCS数据中的一个PMPI帧的帧数据还包括:该PMPI帧中每一像素的有效层的个数。
  15. 根据权利要求11或12或13所述的数据处理方法,其中:
    所述PCS数据还包括PMPI帧的帧参数,所述PCS数据中的PMPI帧的帧参数包括以下参数中的一种或任意组合:
    PMPI帧的分辨率;
    PMPI帧中sMPI的个数;
    为PMPI帧中sMPI统一设置的层数;
    为PMPI帧中sMPI统一设置的层的分布规则;
    为PMPI帧中sMPI统一设置的终止深度;
    使用M×N网格划分三维场景时M,N的取值。
  16. 一种多平面图像的编码方法,包括:
    接收分块多平面图像PMPI的封装压缩存储PCS数据,所述PMPI包括多个子多平面图像sMPI以分别表征三维场景划分成的多个场景区域,所述PCS数据包括图像参数,及纹理属性部分和透明度属性部分的数据;
    基于所述PCS数据对所述PMPI进行编码,得到编码后的图像参数和图集数据。
  17. 根据权利要求16所述的编码方法,其中:
    所述基于所述PCS数据对所述PMPI进行编码,包括:对所述PMPI包括的多个sMPI分别进行编码处理。
  18. 根据权利要求16或17所述的编码方法,其中:
    所述PMPI的PCS数据按照如权利要求11至15中任一项所述的数据处理方法、从所述PMPI的原始存储数据转换得到。
  19. 根据权利要求18所述的编码方法,其中:
    所述PCS数据中的图像参数和所述编码后的图像参数均包括以下数据中的至少一种:所述PCS数据中PMPI帧的部分或全部帧参数、PMPI帧中每一sMPI的起始深度;
    所述PCS数据中的纹理属性部分和透明度属性部分的数据,包括所述PCS数据中PMPI帧的部分或全部帧数据,所述纹理属性部分的数据包括颜色数据;
    所述图集数据包括编码时确定的块的数据和参数,所述数据包括颜色数据和透明度数据,所述参数包括以下一种或任意组合:数据所属层的标识信息、数据所属层的起始深度、数据所属sMPI的标识信息、数据所属sMPI的起始深度和数据所属PMPI的标识信息。
  20. 一种多平面图像的解码方法,包括:
    对分块多平面图像PMPI的编码码流进行解码,获取所述PMPI的图像参数,及纹理属性部分和透明度属性部分的数据;
    其中,所述PMPI包括多个子多平面图像sMPI以分别表征三维场景划分成的多个场 景区域,所述编码码流中包括PMPI的图像参数和图集数据。
  21. 根据权利要求20所述的解码方法,其中:
    所述PMPI的图像参数包括以下参数中的一种或任意组合:
    PMPI帧的分辨率;
    PMPI帧中sMPI的个数;
    为PMPI帧中sMPI统一设置的层数;
    为PMPI帧中sMPI统一设置的层的分布规则;
    为PMPI帧中sMPI统一设置的终止深度;
    使用M×N网格划分三维场景时M,N的取值;
    PMPI帧中每一sMPI的起始深度。
  22. 根据权利要求20或21所述的解码方法,其中:
    所述图集数据包括编码时确定的块的数据和参数,所述数据包括颜色数据和透明度数据,所述参数包括以下一种或任意组合:数据所属层的标识信息、数据所属层的起始深度、数据所属sMPI的标识信息、数据所属sMPI的起始深度和数据所属PMPI的标识信息。
  23. 一种码流,其中,所述码流通过对分块多平面图像PMPI编码生成,所述码流中包括所述PMPI的图像参数和图集数据;所述PMPI包括多个子多平面图像sMPI以分别表征三维场景划分成的多个场景区域。
  24. 根据权利要求23所述的码流,其中:
    所述PMPI的图像参数包括以下参数中的一种或任意组合:
    PMPI帧的分辨率;
    PMPI帧中sMPI的个数;
    为PMPI帧中sMPI统一设置的层数;
    为PMPI帧中sMPI统一设置的层的分布规则;
    为PMPI帧中sMPI统一设置的终止深度;
    使用M×N网格划分三维场景时M,N的取值;
    PMPI帧中每一sMPI的起始深度。
  25. 根据权利要求23或24所述的码流,其中:
    所述图集数据包括编码时确定的块的数据和参数,所述数据包括颜色数据和透明度数据,所述参数包括以下一种或任意组合:数据所属层的标识信息、数据所属层的起始深度、数据所属sMPI的标识信息、数据所属sMPI的起始深度和数据所属PMPI的标识信息。
  26. 一种多平面图像的生成装置,包括处理器以及存储有计算机程序的存储器,其中,所述处理器执行所述计算机程序时实现如权利要求1至10中任一所述的多平面图像的生成方法。
  27. 一种多平面图像的数据处理装置,包括处理器以及存储有计算机程序的存储器, 其中,所述处理器执行所述计算机程序时实现如权利要求11至15中任一所述的多平面图像的数据处理方法。
  28. 一种多平面图像的编码装置,包括处理器以及存储有计算机程序的存储器,其中,所述处理器执行所述计算机程序时实现如权利要求16至19中任一所述的多平面图像的编码方法。
  29. 一种多平面图像的解码装置,包括处理器以及存储有计算机程序的存储器,其中,所述处理器执行所述计算机程序时实现如权利要求20至22中任一所述的多平面图像的解码方法。
  30. 一种非瞬态计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,其中,所述计算机程序被处理器执行时实现如权利要求1至22中任一所述的方法。
PCT/CN2021/122390 2021-09-30 2021-09-30 多平面图像的生成、数据处理、编码和解码方法、装置 WO2023050396A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR1020247014022A KR20240089119A (ko) 2021-09-30 2021-09-30 멀티 플레인 이미지의 생성, 데이터 처리, 인코딩 및 디코딩 방법, 장치
PCT/CN2021/122390 WO2023050396A1 (zh) 2021-09-30 2021-09-30 多平面图像的生成、数据处理、编码和解码方法、装置
CN202180102720.7A CN117999582A (zh) 2021-09-30 2021-09-30 多平面图像的生成、数据处理、编码和解码方法、装置
US18/609,944 US20240223767A1 (en) 2021-09-30 2024-03-19 Method and apparatus for encoding and decoding multiplane image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/122390 WO2023050396A1 (zh) 2021-09-30 2021-09-30 多平面图像的生成、数据处理、编码和解码方法、装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/609,944 Continuation US20240223767A1 (en) 2021-09-30 2024-03-19 Method and apparatus for encoding and decoding multiplane image

Publications (1)

Publication Number Publication Date
WO2023050396A1 true WO2023050396A1 (zh) 2023-04-06

Family

ID=85781177

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/122390 WO2023050396A1 (zh) 2021-09-30 2021-09-30 多平面图像的生成、数据处理、编码和解码方法、装置

Country Status (4)

Country Link
US (1) US20240223767A1 (zh)
KR (1) KR20240089119A (zh)
CN (1) CN117999582A (zh)
WO (1) WO2023050396A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150156415A1 (en) * 2011-12-30 2015-06-04 Google Inc. Multiplane Panoramas of Long Scenes
US20200226816A1 (en) * 2019-01-14 2020-07-16 Fyusion, Inc. Free-viewpoint photorealistic view synthesis from casually captured video
CN112233165A (zh) * 2020-10-15 2021-01-15 大连理工大学 一种基于多平面图像学习视角合成的基线扩展实现方法
US20210250571A1 (en) * 2020-02-12 2021-08-12 At&T Intellectual Property I, L.P. Apparatus and method for providing content with multiplane image transcoding


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHU JUN, WU TONG, WANG LU : "Multi-Plane Detection Algorithm of Point Clouds Based on Volume Density Change Rate", JOURNAL OF COMPUTER APPLICATIONS, JISUANJI YINGYONG, CN, vol. 33, no. 5, 1 May 2013 (2013-05-01), CN , pages 1411 - 1415, XP093053259, ISSN: 1001-9081, DOI: 10.3724/SP.J.1087.2013.01411 *
LUVIZON DIOGO C.; CARVALHO GUSTAVO SUTTER P.; DOS SANTOS ANDREZA A.; CONCEICAO JHONATAS S.; FLORES-CAMPANA JOSE L.; DECKER LUIS G.: "Adaptive Multiplane Image Generation from a Single Internet Picture", 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 3 January 2021 (2021-01-03), pages 2555 - 2564, XP033926549, DOI: 10.1109/WACV48630.2021.00260 *

Also Published As

Publication number Publication date
KR20240089119A (ko) 2024-06-20
US20240223767A1 (en) 2024-07-04
CN117999582A (zh) 2024-05-07

Similar Documents

Publication Publication Date Title
JP6939883B2 (ja) 自由視点映像ストリーミング用の復号器を中心とするuvコーデック
US20220292730A1 (en) Method and apparatus for haar-based point cloud coding
CN112017228A (zh) 一种对物体三维重建的方法及相关设备
CN113852829A (zh) 点云媒体文件的封装与解封装方法、装置及存储介质
JP2022519462A (ja) ホモグラフィ変換を使用した点群符号化
KR20220011180A (ko) 체적 비디오 인코딩 및 디코딩을 위한 방법, 장치 및 컴퓨터 프로그램
US20220180567A1 (en) Method and apparatus for point cloud coding
WO2023272510A1 (zh) 多平面图像的生成、数据处理、编码和解码方法、装置
US11196977B2 (en) Unified coding of 3D objects and scenes
EP4162691A1 (en) A method, an apparatus and a computer program product for video encoding and video decoding
WO2023050396A1 (zh) 多平面图像的生成、数据处理、编码和解码方法、装置
US20230119830A1 (en) A method, an apparatus and a computer program product for video encoding and video decoding
CN116235497A (zh) 一种用于用信号通知基于多平面图像的体积视频的深度的方法和装置
US20230306683A1 (en) Mesh patch sub-division
WO2022257143A1 (zh) 帧内预测、编解码方法及装置、编解码器、设备、介质
US20240177355A1 (en) Sub-mesh zippering
US11727536B2 (en) Method and apparatus for geometric smoothing
EP4373089A1 (en) Data processing method and apparatus, computer, and readable storage medium
US20230040484A1 (en) Fast patch generation for video based point cloud coding
WO2023173237A1 (zh) 编解码方法、码流、编码器、解码器以及存储介质
WO2023173238A1 (zh) 编解码方法、码流、编码器、解码器以及存储介质
EP4360053A1 (en) Learning-based point cloud compression via unfolding of 3d point clouds
KR20240001203A (ko) Tearing transform을 통한 학습 기반 포인트 클라우드 압축
WO2023180841A1 (en) Mesh patch sub-division
Li et al. Point Cloud Compression: Technologies and Standardization

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21958967

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180102720.7

Country of ref document: CN

ENP Entry into the national phase

Ref document number: 20247014022

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021958967

Country of ref document: EP

Effective date: 20240430