WO2023050396A1 - Generation, data processing, encoding and decoding methods and devices for multi-plane images - Google Patents
- Publication number
- WO2023050396A1 (application PCT/CN2021/122390)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/463—Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
Definitions
- Embodiments of the present disclosure relate to, but are not limited to, image processing technologies, and more specifically to methods and devices for generating, processing, encoding, and decoding multi-plane images.
- A multiplane image (MPI) is a non-redundant scene representation method.
- MPI decomposes the scene into a series of layers, which are planar layers or spherical layers.
- the depth range [dmin, dmax] of an MPI needs to be set in advance according to the depth-of-field data of the scene.
- dmin is the minimum depth, that is, the distance from the layer closest to the reference viewpoint to the reference viewpoint.
- dmax is the maximum depth, that is, the distance from the layer farthest from the reference viewpoint to the reference viewpoint.
- Each layer in MPI is divided into two parts: color map (Color frame) and transparency map (Transparency frame).
- the color map and transparency map of a layer contain, respectively, the texture information and transparency information of the scene at the position of that layer.
- MPI can be used for immersive video, but the effect needs to be improved.
- An embodiment of the present disclosure provides a method for generating a multi-plane image, including:
- the PMPI includes a plurality of sub-multi-plane images (sMPI) respectively representing a plurality of the scene regions, and the starting depth of each sMPI is determined at least according to the depth information of the scene region represented by that sMPI.
- An embodiment of the present disclosure also provides a data processing method for a multi-plane image, including:
- the PMPI includes a plurality of sub-multi-plane images sMPI to respectively represent a plurality of scene areas into which the three-dimensional scene is divided;
- the original storage data of the PMPI is converted into encapsulated-and-compressed storage (PCS) data, and the PCS data is used to determine the depth of the effective layers of each pixel in the PMPI and the color and transparency of the pixel on those effective layers.
- An embodiment of the present disclosure also provides a method for encoding a multi-plane image, including:
- the PMPI includes a plurality of sub-multi-plane images sMPI that respectively represent a plurality of scene areas into which a three-dimensional scene is divided;
- the PCS data includes image parameters, and data in the texture attribute part and the transparency attribute part;
- the PMPI is encoded based on the PCS data to obtain encoded image parameters and atlas data.
- An embodiment of the present disclosure also provides a decoding method for a multi-plane image, including:
- the PMPI includes a plurality of sub-multi-plane images sMPI that respectively represent multiple scene areas into which a three-dimensional scene is divided
- the encoded code stream includes image parameters and atlas data of the PMPI.
- An embodiment of the present disclosure also provides a code stream, wherein the code stream is generated by encoding a patch multi-plane image PMPI, and the code stream includes image parameters and atlas data of the PMPI; the PMPI includes a plurality of sub-multiplanar images sMPI that respectively represent a plurality of scene regions into which the three-dimensional scene is divided.
- An embodiment of the present disclosure also provides a device for generating a multi-plane image, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the method for generating a multi-plane image described in any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a multi-plane image data processing device, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the data processing method for a multi-plane image described in any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a multi-plane image encoding device, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the encoding method for multi-plane images described in any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a multi-plane image decoding device, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the decoding method for multi-plane images described in any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the method described in any embodiment of the present disclosure is implemented.
- FIG. 1 is a schematic structural diagram of an exemplary MPI composed of four plane layers
- FIGS. 2A to 2F are schematic diagrams of six consecutive plane layers in an exemplary MPI, showing a color map and a transparency map of each plane layer;
- Fig. 3 is a schematic diagram of representing a three-dimensional scene using common MPI
- FIG. 4 is a schematic diagram of using PMPI to characterize a three-dimensional scene according to an embodiment of the present disclosure
- Fig. 5 is the flowchart of the PMPI generation method of an embodiment of the present disclosure
- FIG. 6 is a schematic diagram of determining the initial depth of sMPI through pooling according to an embodiment of the present disclosure
- FIG. 7 is a schematic diagram of an exemplary PMPI generated using an embodiment of the present disclosure.
- Fig. 8 is a schematic diagram of a video compression process
- Fig. 9 is a schematic diagram of PCS data converted from original MPI storage data
- Fig. 10 is a flowchart of a data processing method of PMPI according to an embodiment of the present disclosure
- Fig. 11 is a schematic diagram of PCS data converted from original PMPI storage data according to an embodiment of the present disclosure
- FIG. 12 is a schematic diagram of another PCS data converted from PMPI original storage data according to an embodiment of the present disclosure.
- FIG. 13 is a schematic diagram of another PCS data converted from PMPI original storage data according to an embodiment of the present disclosure.
- FIG. 14 is a schematic structural diagram of a PMPI encoding device according to an embodiment of the present disclosure.
- FIG. 15 is a flow chart of a PMPI encoding method according to an embodiment of the present disclosure.
- Fig. 16 is a flowchart of a PMPI decoding method according to an embodiment of the present disclosure
- FIG. 17 is a schematic diagram of an apparatus for generating PMPI according to an embodiment of the present disclosure.
- Words such as “exemplary” or “for example” are used to mean an example, instance, or illustration. Any embodiment described in this disclosure as “exemplary” or “for example” should not be construed as preferred or advantageous over other embodiments.
- “And/or” in this document describes an association relationship between objects and indicates that three relationships may exist; for example, “A and/or B” may mean: A exists alone, A and B exist simultaneously, or B exists alone.
- “A plurality” means two or more than two.
- Words such as “first” and “second” are used to distinguish identical or similar items with substantially the same function and effect. Those skilled in the art can understand that such words do not limit the number or execution order, and do not necessarily imply a difference.
- Multiplanar Imagery is a hierarchical representation of 3D scenes without redundancy.
- a 3D scene is decomposed into a set of planar or spherical layers, sampled at different depths from a given reference viewpoint. Each layer is obtained by projecting the part of the 3D scene contained around the layer position onto the same reference camera.
- This reference camera is at the given reference viewpoint.
- When planar layers are used, the reference camera is a perspective camera; when spherical layers are used, the reference camera is a spherical (usually equirectangular) camera.
- MPI decomposes the scene into a series of plane layers or spherical layers. Take, for example, an MPI composed of planar layers that are parallel to one another and located at different depths with respect to the reference viewpoint.
- the depth range [dmin, dmax] of the plane layer needs to be set in advance according to the depth range of the actual scene.
- If the MPI includes S plane layers and the size of each plane layer is W × H, the size of the MPI can be expressed as W × H × S.
- W is the number of pixels in the width direction of the MPI
- H is the number of pixels in the height direction of the MPI
- the MPI contains W × H pixels
- the planar image resolution is W × H.
- the exemplary MPI shown in Figure 1 includes 4 layers, but the number of plane layers or spherical layers that an MPI includes, that is, the number of layers, can also be 2, 3, 5 or more than 5, such as 100 or 200, and so on.
- Each layer of MPI includes a color map and a transparency map, which are used to record the color and transparency of pixels on this layer. A pixel can have different colors and transparency on different layers.
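For concreteness, here is a minimal sketch (an assumed in-memory layout, not one specified by the patent) of holding an MPI frame of size W × H × S as arrays, with one color map and one transparency map per layer:

```python
import numpy as np

# Assumed layout: layer_depths holds the depth of each of the S layers;
# color[k] and alpha[k] are the color map and transparency map of layer k.
W, H, S = 640, 480, 4
dmin, dmax = 1.0, 10.0

layer_depths = np.linspace(dmin, dmax, S)       # one depth per layer
color = np.zeros((S, H, W, 3), dtype=np.uint8)  # per-layer RGB color maps
alpha = np.zeros((S, H, W), dtype=np.float32)   # per-layer transparency maps
```

A pixel (i, j) can then carry a different color `color[k, i, j]` and transparency `alpha[k, i, j]` on each layer k, matching the description above.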
- MPI is a hierarchical representation of a 3D scene, that is, the sampling of a 3D scene.
- the points on the MPI plane layers are sampling points. From the examples in Figure 2A to Figure 2F, it can be seen that most of the sampling points in this MPI are located at invalid positions of the 3D scene: these positions have no visible surface, and the transparency there is 0. Only a small number of sampling points are located in the valid areas of the 3D scene; these valid areas have visible surfaces, and the transparency there is not 0.
- MPI can be used for immersive video. From the perspective of the immersive experience, it is the effective areas of the 3D scene that play the decisive role. However, most of the sampling points of an MPI are wasted, resulting in low sampling efficiency and a lower resolution of the final immersive video.
- the depth range [dmin,dmax] of MPI is set according to the global depth of the scene.
- the depth range is enough to cover most of the effective information of the scene.
- the depth of the layer closest to the reference viewpoint is called the initial depth dmin of MPI.
- the depth of the layer farthest from the reference viewpoint may be referred to as the termination depth dmax of the MPI.
- the parallel lines in the figure are used to indicate the depth position of each layer in the MPI in the three-dimensional scene.
- Since one geometric body in this scene is far away from the other geometric bodies, in order to represent the main information of the scene (four geometric bodies), the MPI must use a larger depth range, and the resulting plane layers (four in the figure as an example) are relatively sparse. For the three geometric bodies located in the foreground region, valid information only appears on the two plane layers with greater depth in the MPI. The MPI is thus less efficient at sampling.
- an embodiment of the present disclosure proposes an MPI with depth adaptive change characteristics.
- In contrast to the MPI shown in FIG. 1 and FIG. 3, that is, the common MPI representing the entire 3D scene with a single set of layers, the MPI proposed by the embodiments of the present disclosure is called a patch multiplane image (PMPI: Patch multiplane image).
- The PMPI of the embodiments of the present disclosure is an MPI that includes a plurality of sub-multiplane images (sMPI: sub multiplane image) respectively representing a plurality of scene areas into which the three-dimensional scene is divided. Each sMPI includes multiple layers obtained by sampling the scene area it represents at different depths, and the starting depth of each sMPI is determined at least according to the depth information of the scene area represented by that sMPI.
- PMPI can be regarded as an extension of ordinary MPI.
- the basic unit of an ordinary MPI is a stack of layers of the same size used to represent a complete 3D scene, while a PMPI uses multiple sMPIs to represent the multiple scene areas into which a 3D scene is divided.
- Each scene area can be regarded as a block of the 3D scene, and each sMPI can also be regarded as a block of the PMPI, so the PMPI is a hierarchical block representation of the 3D scene.
- a single sMPI is also a kind of MPI, but it represents a scene area into which the 3D scene is divided; that scene area can itself be regarded as a 3D scene, only with a size and shape different from the original 3D scene.
- the way sMPI characterizes the scene area can still use the way ordinary MPI characterizes the three-dimensional scene.
- an sMPI includes multiple layers sampled at different depths in the scene area. The multiple layers have the same size and shape, each layer includes a color map and a transparency map, and the multiple layers can be distributed according to set rules (such as equal spacing or equal viewing distance), and so on. Similar to MPI, the depth range of an sMPI is also set according to the principle of including most of the valid information in the scene area.
- the end depths of multiple sMPIs in PMPI can be set to be the same, and the start depths can be set to be different according to the depth information of the represented scene area, so as to introduce the depth information of the scene, increase the adaptive ability for the scene depth, and make more Sample points are placed at valid locations in the scene.
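As a structural sketch of this arrangement (the class names and numeric values are illustrative, not from the patent), a PMPI with a shared termination depth and per-region start depths could be represented as:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SMPI:
    # The start depth is chosen per region from its depth information;
    # the end depth is set jointly for all sMPIs in the PMPI.
    start_depth: float
    end_depth: float
    num_layers: int

@dataclass
class PMPI:
    smpis: List[SMPI] = field(default_factory=list)

# Two regions as in the Fig. 4 example: the region containing near
# geometry gets a smaller start depth; both share the same end depth.
pmpi = PMPI([SMPI(start_depth=1.0, end_depth=10.0, num_layers=4),
             SMPI(start_depth=7.0, end_depth=10.0, num_layers=4)])
```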
- the 3D scene and reference viewpoint shown in FIG. 4 are the same as those in FIG. 3 .
- the number of scene areas is set to 2
- a vertical plane is used to divide the 3D scene into two scene areas. One scene area, hereinafter referred to as the first scene area, includes the two geometric bodies located on the left side of the reference viewpoint; one geometric body in the first scene area is closest to the reference viewpoint, and the other geometric body is farther away from it.
- Another scene area includes two geometries located on the right side of the reference viewpoint.
- this scene area is called the second scene area.
- the two geometries in the second scene area are far away from the reference viewpoint.
- the above-mentioned first scene area and second scene area are each represented by one sMPI, and each sMPI contains 4 plane layers. Because the depth difference of the two geometric bodies in the first scene area is relatively large, the scene depth range of the sMPI used to represent the first scene area needs to be set larger.
- the depth difference between the two geometric bodies in the second scene area is small and is close to the rear, and the depth range of the sMPI used to characterize the second scene area can be set to be relatively small.
- the start depth of the sMPI used to characterize the first scene area is smaller, and the start depth of the sMPI used to characterize the second scene area is larger.
- The resulting PMPI representation of the 3D scene is shown in Fig. 4. It can be seen from the figure that the 4 layers of the sMPI used to represent the second scene area become dense and are all located near the geometry, so compared with the MPI in Figure 3, the PMPI has more sampling points located at valid positions. That is to say, when the layered and patch-based PMPI is used to represent the scene, the sampling efficiency is higher than that of the ordinary layered MPI representation. It should be noted that the division of the three-dimensional scene in the example shown in FIG. 4 is only schematic, and is used to illustrate the difference between PMPI and common MPI with a simple example.
- An embodiment of the present disclosure provides a method for generating a multi-plane image, as shown in FIG. 5 , including:
- Step 310: dividing the 3D scene into multiple scene areas
- Step 320: generating a patch multi-plane image PMPI, where the PMPI includes a plurality of sub-multi-plane images sMPI respectively representing a plurality of the scene regions, and the starting depth of each sMPI is determined at least according to the depth information of the scene region represented by that sMPI.
- determining the starting depth of the sMPI at least according to the depth information of the scene region it represents includes: determining the starting depth of the sMPI according to the minimum depth of a first area, where the first area is the scene area represented by the sMPI, or an area jointly formed by the scene area represented by the sMPI and adjacent areas of that scene area.
- the adjacent areas of the scene area represented by the sMPI may include one or more scene areas located around that scene area in the 3D scene; alternatively, the adjacent area may be an area composed of multiple rows of pixels and/or multiple columns of pixels surrounding the scene area represented by the sMPI, which is not required to be a complete scene area.
- when the starting depth of the sMPI is determined according to the minimum depth of the first area, the starting depth of the sMPI may be set to the minimum depth of the first area.
- Alternatively, a depth slightly smaller than the minimum depth may be chosen as the starting depth of the sMPI on the basis of the minimum depth of the first area, for example by subtracting a set value from the minimum depth, or by subtracting a value obtained by multiplying the minimum depth by a set ratio, so that the determined starting depth of the sMPI has a certain margin.
- the value of the above minimum depth may be the minimum depth value of the scene area represented by the sMPI in the depth map of the 3D scene, or a value obtained for coding by rounding, normalizing, etc., that minimum depth value in the depth map.
- Depth values in 3D scenes can be represented by grayscale values.
- when the first area is the scene area represented by the sMPI, the starting depth of the sMPI is determined according to the minimum depth of the scene area represented by the sMPI.
- Since the depth values in the 3D scene depth map may have deviations, when the first area also includes adjacent areas, the starting depth determined from the minimum depth of this larger area is always less than or equal to the starting depth determined only according to the minimum depth of the scene region represented by the sMPI. The impact of the above deviations can thus be avoided as much as possible, so that the sMPI can completely sample the effective areas of the scene area it represents.
- the initial depth of each sMPI may be calculated by using pooling during actual operation.
- a 3D scene is divided into 36 scene areas with a 6 × 6 grid
- the solid-line area on the left side of Figure 6 represents the original depth map of the 3D scene, which is likewise divided into 36 scene areas, and each grid cell in the figure represents one scene area.
- the minimum depth of each scene area can be determined according to the depth information of the scene area in the depth map.
- the pooling size is 5 × 5 and the pooling step is 1. So that the number of grid cells in the pooled depth map is still 6 × 6, the original depth map is expanded around its border: the expanded depth map includes 10 × 10 grid cells, the added cells are drawn with dotted lines, and the minimum depth of each added cell is copied from the cell of the original depth map closest to it (the distance between two cells can be taken as the length of the line connecting the centers of the two cells).
- the minimum depth of each grid cell in the pooled depth map is then equal to the minimum depth over the 5 × 5 neighborhood of cells centered on it, and this value is set as the starting depth of the sMPI representing that cell (i.e., that scene area). This realizes the operation of determining the starting depth of the sMPI according to the minimum depth of the area formed by the scene area represented by the sMPI and the adjacent areas of that scene area.
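The min-pooling procedure described for the Fig. 6 example can be sketched as follows (an illustrative implementation; the function name is hypothetical). Border replication copies each added cell from the nearest original cell, matching the expanded depth map described above:

```python
import numpy as np

def smpi_start_depths(min_depth_grid, pool=5):
    """Min-pool the per-region minimum depths with stride 1 and border
    replication, yielding one start depth per scene area (a sketch)."""
    pad = pool // 2
    # Expand the depth map: each added cell takes the value of the
    # nearest original cell.
    padded = np.pad(min_depth_grid, pad, mode="edge")
    out = np.empty_like(min_depth_grid)
    rows, cols = min_depth_grid.shape
    for r in range(rows):
        for c in range(cols):
            # Minimum over the pool x pool neighborhood centered on (r, c).
            out[r, c] = padded[r:r + pool, c:c + pool].min()
    return out

# 6 x 6 grid of per-region minimum depths, as in the Fig. 6 example.
grid = np.arange(36, dtype=float).reshape(6, 6)
start_depths = smpi_start_depths(grid)
```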
- the sMPI includes multiple layers sampled at different depths of the represented scene region, and each of the multiple layers includes a color map and a transparency map.
- Step 1: determining the start depth and end depth of each of the plurality of divided sMPIs, wherein the start depths of the plurality of sMPIs are respectively determined at least according to the depth information of the scene areas they respectively represent, and the termination depths of the plurality of sMPIs are set to be the same;
- the PMPI can be regarded as an ordinary MPI, and a termination depth is jointly set for multiple sMPIs in the PMPI according to the setting method of the termination depth of the ordinary MPI.
- Step 2: for each sMPI among the plurality of sMPIs, determining the depth of each layer included in the sMPI according to the start depth and end depth of the sMPI, the number of layers of the sMPI, and the distribution rule of the layers, and sampling the scene area represented by the sMPI at the depth of each layer to obtain the color map and transparency map of each layer included in the sMPI.
- the multiple sMPIs correspond to the multiple scene regions one by one, and the number of layers and layer distribution rules of the multiple sMPIs are set to be the same, so as to simplify processing and improve coding efficiency.
- the distribution rule of the layers may be, for example, equal depth spacing or equal viewing-distance spacing. But the present disclosure is not limited thereto.
- the number of layers and layer distribution rules of the multiple sMPIs in the PMPI may also be different. In this case, some coding complexity will be added, but the 3D scene can be represented more flexibly.
- the multiple layers may be planar layers or spherical layers.
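The layer placement described in the steps above (start depth, end depth, number of layers, and distribution rule determine the per-layer depths) can be sketched as follows. The helper name and the reading of "equal viewing distance" as equal spacing in 1/depth are assumptions, not fixed by the patent:

```python
import numpy as np

def layer_depths(start_depth, end_depth, num_layers, rule="equal_spacing"):
    """Place num_layers layers between start_depth and end_depth.
    'equal_spacing' spaces layers uniformly in depth; 'equal_disparity'
    spaces them uniformly in 1/depth (an assumed interpretation)."""
    if rule == "equal_spacing":
        return np.linspace(start_depth, end_depth, num_layers)
    if rule == "equal_disparity":
        return 1.0 / np.linspace(1.0 / start_depth, 1.0 / end_depth, num_layers)
    raise ValueError(f"unknown rule: {rule}")
```

With equal disparity spacing, layers cluster toward the start depth, which suits scenes whose near content needs finer depth resolution.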
- dividing the 3D scene into multiple scene areas includes: dividing the 3D scene into multiple scene areas according to preset scene division rules, wherein the scene division rules may determine one of, or any combination of, the following information about the divided scene areas: the number of scene areas, the shape of the scene areas, the size of the scene areas, and the position of the scene areas. The sizes of the multiple scene areas may be the same; the shapes of the multiple scene areas are regular shapes, irregular shapes, or a combination thereof, and the regular shapes include one or any combination of triangles, rectangles, pentagons, and hexagons.
- dividing the 3D scene into a plurality of scene areas according to a preset scene division rule includes: using an M × N grid to divide the 3D scene into M × N scene areas, where M and N are positive integers and M × N ≥ 2. The divided M × N scene areas are rectangular areas of the same size.
- This division method makes it easy to determine the scene area in which each pixel of the PMPI is located from the pixel's coordinates (such as by looking up a table or calculating with a simple formula), and there is no need to additionally identify the scene area in which a pixel is located or the sMPI to which it belongs.
- the present disclosure is not limited to this division method.
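For the M × N grid division, the simple formula mentioned above might look like this (a sketch assuming equal-sized rectangular regions numbered row by row, with the image width and height divisible by M and N; the function name is hypothetical):

```python
def region_index(x, y, width, height, m, n):
    """Return the scene-area index of pixel (x, y) for an m x n grid of
    equally sized rectangular regions, numbered row by row."""
    col = x // (width // m)   # which grid column the pixel falls in
    row = y // (height // n)  # which grid row the pixel falls in
    return row * m + col
```

Because the index is derived purely from the coordinates, no per-pixel region identifier needs to be stored or signaled.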
- the generated PMPI original storage data includes frame parameters and frame data of a PMPI frame
- the frame parameters of the PMPI frame in the original storage data include one or any combination of the following parameters:
- the distribution rule of the layer uniformly set for sMPI in the PMPI frame
- the termination depth uniformly set for sMPI in PMPI frames
- the frame data of each PMPI frame in the original storage data includes: the color map data and the transparency map data of each layer in each sMPI included in the PMPI frame.
- the starting depth of the PMPI generated by the embodiments of the present disclosure is more flexible and can change adaptively as the depth of field varies across different regions of the scene.
- as a result, the sampling points of the PMPI are concentrated on the visible surfaces of the scene, which improves sampling efficiency.
- the plane layers of the PMPI are distributed more densely overall, which is equivalent to an ordinary MPI with more layers, but without increasing the number of sampling points.
- denser depth layers allow the final immersive video generated from the PMPI to retain more detail and achieve better quality.
- MPI can be displayed as immersive video after video compression.
- Figure 8 shows the corresponding video processing flow.
- the three-dimensional scene images (such as images captured by a 3D camera) collected by the video acquisition device are preprocessed to obtain MPI, and the MPI is compressed and encoded and then transmitted as a code stream.
- the code stream is decoded and post-processed, and displayed and played in the form of immersive video.
- MPI data and related image data, such as reference viewpoint camera parameters, can be used as the input of the Test Model of Immersive Video (TMIV) in MPEG.
- Taking an MPI with image resolution W×H and S layers, that is, an MPI frame of size W×H×S, as an example, it can be converted into PCS data.
- the PCS data records the relevant parameters of each pixel in the MPI, including:
- N_{i,j}: the number of effective layers of pixel (i,j);
- C_{i,j,k}: the color data, such as the color value, at the k-th effective layer of pixel (i,j);
- D_{i,j,k}: the index of the k-th effective layer of pixel (i,j), D_{i,j,k} ∈ [1,S];
- T_{i,j,k}: the transparency value at the k-th effective layer of pixel (i,j).
- pixel (i,j) is contained in all S layers of the MPI, and a layer in which the transparency value of pixel (i,j) is not 0 is an effective layer of pixel (i,j).
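The per-pixel parameters N, C, D, and T can be derived from a pixel's per-layer values by keeping only the layers whose transparency is not 0, as defined above. A minimal sketch (the data layout and function name are assumptions):

```python
# Hypothetical sketch: derive the PCS per-pixel parameters (N, C, D, T)
# for one pixel from its per-layer color and transparency values.

def effective_layers(colors, alphas):
    """colors/alphas hold the pixel's value on layers 1..S.

    Returns (N, C, D, T): the number of effective layers and, per effective
    layer, the color value, the 1-based layer index, and the transparency.
    """
    C, D, T = [], [], []
    for k, (c, a) in enumerate(zip(colors, alphas), start=1):
        if a != 0:  # a layer with non-zero transparency is effective
            C.append(c)
            D.append(k)
            T.append(a)
    return len(C), C, D, T
```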
- FIG. 9 shows an example of PCS data of MPI encapsulated according to the above parameters.
- the start depth and end depth of multiple MPI frames within a set time are the same, and the distribution rules of multiple layers in MPI are known.
- the depth of each layer in the MPI can be calculated according to the starting depth and the ending depth.
- the start depth and end depth of the MPI frame can be recorded in the frame parameters of the MPI frame and do not need to be written into the PCS data of a single MPI frame, so the PCS data of a single MPI does not need to additionally record the depth information of the pixels' effective layers.
- PMPI can also be compressed and encoded as a video frame.
- PMPI can be directly generated according to the image of the three-dimensional scene, or it can be generated on the basis of ordinary MPI.
- Before encoding PMPI it is also necessary to convert the original storage data of PMPI into PCS data.
- PMPI includes multiple sMPIs.
- the depth of the layer closest to the reference viewpoint is the starting depth of the sMPI
- the depth of the layer farthest from the reference viewpoint is the termination depth of the sMPI
- the depths of the other layers lie between the starting depth and the termination depth of the sMPI, and the multiple layers may be distributed according to a set rule, such as equidistant spacing. Therefore, once the starting depth and termination depth of an sMPI are known, the depth of each layer of the sMPI can be calculated.
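The layer-depth calculation described above can be sketched as follows. The two distribution rules shown (equal steps in depth, equal steps in disparity) are common conventions assumed for illustration; the disclosure does not fix a specific formula.

```python
# Hypothetical sketch: recover per-layer depths of an sMPI from its start
# depth, end depth, layer count S, and a set distribution rule.

def layer_depths(start, end, S, rule="equidistant"):
    if S == 1:
        return [start]
    if rule == "equidistant":  # equal steps in depth
        step = (end - start) / (S - 1)
        return [start + k * step for k in range(S)]
    if rule == "equidistant_disparity":  # equal steps in 1/depth
        d0, d1 = 1.0 / start, 1.0 / end
        return [1.0 / (d0 + k * (d1 - d0) / (S - 1)) for k in range(S)]
    raise ValueError(f"unknown distribution rule: {rule}")
```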
- the termination depths of the different sMPIs in a PMPI are set to be the same.
- the starting depth of each sMPI in a PMPI is related to the depth information of the scene area it represents, and is not preset and fixed. Therefore, when converting the original storage data of the PMPI of the embodiments of the present disclosure into PCS data, the starting-depth information needs to be provided so that the decoding end can accurately calculate the depths of the pixels' effective layers.
- an embodiment of the present disclosure provides a data processing method for a multi-plane image, as shown in FIG. 10 , including:
- Step 410: acquire the original storage data of a block multi-plane image PMPI, where the PMPI includes a plurality of sub-multi-plane images sMPI respectively representing a plurality of scene areas into which a three-dimensional scene is divided;
- Step 420: convert the original storage data of the PMPI into encapsulated-and-compressed-storage PCS data, where the PCS data is used to determine the depths of the effective layers of the pixels in the PMPI and the color and transparency of the pixels on the effective layers.
- the PMPI is generated using the generation method described in any embodiment of the present disclosure; each pixel in the PMPI is contained in one sMPI, and each of the multiple layers of that sMPI records the color value and transparency value of the pixel.
- whereas the pixels in an ordinary MPI are contained in all layers of the MPI, the PMPI is divided into blocks, so each pixel in the PMPI is contained in all layers of one sMPI.
- the sMPI containing a pixel is called the sMPI where the pixel is located.
- the color value and transparency value of a pixel are recorded in all layers of the sMPI where the pixel is located, but only some of these layers may be effective layers for the pixel.
- the effective layer of a pixel in MPI may be a layer in MPI whose transparency of the pixel is greater than a set threshold (eg, 0).
- the effective layer of the pixel in the PMPI in the embodiment of the present disclosure may follow the provisions in the above standards.
- the effective layer of a pixel in the PMPI refers to a layer, in the sMPI where the pixel is located, in which the transparency of the pixel is greater than a set threshold (such as 0).
- a pixel may have one or more effective layers, depending on the actual scene.
- the PCS data includes frame data and frame parameters of a PMPI frame.
- in one example, the frame data of a PMPI frame in the PCS data includes:
- the starting depth of each sMPI in the PMPI frame; and the following data of each pixel in the PMPI frame: the color data and transparency data of the pixel on each effective layer, and the layer index of the effective layer in the sMPI where the pixel is located.
- in another example, the frame data of a PMPI frame in the PCS data includes:
- the starting depth of each sMPI in the PMPI frame; and the following data of each pixel in the PMPI frame: the index of the sMPI where the pixel is located, the color data and transparency data of the pixel on each effective layer, and the layer index of the effective layer in the sMPI where the pixel is located.
- in a further example, the frame data of a PMPI frame in the PCS data includes the following data of each pixel in the PMPI frame: the starting depth of the sMPI where the pixel is located; and the color data and transparency data of the pixel on each effective layer, together with the layer index of the effective layer in the sMPI where the pixel is located.
- a parameter may also be added to the frame data of a PMPI frame in the PCS data, namely the number of effective layers of each pixel in the PMPI frame; adding this parameter helps improve the efficiency of data encoding and parsing.
- the PCS data also includes frame parameters of the PMPI frame, and the frame parameters of the PMPI frame in the PCS data include one or any combination of the following parameters:
- the resolution of the PMPI frame
- the number of sMPIs in the PMPI frame
- the number of layers uniformly set for the sMPIs in the PMPI frame
- the distribution rule of layers uniformly set for the sMPIs in the PMPI frame
- the termination depth uniformly set for the sMPIs in the PMPI frame
- the values of M and N when an M×N grid is used to divide the 3D scene
- the frame parameters of the PMPI frame in this example can be applied to the embodiments shown in FIG. 11 , FIG. 12 and FIG. 13 , and will not be repeated below.
- a first PCS data format applicable to the PMPI of the embodiments of the present disclosure is proposed.
- assume the image resolution of the PMPI is W×H, M×N grid division is adopted, so the number of sMPIs included in the PMPI is M×N, and the number of layers of each sMPI is S.
- the PCS data format of the PMPI is as shown in Figure 11, and the frame data of a PMPI frame includes:
- DP_{x,y}: the starting depth of the sMPI representing the scene area of grid cell (x,y), x ∈ [1,M], y ∈ [1,N];
- N_{i,j}: the number of effective layers of pixel (i,j), i ∈ [1,H], j ∈ [1,W];
- C_{i,j,k}: the color data, such as the color value, at the k-th effective layer of pixel (i,j);
- D_{i,j,k}: the index of the k-th effective layer of pixel (i,j), D_{i,j,k} ∈ [1,S];
- T_{i,j,k}: the transparency data, such as the transparency value, at the k-th effective layer of pixel (i,j).
- because the starting depth of each sMPI in the PMPI frame is written into the frame data of the PMPI frame, the sMPI where pixel (i,j) is located can be determined from i, j, and the division rule; combined with the termination depth, number of layers, and layer distribution rule of the sMPIs in the frame parameters, the depth of each layer of the sMPI where pixel (i,j) is located can be calculated, and the depth of each effective layer of pixel (i,j) can then be determined from the indexes of all effective layers of pixel (i,j), for use in subsequent encoding processing.
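Putting the pieces together for the first PCS format, a decoder could recover the depth of an effective layer of pixel (i, j) as sketched below. This is a hypothetical illustration assuming zero-based pixel coordinates, 1-based layer indexes as in D_{i,j,k}, and an equidistant layer distribution; the names are not from the disclosure.

```python
# Hypothetical sketch: depth of an effective layer of pixel (i, j) under the
# first PCS format, where DP[y][x] holds the start depth of the sMPI of
# grid cell (x, y), and the termination depth, layer count S, and the
# distribution rule come from the frame parameters.

def effective_layer_depth(i, j, layer_idx, DP, W, H, M, N, S, end_depth):
    x = j * M // W    # grid column of the cell containing pixel (i, j)
    y = i * N // H    # grid row of the cell
    start = DP[y][x]  # per-sMPI start depth from the frame data
    step = (end_depth - start) / (S - 1)    # equidistant layers (assumed)
    return start + (layer_idx - 1) * step   # layer_idx is 1-based, in [1, S]
```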
- a second PCS data format applicable to the PMPI of the embodiments of the present disclosure is proposed.
- assume the image resolution of the PMPI is W×H, the number of divided sMPIs is M, and the number of layers of each sMPI is S.
- the PCS data format of the PMPI is as shown in Figure 12, and the frame data of a PMPI frame includes:
- DP_m: the starting depth of the m-th sMPI in the PMPI frame, m ∈ [1,M];
- N_{i,j}: the number of effective layers of pixel (i,j), i ∈ [1,H], j ∈ [1,W];
- I_{i,j}: the index of the sMPI where pixel (i,j) is located;
- C_{i,j,k}: the color data, such as the color value, at the k-th effective layer of pixel (i,j);
- D_{i,j,k}: the index of the k-th effective layer of pixel (i,j), D_{i,j,k} ∈ [1,S];
- T_{i,j,k}: the transparency data, such as the transparency value, at the k-th effective layer of pixel (i,j).
- compared with the previous embodiment, in addition to the starting depth of each sMPI in the PMPI frame, the index of the sMPI where pixel (i,j) is located is also written, so this format can be applied not only when the three-dimensional scene is divided with a grid but also when it is divided without a grid.
- a third PCS data format applicable to the PMPI of the embodiments of the present disclosure is proposed.
- assume the image resolution of the PMPI is W×H, the number of sMPIs included in the PMPI is M, and the number of layers of each sMPI is S.
- the PCS data format of the PMPI is as shown in Figure 13, and the frame data of a PMPI frame includes the following parameters of each pixel in the PMPI frame:
- N_{i,j}: the number of effective layers of pixel (i,j), i ∈ [1,H], j ∈ [1,W];
- E_{i,j}: the starting depth of the sMPI where pixel (i,j) is located;
- C_{i,j,k}: the color data, such as the color value, at the k-th effective layer of pixel (i,j);
- D_{i,j,k}: the index of the k-th effective layer of pixel (i,j), D_{i,j,k} ∈ [1,S];
- T_{i,j,k}: the transparency value at the k-th effective layer of pixel (i,j).
- in this format, the starting depths of the multiple sMPIs are expressed as the starting depth of the sMPI where each pixel in the PMPI is located; that is, the starting depth of the sMPI where a pixel is located is written directly into the frame data of the PMPI frame, which makes it convenient to determine the depths of the pixels' effective layers but may reduce coding efficiency.
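The coding-efficiency remark above comes down to how many start-depth values the frame data carries. A purely illustrative helper makes the contrast concrete (per-grid-cell storage in the first format versus per-pixel storage in this third format):

```python
# Hypothetical back-of-envelope count of start-depth entries in the frame
# data: one per grid cell in the first format, one per pixel in the third.

def start_depth_entries(W, H, M, N, per_pixel):
    return W * H if per_pixel else M * N

# e.g., a 1920 x 1080 frame with a 4 x 4 grid stores 16 entries in the
# first format but 2,073,600 entries in the third, in exchange for a
# direct per-pixel start-depth lookup.
```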
- the PCS data of the PMPI is used to determine the depths of the effective layers of the pixels in the PMPI and the color and transparency of the pixels on the effective layers; other data formats can also be used, and the present disclosure places no limit on this.
- the above PCS data formats applicable to the PMPI add the starting-depth information of the sMPIs to the PCS data, so that the decoder can calculate the depths of the pixels' effective layers from the starting depths of the sMPIs and thereby accurately recover the PMPI image.
- FIG. 14 is a structural diagram of an MPI encoding device that can be used in an embodiment of the present disclosure.
- the input data of the MPI encoding device 10 is the PCS data of a source MPI (such as a PMPI); the PCS data includes, but is not limited to, view parameters (such as reference viewpoint camera parameters), the data of the texture attribute component, and the data of the transparency attribute component.
- the MPI encoding device 10 includes:
- the MPI mask generation (Create mask from MPI) unit 101 is configured to generate an MPI mask according to input data.
- the pixels (also referred to as sampling points) in the MPI layers may be screened according to a transparency threshold to obtain a mask of each layer. This distinguishes, on each layer, positions with high transparency from positions with low transparency, and masks positions with high transparency to reduce the amount of data.
- the MPI mask generation unit 101 performs the above operation on all MPI frames within an intra-period. Assume the MPI frame size is W×H×S and an intra-period contains M frames; this yields M masks of size W×H×S.
- the MPI mask aggregation (Aggregate MPI masks) unit 103 is configured to take the union of the masks located on the same layer among the M W×H×S masks to obtain a single W×H×S mask.
- Effective pixel clustering (Cluster Active pixels) unit 105 is configured to cluster the regions (effective information regions) whose transparency is greater than a threshold value in the mask of each layer into a series of clusters (cluster);
- the cluster segmentation (Split Clusters) unit 107 is configured to split the clusters obtained by the effective pixel clustering unit 105 to obtain the segmented clusters;
- the patch packing (Pack patches) unit 109 is configured to recombine the texture map and transparency map corresponding to each patch (e.g., a rectangular area containing clusters) into a picture, which is encoded as atlas data for transmission.
- the video data generation (Generate video data) unit 111 is configured to generate and transmit video data according to the atlas data output by the patch packing unit 109; the video data includes texture attribute video data (raw), transparency attribute video data (raw), and so on.
- the parameter encoding unit 113 is configured to encode according to the source MPI data to obtain encoded image parameters, and the encoded image parameters may include an image parameter list (View parameters list), a parameter set (Parameter set) and the like.
- during encoding, the sampling points in the MPI are first screened according to a transparency threshold to obtain a mask of each plane layer.
- assume the size of the MPI is W×H×S and the number of frames contained in a set period of time (an intra-period) is M; the above operation is performed on all MPI frames in that period to obtain M masks of size W×H×S.
- the masks on the same plane layer are combined to obtain a single W×H×S mask.
- the regions in the mask of each layer whose transparency is greater than the threshold (effective information regions) are clustered into a series of clusters; the clusters are then processed through steps such as merging and splitting to obtain small patches.
- the texture map (i.e., color map) and transparency map corresponding to each patch are recombined into a picture and encoded as atlas data for transmission.
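The mask generation and aggregation steps above can be sketched as follows. This is a hypothetical illustration: the nested-list layout indexed [layer][row][column] and the strict threshold comparison are assumptions.

```python
# Hypothetical sketch of the two mask steps: threshold one frame's
# transparency volume into a per-layer mask, then union the masks of all
# M frames of an intra-period layer by layer.

def frame_mask(alpha, threshold=0.0):
    """Mask of one W x H x S frame (indexed [layer][row][col]):
    True where the transparency exceeds the threshold."""
    return [[[a > threshold for a in row] for row in layer] for layer in alpha]

def aggregate_masks(masks):
    """Per-layer union of the masks of all frames in an intra-period."""
    out = masks[0]
    for m in masks[1:]:
        out = [[[p or q for p, q in zip(r1, r2)]
                for r1, r2 in zip(l1, l2)]
               for l1, l2 in zip(out, m)]
    return out
```

The aggregated mask then feeds the clustering, splitting, and patch-packing steps described above.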
- An embodiment of the present disclosure provides a multi-plane image coding method, which can be used for PMPI coding. As shown in FIG. 15, the coding method includes:
- Step 510: receive the PCS data of a PMPI, where the PMPI includes a plurality of sMPIs respectively representing a plurality of scene areas into which a three-dimensional scene is divided, and the PCS data includes image parameters and the data of a texture attribute part and a transparency attribute part;
- Step 520: encode the PMPI based on the PCS data to obtain encoded image parameters and atlas data.
- the PCS data includes the starting-depth information of the sMPIs, which may be carried in the image parameters and/or in the data of the texture attribute part and the transparency attribute part; during encoding, the starting-depth information of the sMPIs is written into the code stream and may be encapsulated in the encoded image parameters and/or the atlas data.
- the encoding of the PMPI based on the PCS data includes: separately encoding the multiple sMPIs included in the PMPI, where each sMPI may be encoded in the same manner as an ordinary MPI (that is, an MPI representing the entire 3D scene), and the encoding manner of an ordinary MPI may follow the provisions of the relevant standards.
- the PCS data of the PMPI is converted from the original stored data of the PMPI according to the data processing method described in any embodiment of the present disclosure.
- the image parameters in the PCS data and the encoded image parameters each include at least one of the following data: part or all of the frame parameters of the PMPI frame in the PCS data, and the starting depth of each sMPI in the PMPI frame;
- the data of the texture attribute part and the transparency attribute part in the PCS data includes part or all of the frame data of the PMPI frame in the PCS data, and the data of the texture attribute part includes color data;
- the atlas data includes the data and parameters of the patches determined during encoding; the data includes color data and transparency data, and the parameters include one or any combination of the following: identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and identification information of the PMPI to which the data belongs.
- the starting depth of the sMPI in the PMPI may be written into encoded image parameters and/or atlas data.
- An embodiment of the present disclosure provides a multi-plane image decoding method, as shown in FIG. 16 , including:
- Step 610: receive the encoded code stream of a block multi-plane image PMPI, where the encoded code stream includes the image parameters and atlas data of the PMPI;
- the image parameters and/or atlas data of the PMPI in the encoded code stream include the starting-depth information of the sMPIs.
- Step 620: decode the encoded code stream to obtain the image parameters of the PMPI and the data of a texture attribute part and a transparency attribute part;
- the PMPI includes a plurality of sub-multi-plane images sMPI respectively representing a plurality of scene regions into which the three-dimensional scene is divided.
- the image parameters of the PMPI include one or any combination of the following parameters:
- the resolution of the PMPI frame
- the number of sMPIs in the PMPI frame
- the number of layers uniformly set for the sMPIs in the PMPI frame
- the distribution rule of layers uniformly set for the sMPIs in the PMPI frame
- the termination depth uniformly set for the sMPIs in the PMPI frame
- the values of M and N when an M×N grid is used to divide the 3D scene
- the starting depth of each sMPI in the PMPI frame
- the atlas data includes the data and parameters of the patches determined during encoding; the data includes color data and transparency data, and the parameters include one or any combination of the following: identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and identification information of the PMPI to which the data belongs.
- An embodiment of the present disclosure also provides a code stream, wherein the code stream is generated by encoding a block multi-plane image PMPI and includes the image parameters and atlas data of the PMPI; the PMPI includes a plurality of sub-multi-plane images sMPI respectively representing the multiple scene regions into which the 3D scene is divided.
- the image parameters and/or atlas data of the PMPI in the code stream include the starting-depth information of the sMPIs.
- the image parameters of the PMPI include one or any combination of the following parameters:
- the resolution of the PMPI frame
- the number of sMPIs in the PMPI frame
- the number of layers uniformly set for the sMPIs in the PMPI frame
- the distribution rule of layers uniformly set for the sMPIs in the PMPI frame
- the termination depth uniformly set for the sMPIs in the PMPI frame
- the values of M and N when an M×N grid is used to divide the 3D scene
- the starting depth of each sMPI in the PMPI frame
- the atlas data includes the data and parameters of the patches determined during encoding; the data includes color data and transparency data, and the parameters include one or any combination of the following: identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and identification information of the PMPI to which the data belongs.
- An embodiment of the present disclosure also provides a device for generating a multi-plane image, as shown in FIG. 17, including a processor 5 and a memory 6 storing a computer program operable on the processor 5, wherein, when the processor 5 executes the computer program, the method for generating a multi-plane image according to any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a multi-plane image data processing device, with reference to FIG. 17, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the data processing method of the multi-plane image described in any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a multi-plane image encoding device, with reference to FIG. 17, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the method for encoding multi-plane images described in any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a multi-plane image decoding device, with reference to FIG. 17, including a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the method for decoding multi-plane images described in any embodiment of the present disclosure is implemented.
- An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the method described in any embodiment of the present disclosure is implemented.
- the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit.
- Computer-readable media may include computer-readable storage media that correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, eg, according to a communication protocol. In this manner, a computer-readable medium may generally correspond to a non-transitory tangible computer-readable storage medium or a communication medium such as a signal or carrier wave.
- Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure.
- a computer program product may comprise a computer readable medium.
- such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk or other magnetic storage, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
- any connection could also be termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave, then these are included in the definition of medium.
- disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
- the instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuits.
- the term "processor" may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein.
- the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec.
- the techniques may be fully implemented in one or more circuits or logic elements.
- the technical solutions of the embodiments of the present disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
- Various components, modules, or units are described in the disclosed embodiments to emphasize functional aspects of devices configured to perform the described techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (comprising one or more processors as described above) in combination with suitable software and/or firmware.
Claims (30)
- A method for generating a multi-plane image, comprising: dividing a three-dimensional scene into multiple scene areas; and generating a block multi-plane image PMPI, wherein the PMPI includes multiple sub-multi-plane images sMPI for respectively representing the multiple scene areas, and the starting depth of each sMPI is determined at least according to the depth information of the scene area represented by that sMPI.
- The generation method according to claim 1, wherein determining the starting depth of the sMPI at least according to the depth information of the scene area represented by the sMPI comprises: determining the starting depth of the sMPI according to the minimum depth of a first area, where the first area is the scene area represented by the sMPI, or the first area is the area jointly formed by the scene area represented by the sMPI and the adjacent areas of that scene area.
- The generation method according to claim 2, wherein determining the starting depth of the sMPI according to the minimum depth of the first area comprises: setting the starting depth of the sMPI to the minimum depth of the first area.
- The generation method according to claim 2 or 3, wherein the adjacent areas of the scene area represented by the sMPI include one or more scene areas in the three-dimensional scene located around the scene area represented by the sMPI.
- The generation method according to claim 1, wherein the sMPI includes multiple layers obtained by sampling the represented scene area at different depths, and each of the multiple layers includes a color map and a transparency map.
- The generation method according to claim 5, wherein generating the PMPI comprises: determining the starting depths and termination depths of the multiple sMPIs, where the starting depths of the multiple sMPIs are respectively determined at least according to the depth information of the scene areas they represent, and the termination depths of the multiple sMPIs are set to be the same; and, for each of the multiple sMPIs, determining the depth of each layer of the sMPI according to its starting depth and termination depth, its number of layers, and the distribution rule of the layers, and sampling, at the depth of each layer, the scene area represented by the sMPI to obtain the color map and transparency map of each layer of the sMPI.
- The generation method according to claim 6, wherein the multiple sMPIs correspond one-to-one to the multiple scene areas, and the number of layers and the layer distribution rule of the multiple sMPIs are set to be the same.
- The generation method according to claim 1, wherein dividing the three-dimensional scene into multiple scene areas comprises: dividing the three-dimensional scene into multiple scene areas according to a preset scene division rule, where one or any combination of the following information about the divided scene areas can be determined according to the scene division rule: the number of the scene areas, the shape of the scene areas, the size of the scene areas, and the position of the scene areas; wherein the sizes of the multiple scene areas are the same or different, the shapes of the multiple scene areas are one of, or a combination of, regular and irregular shapes, and the regular shapes include one or any combination of triangles, rectangles, pentagons, and hexagons.
- The generation method according to claim 1 or 8, wherein dividing the three-dimensional scene into multiple scene areas according to the preset scene division rule comprises: dividing the three-dimensional scene into M×N scene areas using an M×N grid, where M and N are positive integers and M×N ≥ 2.
- The generation method according to claim 1, wherein the original storage data of the generated PMPI includes frame parameters and frame data of a PMPI frame; the frame parameters of the PMPI frame in the original storage data include one or any combination of the following parameters: the resolution of the PMPI frame; the number of sMPIs in the PMPI frame; the number of layers uniformly set for the sMPIs in the PMPI frame; the distribution rule of layers uniformly set for the sMPIs in the PMPI frame; the termination depth uniformly set for the sMPIs in the PMPI frame; and the values of M and N when an M×N grid is used to divide the three-dimensional scene; and the frame data of each PMPI frame in the original storage data includes: the color map data and transparency map data of each layer in each sMPI included in the PMPI frame.
- A data processing method for a multi-plane image, comprising: acquiring original storage data of a block multi-plane image PMPI, the PMPI including multiple sub-multi-plane images sMPI for respectively representing multiple scene areas into which a three-dimensional scene is divided; and converting the original storage data of the PMPI into encapsulated-and-compressed-storage PCS data, the PCS data being used to determine the depths of the effective layers of the pixels in the PMPI and the color and transparency of the pixels on the effective layers.
- The data processing method according to claim 11, wherein the PMPI is generated using the generation method according to any one of claims 1 to 10.
- The data processing method according to claim 11, wherein the PCS data includes frame data of a PMPI frame, and the frame data of one PMPI frame in the PCS data includes: the starting depth of each sMPI in the PMPI frame, and the following data of each pixel in the PMPI frame: the color data and transparency data of the pixel on each effective layer and the layer index of the effective layer in the sMPI where the pixel is located; or the starting depth of each sMPI in the PMPI frame, and the following data of each pixel in the PMPI frame: the index of the sMPI where the pixel is located, the color data and transparency data of the pixel on each effective layer, and the layer index of the effective layer in the sMPI where the pixel is located; or the following data of each pixel in the PMPI frame: the starting depth of the sMPI where the pixel is located, and the color data and transparency data of the pixel on each effective layer and the layer index of the effective layer in the sMPI where the pixel is located.
- The data processing method according to claim 13, wherein the frame data of one PMPI frame in the PCS data further includes: the number of effective layers of each pixel in the PMPI frame.
- The data processing method according to claim 11, 12, or 13, wherein the PCS data further includes frame parameters of the PMPI frame, and the frame parameters of the PMPI frame in the PCS data include one or any combination of the following parameters: the resolution of the PMPI frame; the number of sMPIs in the PMPI frame; the number of layers uniformly set for the sMPIs in the PMPI frame; the distribution rule of layers uniformly set for the sMPIs in the PMPI frame; the termination depth uniformly set for the sMPIs in the PMPI frame; and the values of M and N when an M×N grid is used to divide the three-dimensional scene.
- A method for encoding a multi-plane image, comprising: receiving encapsulated-and-compressed-storage PCS data of a block multi-plane image PMPI, the PMPI including multiple sub-multi-plane images sMPI for respectively representing multiple scene areas into which a three-dimensional scene is divided, the PCS data including image parameters and the data of a texture attribute part and a transparency attribute part; and encoding the PMPI based on the PCS data to obtain encoded image parameters and atlas data.
- The encoding method according to claim 16, wherein encoding the PMPI based on the PCS data comprises: separately encoding the multiple sMPIs included in the PMPI.
- The encoding method according to claim 16 or 17, wherein the PCS data of the PMPI is converted from the original storage data of the PMPI according to the data processing method of any one of claims 11 to 15.
- The encoding method according to claim 18, wherein: the image parameters in the PCS data and the encoded image parameters each include at least one of the following data: part or all of the frame parameters of the PMPI frame in the PCS data, and the starting depth of each sMPI in the PMPI frame; the data of the texture attribute part and the transparency attribute part in the PCS data includes part or all of the frame data of the PMPI frame in the PCS data, and the data of the texture attribute part includes color data; and the atlas data includes the data and parameters of the patches determined during encoding, the data including color data and transparency data, and the parameters including one or any combination of the following: identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and identification information of the PMPI to which the data belongs.
- A method for decoding a multi-plane image, comprising: decoding an encoded code stream of a block multi-plane image PMPI to obtain the image parameters of the PMPI and the data of a texture attribute part and a transparency attribute part; wherein the PMPI includes multiple sub-multi-plane images sMPI for respectively representing multiple scene areas into which a three-dimensional scene is divided, and the encoded code stream includes the image parameters and atlas data of the PMPI.
- The decoding method according to claim 20, wherein the image parameters of the PMPI include one or any combination of the following parameters: the resolution of the PMPI frame; the number of sMPIs in the PMPI frame; the number of layers uniformly set for the sMPIs in the PMPI frame; the distribution rule of layers uniformly set for the sMPIs in the PMPI frame; the termination depth uniformly set for the sMPIs in the PMPI frame; the values of M and N when an M×N grid is used to divide the three-dimensional scene; and the starting depth of each sMPI in the PMPI frame.
- The decoding method according to claim 20 or 21, wherein the atlas data includes the data and parameters of the patches determined during encoding, the data including color data and transparency data, and the parameters including one or any combination of the following: identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and identification information of the PMPI to which the data belongs.
- A code stream, wherein the code stream is generated by encoding a block multi-plane image PMPI and includes the image parameters and atlas data of the PMPI; the PMPI includes multiple sub-multi-plane images sMPI for respectively representing multiple scene areas into which a three-dimensional scene is divided.
- The code stream according to claim 23, wherein the image parameters of the PMPI include one or any combination of the following parameters: the resolution of the PMPI frame; the number of sMPIs in the PMPI frame; the number of layers uniformly set for the sMPIs in the PMPI frame; the distribution rule of layers uniformly set for the sMPIs in the PMPI frame; the termination depth uniformly set for the sMPIs in the PMPI frame; the values of M and N when an M×N grid is used to divide the three-dimensional scene; and the starting depth of each sMPI in the PMPI frame.
- The code stream according to claim 23 or 24, wherein the atlas data includes the data and parameters of the patches determined during encoding, the data including color data and transparency data, and the parameters including one or any combination of the following: identification information of the layer to which the data belongs, the starting depth of the layer to which the data belongs, identification information of the sMPI to which the data belongs, the starting depth of the sMPI to which the data belongs, and identification information of the PMPI to which the data belongs.
- An apparatus for generating a multi-plane image, comprising a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the method for generating a multi-plane image according to any one of claims 1 to 10 is implemented.
- A data processing apparatus for a multi-plane image, comprising a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the data processing method for a multi-plane image according to any one of claims 11 to 15 is implemented.
- An encoding apparatus for a multi-plane image, comprising a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the method for encoding a multi-plane image according to any one of claims 16 to 19 is implemented.
- A decoding apparatus for a multi-plane image, comprising a processor and a memory storing a computer program, wherein, when the processor executes the computer program, the method for decoding a multi-plane image according to any one of claims 20 to 22 is implemented.
- A non-transitory computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the method according to any one of claims 1 to 22 is implemented.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020247014022A KR20240089119A (ko) | 2021-09-30 | 2021-09-30 | 멀티 플레인 이미지의 생성, 데이터 처리, 인코딩 및 디코딩 방법, 장치 |
PCT/CN2021/122390 WO2023050396A1 (zh) | 2021-09-30 | 2021-09-30 | 多平面图像的生成、数据处理、编码和解码方法、装置 |
CN202180102720.7A CN117999582A (zh) | 2021-09-30 | 2021-09-30 | 多平面图像的生成、数据处理、编码和解码方法、装置 |
US18/609,944 US20240223767A1 (en) | 2021-09-30 | 2024-03-19 | Method and apparatus for encoding and decoding multiplane image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/122390 WO2023050396A1 (zh) | 2021-09-30 | 2021-09-30 | 多平面图像的生成、数据处理、编码和解码方法、装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/609,944 Continuation US20240223767A1 (en) | 2021-09-30 | 2024-03-19 | Method and apparatus for encoding and decoding multiplane image |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023050396A1 true WO2023050396A1 (zh) | 2023-04-06 |
Family
ID=85781177
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/122390 WO2023050396A1 (zh) | 2021-09-30 | 2021-09-30 | 多平面图像的生成、数据处理、编码和解码方法、装置 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240223767A1 (zh) |
KR (1) | KR20240089119A (zh) |
CN (1) | CN117999582A (zh) |
WO (1) | WO2023050396A1 (zh) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150156415A1 (en) * | 2011-12-30 | 2015-06-04 | Google Inc. | Multiplane Panoramas of Long Scenes |
US20200226816A1 (en) * | 2019-01-14 | 2020-07-16 | Fyusion, Inc. | Free-viewpoint photorealistic view synthesis from casually captured video |
CN112233165A (zh) * | 2020-10-15 | 2021-01-15 | 大连理工大学 | 一种基于多平面图像学习视角合成的基线扩展实现方法 |
US20210250571A1 (en) * | 2020-02-12 | 2021-08-12 | At&T Intellectual Property I, L.P. | Apparatus and method for providing content with multiplane image transcoding |
-
2021
- 2021-09-30 KR KR1020247014022A patent/KR20240089119A/ko unknown
- 2021-09-30 WO PCT/CN2021/122390 patent/WO2023050396A1/zh active Application Filing
- 2021-09-30 CN CN202180102720.7A patent/CN117999582A/zh active Pending
-
2024
- 2024-03-19 US US18/609,944 patent/US20240223767A1/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150156415A1 (en) * | 2011-12-30 | 2015-06-04 | Google Inc. | Multiplane Panoramas of Long Scenes |
US20200226816A1 (en) * | 2019-01-14 | 2020-07-16 | Fyusion, Inc. | Free-viewpoint photorealistic view synthesis from casually captured video |
US20210250571A1 (en) * | 2020-02-12 | 2021-08-12 | At&T Intellectual Property I, L.P. | Apparatus and method for providing content with multiplane image transcoding |
CN112233165A (zh) * | 2020-10-15 | 2021-01-15 | 大连理工大学 | 一种基于多平面图像学习视角合成的基线扩展实现方法 |
Non-Patent Citations (2)
Title |
---|
CHU JUN, WU TONG, WANG LU : "Multi-Plane Detection Algorithm of Point Clouds Based on Volume Density Change Rate", JOURNAL OF COMPUTER APPLICATIONS, JISUANJI YINGYONG, CN, vol. 33, no. 5, 1 May 2013 (2013-05-01), CN , pages 1411 - 1415, XP093053259, ISSN: 1001-9081, DOI: 10.3724/SP.J.1087.2013.01411 * |
LUVIZON DIOGO C.; CARVALHO GUSTAVO SUTTER P.; DOS SANTOS ANDREZA A.; CONCEICAO JHONATAS S.; FLORES-CAMPANA JOSE L.; DECKER LUIS G.: "Adaptive Multiplane Image Generation from a Single Internet Picture", 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 3 January 2021 (2021-01-03), pages 2555 - 2564, XP033926549, DOI: 10.1109/WACV48630.2021.00260 * |
Also Published As
Publication number | Publication date |
---|---|
KR20240089119A (ko) | 2024-06-20 |
US20240223767A1 (en) | 2024-07-04 |
CN117999582A (zh) | 2024-05-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6939883B2 (ja) | 自由視点映像ストリーミング用の復号器を中心とするuvコーデック | |
US20220292730A1 (en) | Method and apparatus for haar-based point cloud coding | |
CN112017228A (zh) | 一种对物体三维重建的方法及相关设备 | |
CN113852829A (zh) | 点云媒体文件的封装与解封装方法、装置及存储介质 | |
JP2022519462A (ja) | ホモグラフィ変換を使用した点群符号化 | |
KR20220011180A (ko) | 체적 비디오 인코딩 및 디코딩을 위한 방법, 장치 및 컴퓨터 프로그램 | |
US20220180567A1 (en) | Method and apparatus for point cloud coding | |
WO2023272510A1 (zh) | 多平面图像的生成、数据处理、编码和解码方法、装置 | |
US11196977B2 (en) | Unified coding of 3D objects and scenes | |
EP4162691A1 (en) | A method, an apparatus and a computer program product for video encoding and video decoding | |
WO2023050396A1 (zh) | 多平面图像的生成、数据处理、编码和解码方法、装置 | |
US20230119830A1 (en) | A method, an apparatus and a computer program product for video encoding and video decoding | |
CN116235497A (zh) | 一种用于用信号通知基于多平面图像的体积视频的深度的方法和装置 | |
US20230306683A1 (en) | Mesh patch sub-division | |
WO2022257143A1 (zh) | 帧内预测、编解码方法及装置、编解码器、设备、介质 | |
US20240177355A1 (en) | Sub-mesh zippering | |
US11727536B2 (en) | Method and apparatus for geometric smoothing | |
EP4373089A1 (en) | Data processing method and apparatus, computer, and readable storage medium | |
US20230040484A1 (en) | Fast patch generation for video based point cloud coding | |
WO2023173237A1 (zh) | 编解码方法、码流、编码器、解码器以及存储介质 | |
WO2023173238A1 (zh) | 编解码方法、码流、编码器、解码器以及存储介质 | |
EP4360053A1 (en) | Learning-based point cloud compression via unfolding of 3d point clouds | |
KR20240001203A (ko) | Tearing transform을 통한 학습 기반 포인트 클라우드 압축 | |
WO2023180841A1 (en) | Mesh patch sub-division | |
Li et al. | Point Cloud Compression: Technologies and Standardization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21958967 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202180102720.7 Country of ref document: CN |
|
ENP | Entry into the national phase |
Ref document number: 20247014022 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2021958967 Country of ref document: EP Effective date: 20240430 |