WO2023174334A1

WO2023174334A1 - Encoding method and apparatus, decoding method and apparatus, and device

Info

Publication number: WO2023174334A1
Application number: PCT/CN2023/081637
Authority: WO
Inventors: 邹文杰; 张伟; 杨付正; 吕卓逸
Original assignee: 维沃移动通信有限公司
Priority date: 2022-03-18
Filing date: 2023-03-15
Publication date: 2023-09-21
Also published as: CN116800970A

Abstract

The present application relates to the technical field of encoding and decoding, and discloses an encoding method and apparatus, a decoding method and apparatus, and a device. The encoding method comprises: an encoding end screens for repeated vertices in a target three-dimensional grid to obtain first grid geometric information and information of the repeated vertices, the first grid geometric information being grid geometric information which does not comprise the repeated vertices; the encoding end separately encodes the first grid geometric information and the information of the repeated vertices, wherein the repeated vertices are vertices except a first vertex in a plurality of vertices having the same position coordinates, and the first vertex is one of the plurality of vertices having the same position coordinates.

Description

Encoding and decoding methods, devices and equipment

Cross-references to related applications

This application claims priority from Chinese Patent Application No. 202210273182.4 filed in China on March 18, 2022, the entire content of which is incorporated herein by reference.

Technical field

This application belongs to the field of coding and decoding technology, and specifically relates to a coding and decoding method, device and equipment.

Background

The three-dimensional mesh (Mesh) has arguably been the most popular representation of three-dimensional models for many years, and it plays an important role in many applications. Because its representation is simple, hardware support for rendering three-dimensional meshes is widely built into the Graphics Processing Unit (GPU) of computers, tablets and smartphones.

For a three-dimensional mesh with a texture map, vertices with repeated geometric positions correspond to different UV coordinates, that is, these vertices appear at the boundary positions of the texture map. When encoding this kind of three-dimensional grid with repeated points, there is a problem of low encoding and decoding efficiency.

Summary of the invention

Embodiments of the present application provide an encoding and decoding method, device and equipment, which can solve the problem of low encoding and decoding efficiency when encoding a three-dimensional grid with repeated points.

The first aspect provides an encoding method, including:

The encoding end screens the repeated vertices in the target three-dimensional grid and obtains the first grid geometric information and the information of the repeated vertices, where the first grid geometric information is the grid geometric information excluding repeated vertices;

The encoding end encodes the first mesh geometric information and the information of the repeated vertices respectively;

Wherein, the repeated vertices are vertices other than the first vertex among the plurality of vertices with the same position coordinates, and the first vertex is one of the plurality of vertices with the same position coordinates.

In a second aspect, an encoding device is provided, including:

A screening module, used to screen repeated vertices in the target three-dimensional grid, and obtain first grid geometric information and information of repeated vertices, where the first grid geometric information is grid geometric information excluding repeated vertices;

An encoding module, configured to encode the first mesh geometric information and the information of the repeated vertices respectively;

The third aspect provides a decoding method, including:

The decoding end decomposes the code stream corresponding to the acquired target three-dimensional grid, and obtains the information of repeated vertices and the first grid geometry information, where the first grid geometry information is grid geometry information excluding repeated vertices;

The decoding end obtains the target three-dimensional mesh based on the information of the repeated vertices and the first mesh geometry information;

In the fourth aspect, a decoding device is provided, including:

A first acquisition module, used to decompose the code stream corresponding to the acquired target three-dimensional grid, and obtain the information of repeated vertices and the first grid geometry information, where the first grid geometry information is grid geometry information that does not include repeated vertices;

A second acquisition module, configured to acquire the target three-dimensional mesh according to the information of the repeated vertices and the first mesh geometry information;

In a fifth aspect, a coding device is provided, including a processor and a memory. The memory stores programs or instructions that can be run on the processor. When the programs or instructions are executed by the processor, the steps of the method described in the first aspect are implemented.

In a sixth aspect, a coding device is provided, including a processor and a communication interface, wherein the processor is used to screen repeated vertices in the target three-dimensional grid and obtain the first grid geometric information and the information of the repeated vertices, where the first mesh geometric information is mesh geometric information that does not include repeated vertices; and to encode the first mesh geometric information and the information of the repeated vertices respectively;

In a seventh aspect, a decoding device is provided, including a processor and a memory. The memory stores programs or instructions that can be run on the processor. When the programs or instructions are executed by the processor, the steps of the method described in the third aspect are implemented.

In an eighth aspect, a decoding device is provided, including a processor and a communication interface, wherein the processor is used to decompose the obtained code stream corresponding to the target three-dimensional grid, and obtain the information of repeated vertices and the first grid geometry information, where the first mesh geometry information is mesh geometry information that does not include repeated vertices; and to obtain the target three-dimensional mesh according to the information of the repeated vertices and the first mesh geometry information;

In a ninth aspect, a communication system is provided, including: an encoding device and a decoding device. The encoding device can be used to perform the steps of the method described in the first aspect, and the decoding device can be used to perform the steps of the method described in the third aspect.

In a tenth aspect, a readable storage medium is provided. Programs or instructions are stored on the readable storage medium. When the programs or instructions are executed by a processor, the steps of the method described in the first aspect are implemented, or the steps of the method described in the third aspect are implemented.

In an eleventh aspect, a chip is provided. The chip includes a processor and a communication interface. The communication interface is coupled to the processor. The processor is used to run programs or instructions to implement the method described in the first aspect, or to implement the method described in the third aspect.

In a twelfth aspect, a computer program/program product is provided, the computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement the steps of the method described in the first aspect or the third aspect.

In the embodiment of the present application, the repeated vertices in the target three-dimensional mesh are screened, and the mesh geometric information without repeated vertices and the information of the repeated vertices are encoded separately, so that the encoding and decoding efficiency of geometric information can be improved when the mesh geometric information is losslessly encoded and decoded.

Description of the drawings

Figure 1 is a schematic flow chart of the encoding method according to the embodiment of the present application;

Figure 2 is a schematic diagram of the fine division process based on grid;

Figure 3 is a schematic diagram of the eight directions of patch arrangement;

Figure 4 is a schematic diagram of the video-based three-dimensional grid geometric information encoding framework;

Figure 5 is a module schematic diagram of the encoding device according to the embodiment of the present application;

Figure 6 is a schematic structural diagram of an encoding device according to an embodiment of the present application;

Figure 7 is a schematic flow chart of the decoding method according to the embodiment of the present application;

Figure 8 is a block diagram of geometric information reconstruction;

Figure 9 is a schematic diagram of the video-based three-dimensional grid geometric information decoding framework;

Figure 10 is a schematic module diagram of a decoding device according to an embodiment of the present application;

Figure 11 is a schematic structural diagram of a communication device according to an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be clearly described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, but not all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art fall within the scope of protection of this application.

The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein, and that the objects distinguished by "first" and "second" are usually of one category, and the number of objects is not limited. For example, the first object can be one or multiple. In addition, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.

It is worth pointing out that the technology described in the embodiments of this application is not limited to Long Term Evolution (LTE)/LTE-Advanced (LTE-A) systems, and can also be used in other wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single-carrier Frequency Division Multiple Access (SC-FDMA) and other systems. The terms "system" and "network" in the embodiments of this application are often used interchangeably, and the described technology can be used not only for the above-mentioned systems and radio technologies, but also for other systems and radio technologies. The following description describes a New Radio (NR) system for example purposes, and NR terminology is used in much of the following description, but these techniques can also be applied to applications other than NR system applications, such as 6th Generation (6G) communication systems.

The prior art related to this application is briefly described below.

In recent years, with the rapid development of multimedia technology, related research results have been rapidly industrialized and have become an indispensable part of people's lives. Three-dimensional models have become a new generation of digital media after audio, images, and videos. Three-dimensional meshes and point clouds are two commonly used three-dimensional model representation methods. Compared with traditional multimedia such as images and videos, 3D mesh models are more interactive and realistic, and they have therefore been increasingly widely used in fields such as commerce, manufacturing, construction, education, medicine, entertainment, art, and the military.

With people's increasing demand for the visual effects of 3D mesh models, and the emergence of many more mature 3D scanning technologies and 3D modeling software, the size and complexity of the 3D mesh model data obtained through 3D scanning equipment or 3D modeling software are also growing rapidly. Therefore, how to efficiently compress 3D mesh data is the key to realizing convenient transmission, storage and processing of 3D mesh data.

A three-dimensional mesh often contains three main types of information: topological information, geometric information and attribute information. Topological information describes the connection relationship between elements such as vertices and patches in the mesh; geometric information is the three-dimensional coordinates of all vertices in the mesh; attribute information records other information attached to the mesh, such as normal vectors, texture coordinates and colors. Although some traditional general-purpose data compression methods can reduce the amount of 3D mesh data to a certain extent, because of the particular nature of 3D mesh data, directly applying these methods often cannot achieve ideal results. Therefore, the compression of three-dimensional mesh data faces new challenges. In three-dimensional mesh data, geometric data often takes up more storage space than topological data, so efficient compression of geometric data is of great significance in reducing the storage space of three-dimensional mesh data. For this reason, the compression of three-dimensional mesh geometric information has become a research focus.

The 3D mesh geometric information compression algorithm can use the 3D geometric information compression algorithm of point clouds. In recent years, there have been two main international standards for point cloud compression, namely Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC).

The main idea of V-PCC is to project the geometric and attribute information of the point cloud into two-dimensional videos and to compress those videos with existing video coding technology, thereby compressing the point cloud. The geometric coding of V-PCC is achieved by projecting the geometric information into a placeholder video and a geometric video, and using a video encoder to encode the two videos respectively.

The process of V-PCC geometric information encoding mainly includes the following. First, three-dimensional patches (3-dimension patch, 3D patch) are generated; a 3D patch is a set of connected vertices in the point cloud that share the same projection plane. The current method of generating 3D patches is to estimate the normal vector of each vertex from nearby points, determine the projection plane of each vertex based on the normal vector of that vertex and the normal vectors of the preset planes, and group connected vertices with the same projection plane into a patch. Then, each 3D patch is projected onto a two-dimensional (2D) plane to form a 2D patch, and the 2D patches are arranged on a two-dimensional image. This process is called patch packing. To arrange patches more closely and improve compression performance, current arrangement methods include priority arrangement, time-domain consistent arrangement, global patch allocation, etc. Next, placeholder maps and geometric maps are generated. A placeholder map is an image that represents the occupancy information of vertices in a two-dimensional image: the value at a position onto which a vertex is projected is 1, and the value at every other position is 0. The placeholder map is generated by arranging the patches in the two-dimensional image according to certain rules. The geometric map stores the distance from each vertex to the projection plane. The depth information of each vertex can be calculated directly from the three-dimensional coordinates of the vertex, the projection plane of the vertex and the placeholder map, thereby generating the geometric map. For vertices with repeated projection positions, the geometric coordinates of the vertices other than the first projected vertex are arranged into a raw patch, which is either put into the geometric map or encoded separately. To improve compression efficiency, an image filling process is performed on the geometric map. 
Image filling methods include "push-pull" background filling algorithm, filling method based on sparse linear model (Sparse Linear Model), harmonic background filling (Harmonic Background Filling) and other methods. After the image is filled, the final geometric map is obtained. The existing video encoder is used to compress the placeholder map and geometric map to obtain the video code stream. Finally, the placeholder video code stream, the geometric video code stream, and the sub-code stream containing patch information are synthesized into the final total code stream.

The encoding and decoding methods, devices and equipment provided by the embodiments of the present application will be described in detail below with reference to the accompanying drawings through some embodiments and their application scenarios.

As shown in Figure 1, this embodiment of the present application provides an encoding method, including:

Step 101: The encoding end screens the repeated vertices in the target three-dimensional grid and obtains the first grid geometric information and the information of the repeated vertices. The first grid geometric information is the grid geometric information excluding repeated vertices;

It should be noted that the target three-dimensional grid mentioned in this application can be understood as the three-dimensional grid corresponding to any video frame. This application involves the screening of repeated vertices in the geometric information of the target three-dimensional grid. The geometric information of the target three-dimensional grid can be understood as the position coordinates of the vertices in the three-dimensional grid, and the position coordinates usually refer to the three-dimensional coordinates.

It should be noted that the repeated vertices are vertices other than the first vertex among the plurality of vertices with the same position coordinates, and the first vertex is one of the plurality of vertices with the same position coordinates. Optionally, the first vertex is the one that appears first in the vertex list among the multiple vertices with the same position coordinates.

Step 102: The encoding end encodes the first mesh geometric information and the information of the repeated vertices respectively;

It should be noted that after obtaining the first mesh geometric information and the information of the repeated vertices, both can be encoded to obtain the corresponding sub-code streams. In the above scheme, the repeated vertices of the target three-dimensional mesh are screened, and the mesh geometric information without repeated vertices and the information of the repeated vertices are encoded separately, which can improve the coding efficiency of geometric information when the mesh geometric information is losslessly encoded and decoded.

It should be noted that the information of the repeated vertex includes:

A11. The index of the first vertex corresponding to the repeated vertex;

A12. The number of repeated vertices.

It should be noted that the three-dimensional coordinates of the vertices of the input grid are traversed to find vertices with repeated positions. The duplicate vertices are filtered out of the vertex list to obtain mesh geometry information that does not contain duplicate vertices. In the process of filtering out duplicate vertices, the index value of the first vertex at each such position in the filtered geometric information and the number of duplicate vertices at that position are recorded; that is, the record states at which index values in the mesh geometry information without duplicate vertices there were originally duplicate vertices, and how many duplicate vertices were filtered out there. The recorded information is the information of the duplicate vertices.
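The screening pass described above can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function name `screen_repeated_vertices` and the dictionary layout of the recorded information are hypothetical, and vertices are assumed to be integer coordinate triples that can be compared exactly.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[int, int, int]


def screen_repeated_vertices(
    vertices: List[Vertex],
) -> Tuple[List[Vertex], Dict[int, int]]:
    """Split a vertex list into duplicate-free geometry and repeated-vertex info.

    Returns the filtered vertex list (first mesh geometry information) and a
    mapping from an index in that filtered list to the number of duplicates
    removed at the same position (items A11 and A12).
    """
    first_index: Dict[Vertex, int] = {}  # position -> index in the filtered list
    filtered: List[Vertex] = []
    repeat_count: Dict[int, int] = {}
    for v in vertices:
        if v in first_index:
            # A vertex at this position was already kept: count a duplicate.
            idx = first_index[v]
            repeat_count[idx] = repeat_count.get(idx, 0) + 1
        else:
            # First vertex at this position: keep it in the filtered list.
            first_index[v] = len(filtered)
            filtered.append(v)
    return filtered, repeat_count


vertices = [(0, 0, 0), (1, 0, 0), (0, 0, 0), (2, 1, 0), (1, 0, 0), (0, 0, 0)]
filtered, repeat_info = screen_repeated_vertices(vertices)
```

Here the position (0, 0, 0) keeps its first occurrence and records two duplicates, while (1, 0, 0) records one. A hash map keeps the pass linear in the number of vertices; the text itself does not prescribe a particular search structure.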

Optionally, the information of repeated vertices can be represented and encoded in different forms, which may include but is not limited to at least one of the following:

A21. The index value of the vertex in the first grid geometry information is used to identify whether there are duplicate vertices at the position corresponding to the vertex and/or the number of duplicate vertices;

It should be noted that in this case, the index value of the vertex in the first mesh geometry information is used to identify the repeating vertex, that is, through the index value corresponding to the vertex, it can be known whether there are other repeating vertices on the vertex. If a vertex has other repeating vertices, the number of repeating vertices can also be determined based on the index value of the vertex.

For example, a binary string of 0s and 1s can be used, with one bit for each vertex index in the geometry information without duplicate vertices. Considering that the number of coincident vertices at a position in the mesh is most likely 2, 0 indicates that there are no duplicate vertices at this position and 1 indicates that there are duplicate vertices at this position. If there is a third or further duplicate vertex at this position, the index value of the vertex in the geometry information without duplicate vertices can be directly encoded and treated as a special case. In this case, encoding methods such as run-length coding and entropy coding can be used to encode the information of repeated vertices.
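The flag-string representation of A21 can be illustrated with a short sketch. The names `build_flags` and `run_length_encode` are hypothetical, and the way the special case is stored (a separate dictionary of indices whose duplicate count exceeds one) is an assumption made for illustration; the text only says that such indices are encoded directly.

```python
from typing import Dict, List, Tuple


def build_flags(
    num_vertices: int, repeat_count: Dict[int, int]
) -> Tuple[List[int], Dict[int, int]]:
    """Return one 0/1 flag per filtered vertex plus the special cases.

    flags[i] is 1 when vertex i has at least one duplicate. Indices with more
    than one duplicate are also returned separately (assumed representation).
    """
    flags = [1 if i in repeat_count else 0 for i in range(num_vertices)]
    special = {i: c for i, c in repeat_count.items() if c > 1}
    return flags, special


def run_length_encode(bits: List[int]) -> List[Tuple[int, int]]:
    """Encode a bit list as (value, run length) pairs."""
    runs: List[Tuple[int, int]] = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((b, 1))  # start a new run
    return runs


flags, special = build_flags(8, {2: 1, 3: 1, 6: 2})
runs = run_length_encode(flags)
```

Because duplicates sit on texture seams, the flags tend to form long runs of zeros, which is why run-length or entropy coding suits this representation.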

A22. The pixel value of the position where there are repeated vertices in the target three-dimensional grid in the placeholder map corresponding to the first grid geometry information is determined based on the number of repeated vertices;

That is to say, in this case, the value of the position where the repeated vertices exist in the placeholder map corresponding to the first mesh geometry information is represented based on the number of repeated vertices.

It should be noted that, when encoding the mesh geometry information without repeated vertices, the placeholder map need not be limited to the binary values 0 and 1 to mark occupancy; instead, the value at the position of a repeated vertex can be generated based on the number of repeated vertices. For example, the number of repeated vertices plus one can be used as the value at a position where there are repeated vertices in the placeholder map. In that case, if the value at a certain position in the placeholder map is greater than 1, there are duplicate vertices corresponding to this position, and subtracting one from the value gives the number of duplicate vertices at this position.
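A small sketch of the count-carrying placeholder map of A22 follows. It assumes that the pixel position of every filtered vertex is already known from projection and packing; the names `build_count_occupancy`, `read_repeat_counts` and `pixel_of_vertex` are hypothetical.

```python
from typing import Dict, List, Tuple

import numpy as np


def build_count_occupancy(
    shape: Tuple[int, int],
    pixel_of_vertex: List[Tuple[int, int]],
    repeat_count: Dict[int, int],
) -> np.ndarray:
    """Write (number of duplicates + 1) at each occupied pixel, 0 elsewhere."""
    occ = np.zeros(shape, dtype=np.uint8)
    for idx, (row, col) in enumerate(pixel_of_vertex):
        occ[row, col] = 1 + repeat_count.get(idx, 0)
    return occ


def read_repeat_counts(occ: np.ndarray) -> Dict[Tuple[int, int], int]:
    """Recover the duplicate count at every pixel whose value exceeds 1."""
    rows, cols = np.nonzero(occ > 1)
    return {(int(r), int(c)): int(occ[r, c]) - 1 for r, c in zip(rows, cols)}


pixel_of_vertex = [(0, 0), (0, 2), (1, 1)]  # (row, col) of vertices 0, 1, 2
occ = build_count_occupancy((2, 3), pixel_of_vertex, {0: 2, 2: 1})
recovered = read_repeat_counts(occ)
```

Any non-zero value still marks an occupied pixel, so a decoder can threshold the map at 1 to obtain ordinary occupancy.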

A23. Use different bits to represent the index of the first vertex corresponding to the repeated vertex and the number of repeated vertices;

It should be noted that in this case, the information of the repeated vertices is directly represented by the index of the corresponding vertex in the mesh geometry information without repeated vertices and the number of repeated vertices. For example, n bits are used to represent the index and m bits are used to represent the number of repeated vertices, so that n+m bits represent the information of one position with repeated vertices. An entropy coding method is then used to encode the resulting binary string of length x*(n+m), where x is the number of positions where repeated vertices exist (the method is not limited to entropy coding here; other coding methods are also possible).
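The fixed-width layout of A23 can be sketched as below. The helper names `pack_repeat_info` and `unpack_repeat_info` are hypothetical, the string of '0'/'1' characters is only a readable stand-in for a real bit buffer, and the entropy coding stage that would follow is omitted.

```python
from typing import Dict


def pack_repeat_info(repeat_count: Dict[int, int], n_bits: int, m_bits: int) -> str:
    """Concatenate an n-bit index and an m-bit count for each repeated position."""
    fields = []
    for index in sorted(repeat_count):
        count = repeat_count[index]
        if index >= 1 << n_bits or count >= 1 << m_bits:
            raise ValueError("value does not fit in the chosen bit width")
        fields.append(format(index, f"0{n_bits}b") + format(count, f"0{m_bits}b"))
    return "".join(fields)


def unpack_repeat_info(bitstring: str, n_bits: int, m_bits: int) -> Dict[int, int]:
    """Inverse of pack_repeat_info."""
    step = n_bits + m_bits
    info = {}
    for start in range(0, len(bitstring), step):
        field = bitstring[start:start + step]
        info[int(field[:n_bits], 2)] = int(field[n_bits:], 2)
    return info


packed = pack_repeat_info({0: 2, 5: 1}, n_bits=4, m_bits=2)
restored = unpack_repeat_info(packed, n_bits=4, m_bits=2)
```

With x = 2 positions, n = 4 and m = 2, the string is x*(n+m) = 12 bits long, matching the length given in the text.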

Optionally, the above-mentioned specific implementation method of encoding the first mesh geometric information is:

Step 1021: The encoding end divides the first grid geometric information into three-dimensional slices;

It should be noted that this step divides the first grid geometry information into patches to obtain multiple three-dimensional patches. The specific implementation of this step is: the encoding end determines the projection plane of each vertex included in the first grid geometry information; the encoding end performs slice division on the vertices included in the first mesh geometry information according to the projection plane; and the encoding end clusters the vertices included in the first mesh geometry information to obtain each divided slice. That is to say, the process of patch division mainly includes: first estimating the normal vector of each vertex and selecting, as the projection plane of the vertex, the candidate projection plane whose normal vector forms the smallest angle with the vertex normal vector; then initially dividing the vertices according to the projection plane, so that connected vertices with the same projection plane form a patch; and finally using a fine division algorithm to optimize the clustering results and obtain the final three-dimensional patch (3D patch).

The specific implementation of the process of obtaining the three-dimensional patch from the first mesh geometric information will be described in detail below.

First, the normal vector of each point is estimated. The tangent plane and its corresponding normal are defined based on each point's m nearest neighbor vertices within a predefined search distance. A KD-Tree is used to partition the data and find the adjacent points near point p_i, and the center of gravity c of this set is used to define the normal. The center of gravity c is calculated as follows:

Formula 1: $c = \frac{1}{m}\sum_{i=1}^{m} p_i$

Use the eigendecomposition method to estimate the vertex normal vector, and the calculation process is shown in Formula 2:

Formula 2:

In the initial partitioning stage, the projection plane of each vertex is initially selected. Given the estimated normal vector of the vertex and the normal vectors of the candidate projection planes, the plane whose normal vector direction is closest to the vertex normal vector direction is selected as the projection plane of the vertex. The calculation process of plane selection is shown in Formula 3:

Formula 3:
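The normal estimation and initial plane selection described above can be sketched as follows. This is an illustration under stated assumptions, not the formulas of the application: the normal is taken as the eigenvector of the neighborhood covariance matrix with the smallest eigenvalue, the six axis-aligned candidate planes in `ORIENTATIONS` are assumed, and the KD-Tree neighbor search is left out (the function receives the neighbors directly). The names `estimate_normal` and `select_projection_plane` are hypothetical.

```python
import numpy as np

# Assumed candidate plane normals: the six axis-aligned directions.
ORIENTATIONS = np.array(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [-1, 0, 0], [0, -1, 0], [0, 0, -1]],
    dtype=float,
)


def estimate_normal(neighbors: np.ndarray) -> np.ndarray:
    """Estimate a unit normal from an (m, 3) array of neighboring points."""
    c = neighbors.mean(axis=0)  # center of gravity of the m neighbors
    centered = neighbors - c
    cov = centered.T @ centered  # 3x3 scatter (covariance) matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    # Direction of least variance; its sign is arbitrary and would need a
    # separate orientation step in practice.
    return eigvecs[:, 0]


def select_projection_plane(normal: np.ndarray) -> int:
    """Index of the candidate plane whose normal has the largest dot product."""
    return int(np.argmax(ORIENTATIONS @ normal))


neighbors = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
normal = estimate_normal(neighbors)
plane = select_projection_plane(normal)
```

For the four coplanar points in the example, the estimated normal is parallel to the z axis, so the selected plane is one of the two z-facing candidates depending on the sign returned by the eigensolver.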

The fine division process can use a grid-based algorithm to reduce the time complexity of the algorithm. The grid-based fine division algorithm flow is shown in Figure 2, which specifically includes:

First set the number of cycles (numIter) to 0 and determine whether the number of cycles is less than the maximum number of cycles (it should be noted that the maximum number of cycles can be set according to usage requirements). If it is less than the maximum number of cycles, perform the following process:

Step 201: Divide the (x, y, z) geometric coordinate space into voxels.

It should be noted that the geometric coordinate space here refers to the geometric coordinate space composed of the first grid geometric information. For example, for a 10-bit mesh using a voxel size of 8, the number of voxels at each coordinate would be 1024/8 = 128, and the total number of voxels in this coordinate space would be 128×128×128.

Step 202: Find filled voxels. Filled voxels refer to voxels that contain at least one point in the grid.

Step 203: Calculate the smoothing score of each filled voxel on each projection plane, recorded as voxScoreSmooth. The voxel smoothing score of the voxel on a certain projection plane is the number of points gathered to the projection plane through the initial segmentation process.

Step 204, use KD-Tree partitioning to find nearest filled voxels, denoted as nnFilledVoxels, that is, the nearest filled voxels of each filled voxel (within the search radius and/or limited to the maximum number of adjacent voxels).

Step 205: Use the voxel smoothing score of the nearest neighbor filled voxel in each projection plane to calculate the smoothing score (scoreSmooth) of each filled voxel. The calculation process is as shown in Formula 4:

Formula 4:

where p is the index of the projection plane and v is the index of the nearest neighbor filling voxel. The scoreSmooth of all points in a voxel is the same.

Step 206: Calculate the normal score using the normal vector of the vertex and the normal vector of the candidate projection plane, recorded as scoreNormal. The calculation process is as shown in Formula 5:

Formula 5: scoreNormal[i][p]=normal[i]·orientation[p];

where p is the index of the projection plane and i is the index of the vertex.

Step 207, use scoreSmooth and scoreNormal to calculate the final score of each voxel on each projection plane. The calculation process is as shown in Formula 6:

Formula 6:

Among them, i is the vertex index, p is the index of the projection plane, and v is the voxel index where vertex i is located.

Step 208: Use the scores in step 207 to cluster the vertices to obtain finely divided patches.

Iterate the above process multiple times until a more accurate patch is obtained.
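Steps 201 to 208 can be sketched as one refinement pass that is then iterated. This is a rough illustration, not the procedure of the application: a brute-force neighbor search stands in for the KD-Tree, each filled voxel is counted as its own neighbor, and the way scoreNormal and scoreSmooth are combined (the weight `lam` and the division by the neighbor count) is an assumption. The function name `refine_once` and all parameter names are hypothetical.

```python
import numpy as np


def refine_once(points, normals, planes, orientations,
                voxel_size=8, radius=1, lam=3.0):
    """One pass of grid-based refinement; returns a new plane index per point."""
    num_planes = len(orientations)
    # Steps 201-202: assign each point to a voxel; only filled voxels get a key.
    keys = [tuple(int(x) for x in p // voxel_size) for p in points]
    # Step 203: voxScoreSmooth counts, per voxel, the points on each plane.
    vox_score = {}
    for key, plane in zip(keys, planes):
        vox_score.setdefault(key, np.zeros(num_planes))[plane] += 1
    # Steps 204-205: sum voxScoreSmooth over neighboring filled voxels.
    score_smooth, nn_size = {}, {}
    for v in vox_score:
        nn = [u for u in vox_score
              if max(abs(a - b) for a, b in zip(u, v)) <= radius]
        nn_size[v] = len(nn)
        score_smooth[v] = sum(vox_score[u] for u in nn)
    # Step 206 (Formula 5): dot product of vertex normal and plane normal.
    score_normal = normals @ orientations.T
    # Steps 207-208: combine the scores (ASSUMED weighting) and pick the best.
    refined = []
    for i, v in enumerate(keys):
        score = score_normal[i] + lam / nn_size[v] * score_smooth[v]
        refined.append(int(np.argmax(score)))
    return refined


orientations = np.array(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [-1, 0, 0], [0, -1, 0], [0, 0, -1]],
    dtype=float,
)
points = np.array([[0, 0, 0], [1, 2, 0], [3, 1, 0], [5, 5, 0], [7, 6, 0]])
normals = np.array([[0, 0, 1]] * 4 + [[0.8, 0, 0.6]], dtype=float)
planes = [2, 2, 2, 2, 0]  # initial segmentation; the last point is an outlier
max_iter = 3
for _ in range(max_iter):
    planes = refine_once(points, normals, planes, orientations)
```

In the example all five points fall into one voxel of size 8, so the outlier that initially chose plane 0 is pulled onto plane 2 by the smoothing term. With a 10-bit coordinate range the same voxel size gives 1024 / 8 = 128 voxels per axis, as stated in Step 201.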

Step 1022: The encoding end performs two-dimensional projection on the divided three-dimensional slice to obtain the two-dimensional slice;

It should be noted that this process projects each 3D patch onto a two-dimensional plane to obtain a two-dimensional patch (2D patch).

It should be noted that patch partitioning converts 3D samples into 2D samples by using a strategy that provides the best compression performance on a given projection plane. The goal of patch division is to decompose the vertices of a frame of 3D model into patches with the smallest number and smooth boundaries, while minimizing the reconstruction error.

Step 1023: The encoding end packages the two-dimensional slices to obtain two-dimensional image information;

It should be noted that this step implements patch packing. The purpose of patch packing is to arrange 2D patches on a two-dimensional image. The basic principle is to arrange the patches on the two-dimensional image either without overlap or with only the pixel-free parts of the patches partially overlapping. Through algorithms such as priority arrangement and time-domain consistent arrangement, the patches are arranged more closely and with time-domain consistency, which improves coding performance.

Assume that the resolution of the 2D image is WxH, and the minimum block size that defines the patch arrangement is T, which specifies the minimum distance between different patches placed on this 2D grid.

First, patches are inserted and placed on the 2D grid according to the non-overlapping principle. Each patch occupies an area consisting of an integer number of TxT blocks. In addition, there is a requirement of at least one TxT block between adjacent patches. When there is not enough space to place the next patch, the height of the image will be doubled and the patch will continue to be placed.
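The placement rule above can be sketched with a simple block-grid packer. The raster scan order, the function name `pack_patches` and the representation of a patch by its bounding-box size are assumptions for illustration; the text fixes only the TxT block granularity, the one-block gap and the height doubling.

```python
import numpy as np


def pack_patches(patch_sizes, width, height, T):
    """Place patches of pixel size (w, h) on a width x height image.

    Returns the top-left pixel position (x, y) of each patch and the final
    image height after any doubling.
    """
    cols, rows = width // T, height // T
    used = np.zeros((rows, cols), dtype=bool)  # occupancy in TxT blocks
    positions = []
    for w, h in patch_sizes:
        bw, bh = -(-w // T), -(-h // T)  # patch size in blocks, rounded up
        if bw > cols:
            raise ValueError("patch is wider than the image")
        placed = False
        while not placed:
            for r in range(rows - bh + 1):
                for c in range(cols - bw + 1):
                    # Check the patch area plus a one-block guard ring.
                    r0, r1 = max(r - 1, 0), min(r + bh + 1, rows)
                    c0, c1 = max(c - 1, 0), min(c + bw + 1, cols)
                    if not used[r0:r1, c0:c1].any():
                        used[r:r + bh, c:c + bw] = True
                        positions.append((c * T, r * T))
                        placed = True
                        break
                if placed:
                    break
            if not placed:
                # Not enough space: double the image height and retry.
                used = np.vstack([used, np.zeros_like(used)])
                rows *= 2
    return positions, rows * T


positions, final_height = pack_patches([(16, 16), (8, 8), (16, 8)], 32, 16, T=8)
```

In this example the third patch does not fit in the 32x16 image once the guard ring is respected, so the height is doubled to 32 and the patch is placed in the new area.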

In order to arrange the patches more closely, the patches can choose a variety of different arrangement directions. For example, eight different arrangement directions can be adopted, as shown in Figure 3, including 0 degrees, 180 degrees, 90 degrees, 270 degrees and mirror images of the first four directions.
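The eight arrangement directions can be generated from a patch occupancy mask as sketched below. The helper name `eight_orientations` is hypothetical, and using a left-right flip for the mirror images is an assumption, since the exact mirror axis is defined by Figure 3.

```python
import numpy as np


def eight_orientations(patch_mask: np.ndarray) -> list:
    """Return the four rotations of a 2D mask and their mirror images."""
    rotations = [np.rot90(patch_mask, k) for k in range(4)]  # 0, 90, 180, 270
    mirrored = [np.fliplr(r) for r in rotations]
    return rotations + mirrored


mask = np.array([[1, 1, 1], [1, 0, 0]], dtype=np.uint8)  # asymmetric patch
variants = eight_orientations(mask)
distinct = {(v.shape, v.tobytes()) for v in variants}
```

A packer can try each variant at a candidate position and keep the first one that fits, which helps fill gaps left by irregular patches.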

To better suit the inter-frame prediction of video encoders, a temporally consistent patch arrangement method is adopted. In a group of frames (GOF), all patches of the first frame are arranged in order from largest to smallest. For the other frames in the GOF, a temporal consistency algorithm is used to adjust the order of the patches.

It should also be noted that, after the two-dimensional image information is obtained, the patch information can be derived from the information recorded while obtaining it. The patch information can then be encoded to obtain the patch information sub-stream.

It should be explained that, in the process of obtaining the two-dimensional image information, the patch division information, the patch projection plane information and the patch packing position information need to be recorded. The patch information thus records the operation of each step in that process; that is, the patch information includes: patch division information, patch projection plane information, and patch packing position information.

Step 1024: The encoding end obtains a placeholder map and a geometric map based on the two-dimensional image information;

It should be noted that the placeholder map is obtained mainly as follows: using the patch arrangement information produced by patch packing, the positions of vertices in the two-dimensional image are set to 1 and the remaining positions are set to 0. The geometric map is obtained mainly as follows: while the 2D patches are obtained through projection, the distance from each vertex to the projection plane is saved; this distance is called the depth. In the low-precision geometric map compression part, the depth value of each vertex in each 2D patch is placed at the position of that vertex in the placeholder map, which yields the geometric map.
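A minimal Python sketch of this step is given below. It assumes that the packed vertices are already available as (u, v, depth) triples in image coordinates; the function name `build_maps` and this input layout are hypothetical.

```python
def build_maps(width, height, placed_vertices):
    """Build the placeholder (occupancy) map and the geometric (depth) map.

    placed_vertices: iterable of (u, v, depth), where (u, v) is the pixel
    position assigned to a vertex by patch packing (assumed input layout).
    """
    occupancy = [[0] * width for _ in range(height)]
    geometry = [[0] * width for _ in range(height)]
    for u, v, depth in placed_vertices:
        occupancy[v][u] = 1      # a vertex is present at this pixel
        geometry[v][u] = depth   # distance from the vertex to the projection plane
    return occupancy, geometry
```

Pixels that hold no vertex stay 0 in both maps, so the placeholder map is needed to distinguish an empty pixel from a vertex with depth 0.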

Step 1025: The encoding end encodes the placeholder image and the geometric image respectively.

The video-based three-dimensional grid geometric information encoding framework of the embodiment of this application is shown in Figure 4. The overall encoding process is:

Specifically, at the encoding end, the three-dimensional grid is first screened for repeated vertices. During screening, the information of the repeated vertices is recorded in a separate form and encoded, for example as a separate image representing the repeated vertices, as the indices of the repeated vertices, or as a string marking the positions of the duplicates. The information of the repeated vertices can be encoded according to its representation form, using methods including but not limited to run-length coding, entropy coding and video coding, or by placing it into the V-PCC geometric map for video coding. The screened mesh geometry information does not contain vertices with repeated geometric coordinates: after the information of the repeated vertices is recorded, the repeated vertices are removed from the mesh geometric information. The geometric information of the three-dimensional mesh without repeated vertices is then divided into patches by projection, and the patches are arranged to generate the patch sequence compression information (patch division information), the placeholder map and the geometric map. Next, the patch sequence compression information, the placeholder map and the geometric map are encoded to obtain the corresponding sub-streams, and the information of the repeated vertices is encoded separately to form a repeated vertex sub-stream. Finally, the multiple sub-streams are multiplexed to obtain the final output code stream.
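The screening step can be illustrated with a minimal Python sketch. The function name `screen_repeated_vertices` and the output layout, a list of (first-vertex index, number of repeated vertices) pairs, are illustrative assumptions; the sketch also does not keep the original position of each repeated vertex in the vertex list, which a complete lossless codec would have to handle.

```python
def screen_repeated_vertices(vertices):
    """Split a vertex list into unique positions and repeated-vertex information.

    Returns (unique, repeated_info):
      unique        - positions with duplicates removed, in first-seen order
      repeated_info - (index into unique, number of repeated vertices) pairs,
                      only for positions that occur more than once
    """
    first_index = {}   # position -> index of its first vertex in `unique`
    unique = []
    counts = []        # counts[i] = repeated vertices sharing unique[i]
    for v in vertices:
        key = tuple(v)
        if key in first_index:
            counts[first_index[key]] += 1
        else:
            first_index[key] = len(unique)
            unique.append(key)
            counts.append(0)
    repeated_info = [(i, n) for i, n in enumerate(counts) if n > 0]
    return unique, repeated_info
```

Here `unique` plays the role of the first mesh geometric information and `repeated_info` the role of the information of the repeated vertices, which are then encoded separately.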

It should be noted that this application provides a new compression algorithm that independently handles repeated position vertices, which is of great significance for realizing efficient compression of three-dimensional mesh geometric information.

For the encoding method provided by the embodiment of the present application, the execution subject may be an encoding device. In the embodiment of the present application, the encoding device performing the encoding method is taken as an example to illustrate the encoding device provided by the embodiment of the present application.

As shown in Figure 5, this embodiment of the present application provides an encoding device 500, which includes:

The screening module 501 is used to screen repeated vertices in the target three-dimensional mesh and obtain the first mesh geometric information and the information of the repeated vertices, where the first mesh geometric information is the mesh geometric information excluding repeated vertices;

Encoding module 502, used to encode the first mesh geometric information and the information of the repeated vertices respectively;

Optionally, the repeated vertex information includes:

the index of the first vertex corresponding to the repeated vertex;

The number of repeated vertices.

Optionally, the information of the repeated vertices is represented by at least one of the following:

The index value of the vertex in the first mesh geometry information is identified as whether there are duplicate vertices at the position corresponding to the vertex and/or the number of duplicate vertices;

The pixel value of the position where there are repeated vertices in the target three-dimensional grid in the placeholder map corresponding to the first grid geometry information is determined based on the number of repeated vertices;

Different bits are respectively used to represent the index of the first vertex corresponding to the repeated vertex and the number of repeated vertices.
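The last representation, in which different bits carry the index of the first vertex and the number of repeated vertices, can be sketched as follows. The 8-bit width of the count field (`COUNT_BITS`) and the function names are assumptions chosen for illustration; the application does not fix a bit layout.

```python
COUNT_BITS = 8  # assumed width of the repeat-count field


def pack_repeat_info(first_index: int, count: int) -> int:
    """Pack the first-vertex index (high bits) and repeat count (low bits) into one integer."""
    if first_index < 0:
        raise ValueError("index must be non-negative")
    if not 0 <= count < (1 << COUNT_BITS):
        raise ValueError("count does not fit in the count field")
    return (first_index << COUNT_BITS) | count


def unpack_repeat_info(word: int):
    """Recover (first_index, count) from a packed integer."""
    return word >> COUNT_BITS, word & ((1 << COUNT_BITS) - 1)
```

With this layout, the index 5 and the count 3 pack into the single value 1283, and unpacking 1283 returns the pair (5, 3).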

Optionally, the encoding module 502 includes:

A dividing unit, used to divide the first grid geometric information into three-dimensional slices;

The first acquisition unit is used to perform two-dimensional projection on the divided three-dimensional slices to obtain the two-dimensional slices;

a second acquisition unit, used to package the two-dimensional slices and acquire two-dimensional image information;

A third acquisition unit, configured to acquire placeholder images and geometric images according to the two-dimensional image information;

The first coding unit is used to code the placeholder map and the geometric map respectively.

Optionally, after the second acquisition unit packages the two-dimensional slices and acquires two-dimensional image information, the encoding module 502 further includes:

The fourth acquisition unit is used to acquire slice information based on the information in the process of acquiring two-dimensional image information;

The second encoding unit is used to encode the slice information and obtain the slice information sub-stream.

This device embodiment corresponds to the above-mentioned encoding method embodiment. Each implementation process and implementation manner of the above-mentioned method embodiment can be applied to this device embodiment, and can achieve the same technical effect.

Embodiments of the present application also provide a coding device, including a processor and a communication interface, wherein the processor is used to screen repeated vertices in the target three-dimensional grid and obtain the first grid geometric information and the information of the repeated vertices, where the first mesh geometric information is mesh geometric information excluding repeated vertices; and to encode the first mesh geometric information and the information of the repeated vertices respectively;

Specifically, this embodiment of the present application also provides an encoding device. As shown in Figure 6, the encoding device 600 includes: a processor 601, a network interface 602, and a memory 603. The network interface 602 is, for example, a common public radio interface (CPRI).

Specifically, the encoding device 600 in the embodiment of the present application also includes instructions or programs stored in the memory 603 and executable on the processor 601. The processor 601 calls the instructions or programs in the memory 603 to execute the methods performed by the modules shown in Figure 5 and achieves the same technical effect. To avoid repetition, details are not described again here.

As shown in Figure 7, this embodiment of the present application also provides a decoding method, including:

Step 701: The decoding end decomposes the obtained code stream corresponding to the target three-dimensional grid and obtains the information of repeated vertices and the first grid geometry information;

It should be noted that the first mesh geometry information is mesh geometry information that does not include repeated vertices;

Step 702: The decoding end obtains the target three-dimensional mesh based on the information of the repeated vertices and the first mesh geometry information;

Optionally, the repeated vertex information includes:

the index of the first vertex corresponding to the repeated vertex;

The number of repeated vertices.

Optionally, the specific implementation of obtaining the first grid geometry information includes:

The decoding end obtains the target sub-code stream according to the obtained code stream corresponding to the target three-dimensional grid. The target sub-code stream includes: a slice information sub-stream, a placeholder map sub-stream and a geometric map sub-stream;

The decoding end obtains placeholder images and geometric images according to the target sub-stream;

The decoding end obtains the first grid geometry information based on the placeholder map and the geometric map.

Optionally, obtaining the first grid geometry information based on the placeholder map and the geometric map includes:

The decoding end obtains two-dimensional image information based on the placeholder map and the geometric map;

The decoding end obtains a two-dimensional slice according to the two-dimensional image information;

The decoding end performs three-dimensional back-projection on the two-dimensional slice according to the slice information corresponding to the slice information sub-stream to obtain the three-dimensional slice;

The decoder acquires the first grid geometry information based on the three-dimensional slice.

It should be noted that the geometric information reconstruction process is a process of reconstructing a three-dimensional geometric model using patch information, placeholder images, geometric images, and repeated vertex information. The specific process is shown in Figure 8, which is mainly divided into four steps:

Step 801, obtain 2D patch;

Obtaining a 2D patch refers to using the patch information to segment the occupancy information and depth information of the 2D patch from the occupancy map and geometric map. The patch information contains the position and size of the bounding box of each 2D patch in the placeholder map and geometric map. The placeholder information and geometric information of the 2D patch can be directly obtained by using the patch information, placeholder map, and geometric map.

Step 802, reconstruct the 3D patch;

Reconstructing a 3D patch refers to using the occupancy information and geometric information in the 2D patch to reconstruct the vertices in the 2D patch into a 3D patch. The occupancy information of the 2D patch contains the position of each vertex relative to the coordinate origin in the local coordinate system of the patch projection plane, and the depth information contains the depth value of each vertex in the normal direction of the projection plane. Therefore, the 2D patch can be reconstructed into a 3D patch in the local coordinate system using the occupancy information and depth information.
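As a minimal sketch of this step, assuming that the occupancy information and depth information of one 2D patch are given as equally sized 2D lists, the local 3D points can be recovered as follows. The function name `reconstruct_local_patch` and the (u, v, depth) ordering of the local coordinates are hypothetical.

```python
def reconstruct_local_patch(occupancy, depth):
    """Rebuild the 3D points of one patch in its local coordinate system.

    occupancy[v][u] is 1 where a vertex is present; depth[v][u] holds the
    vertex depth along the projection-plane normal. Each point is returned
    as (u, v, d): two in-plane coordinates followed by the depth.
    """
    points = []
    for v, row in enumerate(occupancy):
        for u, occupied in enumerate(row):
            if occupied:
                points.append((u, v, depth[v][u]))
    return points
```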

Step 803, reconstruct the geometric model without repeated points;

Reconstructing a geometric model without repeated points refers to using the reconstructed 3D patch to reconstruct the entire three-dimensional geometric model. The patch information contains the conversion relationship of the 3D patch from the local coordinate system to the global coordinate system of the three-dimensional geometric model. Using the coordinate conversion relationship to convert all 3D patches to the global coordinate system, a three-dimensional geometric model without repeated vertices is obtained.
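The text does not specify the form of the coordinate conversion relationship. The Python sketch below assumes, as in axis-aligned projection schemes, that each patch stores the index of the axis normal to its projection plane and a 3D offset of its local origin; both parameters (`normal_axis`, `offset`) and the function name are assumptions made for illustration only.

```python
def local_to_global(points, normal_axis, offset):
    """Convert local patch points (u, v, d) to global (x, y, z) coordinates.

    normal_axis: 0, 1 or 2, the global axis along which depth was measured
                 (assumed axis-aligned projection plane).
    offset:      global position (x0, y0, z0) of the patch's local origin.
    """
    # The two remaining axes carry the in-plane coordinates u and v, in order.
    tangent_axes = [axis for axis in range(3) if axis != normal_axis]
    result = []
    for u, v, d in points:
        p = [0, 0, 0]
        p[tangent_axes[0]] = u
        p[tangent_axes[1]] = v
        p[normal_axis] = d
        result.append(tuple(p[i] + offset[i] for i in range(3)))
    return result
```

Applying such a conversion to every reconstructed 3D patch and collecting the results yields the vertex set of the model without repeated vertices.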

Step 804: Reconstruct the geometric model containing repeated points;

The decoded information of the repeated vertices, such as its correspondence with the vertices of the three-dimensional geometric model without repeated vertices, is used directly: the geometric coordinates at the positions of the repeated vertices are added back to the geometric information that contains no repeated vertices, thereby obtaining a three-dimensional geometric model containing repeated vertices.
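A minimal Python sketch of this supplementing step follows. It assumes that the decoded repeated vertex information is a list of (first-vertex index, number of repeated vertices) pairs and that repeated vertices are appended after the unique ones; the function name, this layout and the output ordering are illustrative assumptions.

```python
def restore_repeated_vertices(unique, repeated_info):
    """Add repeated vertices back to a vertex list that contains none.

    unique:        reconstructed vertex positions without repeated vertices
    repeated_info: (index into unique, number of repeated vertices) pairs
    Repeated vertices are appended after the unique ones (assumed ordering).
    """
    restored = list(unique)
    for first_index, count in repeated_info:
        # Each repeated vertex copies the coordinates of its first vertex.
        restored.extend([unique[first_index]] * count)
    return restored
```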

The video-based three-dimensional grid geometric information decoding framework of the embodiment of this application is shown in Figure 9. The overall decoding process is:

First, the code stream is decomposed into a patch information sub-stream, a placeholder map sub-stream, a geometric map sub-stream and a repeated vertex sub-stream, which are decoded separately. Using the placeholder map and the geometric map, the geometric information of the three-dimensional mesh without repeated vertices can be reconstructed; combined with the decoded repeated vertex information, the original mesh geometric information can be reconstructed.

The above solution decodes the mesh geometry information without repeated vertices and the information of repeated vertices separately, thereby improving the decoding efficiency of the geometric information when performing lossless encoding and decoding of the mesh geometry information.

It should be noted that the embodiment of the present application is a method embodiment of the opposite end corresponding to the embodiment of the above encoding method. The decoding process is the inverse process of encoding. All the above implementation methods on the encoding side are applicable to the embodiment of the decoding end. The same technical effect can also be achieved, which will not be described again here.

As shown in Figure 10, this embodiment of the present application also provides a decoding device 1000, which includes:

The first acquisition module 1001 is used to decompose the acquired code stream corresponding to the target three-dimensional mesh, and acquire the information of repeated vertices and the first mesh geometric information, where the first mesh geometric information is mesh geometric information that does not include repeated vertices;

The second acquisition module 1002 is used to acquire the target three-dimensional mesh according to the information of the repeated vertices and the first mesh geometry information;

Optionally, the repeated vertex information includes:

the index of the first vertex corresponding to the repeated vertex;

The number of repeated vertices.

Optionally, the first acquisition module 1001 decomposes the acquired code stream, and when acquiring the first grid geometry information, includes:

The fifth acquisition unit is used to acquire the target sub-code stream according to the obtained code stream corresponding to the target three-dimensional grid. The target sub-code stream includes: a slice information sub-code stream, a placeholder map sub-code stream and a geometric map sub-code stream;

The sixth acquisition unit is used to acquire placeholder images and geometric images according to the target sub-code stream;

A seventh acquisition unit is used to acquire the first grid geometry information according to the placeholder map and the geometric map.

Optionally, the seventh acquisition unit is used for:

Obtain two-dimensional image information according to the placeholder map and the geometric map;

According to the two-dimensional image information, obtain a two-dimensional slice;

Perform three-dimensional back-projection on the two-dimensional slice according to the slice information corresponding to the slice information sub-stream to obtain the three-dimensional slice;

According to the three-dimensional slice, the first grid geometry information is obtained.

It should be noted that this device embodiment is a device corresponding to the above-mentioned method. All implementation methods in the above-mentioned method embodiment are applicable to this device embodiment and can achieve the same technical effect, which will not be described again here.

Preferably, the embodiment of the present application also provides a decoding device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor. When the program or instruction is executed by the processor, each process of the above-mentioned decoding method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not described again here.

Embodiments of the present application also provide a readable storage medium. Programs or instructions are stored on the readable storage medium. When the programs or instructions are executed by a processor, each process of the above-mentioned decoding method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not described again here.

Among them, the computer-readable storage medium is such as read-only memory (ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk, etc.

Embodiments of the present application also provide a decoding device, including a processor and a communication interface, wherein the processor is used to decompose the obtained code stream corresponding to the target three-dimensional grid, and obtain the information of repeated vertices and the first grid geometric information, where the first mesh geometric information is mesh geometric information excluding repeated vertices; and to obtain the target three-dimensional mesh according to the information of the repeated vertices and the first mesh geometric information;

This decoding device embodiment corresponds to the above-mentioned decoding method embodiment. Each implementation process and implementation manner of the above-mentioned method embodiment can be applied to this decoding device embodiment, and can achieve the same technical effect.

Specifically, the embodiment of the present application also provides a decoding device. The decoding device in the embodiment of the present application also includes instructions or programs stored in the memory and executable on the processor. The processor calls the instructions or programs in the memory to execute the methods performed by the modules shown in Figure 10 and achieves the same technical effect. To avoid repetition, details are not described again here.

Embodiments of the present application also provide a readable storage medium, where programs or instructions are stored on the readable storage medium. When the program or instruction is executed by the processor, each process of the above decoding method embodiment is implemented, and the same technical effect can be achieved. To avoid repetition, the details will not be described here.

Wherein, the processor is the processor in the decoding device described in the above embodiment. The readable storage medium includes computer readable storage media, such as computer read-only memory ROM, random access memory RAM, magnetic disk or optical disk, etc.

Optionally, as shown in Figure 11, this embodiment of the present application also provides a communication device 1100, which includes a processor 1101 and a memory 1102. The memory 1102 stores programs or instructions that can be run on the processor 1101. For example, when the communication device 1100 is a coding device, each step of the above coding method embodiment is implemented when the program or instruction is executed by the processor 1101, and the same technical effect can be achieved. When the communication device 1100 is a decoding device, each step of the above decoding method embodiment is implemented when the program or instruction is executed by the processor 1101, and the same technical effect can be achieved. To avoid repetition, details are not described again here.

An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface. The communication interface is coupled to the processor. The processor is used to run programs or instructions to implement each process of the above encoding method or decoding method embodiments, and the same technical effect can be achieved. To avoid repetition, details are not described again here.

It should be understood that the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.

Embodiments of the present application further provide a computer program/program product. The computer program/program product is stored in a storage medium. The computer program/program product is executed by at least one processor to implement the above encoding method or decoding method. Each process of the embodiment can achieve the same technical effect, so to avoid repetition, it will not be described again here.

Embodiments of the present application also provide a communication system, which at least includes an encoding device and a decoding device. The encoding device can be used to perform the steps of the encoding method described above, and the decoding device can be used to perform the steps of the decoding method described above, with the same technical effects achieved. To avoid repetition, details are not described again here.

It should be noted that, in this document, the terms "comprising", "comprises" or any other variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article or apparatus. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of additional identical elements in a process, method, article or apparatus that includes that element. In addition, it should be pointed out that the scope of the methods and devices in the embodiments of the present application is not limited to performing functions in the order shown or discussed, but may also include performing functions in a substantially simultaneous manner or in reverse order, depending on the functions involved. For example, the methods described may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general hardware platform. Of course, they can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes a number of instructions to cause a terminal (which can be a mobile phone, computer, server, air conditioner, or network equipment, etc.) to execute the methods described in the various embodiments of this application.

The embodiments of the present application have been described above in conjunction with the accompanying drawings. However, the present application is not limited to the above-mentioned specific implementations, which are only illustrative and not restrictive. Inspired by this application, those of ordinary skill in the art can devise many other forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.

Claims

A coding method that includes:

The encoding end screens the repeated vertices in the target three-dimensional grid and obtains the first grid geometric information and the information of the repeated vertices, where the first grid geometric information is the grid geometric information excluding repeated vertices;

The encoding end encodes the first mesh geometric information and the information of the repeated vertices respectively;

Wherein, the repeated vertices are vertices other than the first vertex among the plurality of vertices with the same position coordinates, and the first vertex is one of the plurality of vertices with the same position coordinates.
The method according to claim 1, wherein the information of the repeated vertices includes:

the index of the first vertex corresponding to the repeated vertex;

The number of repeated vertices.
The method according to claim 1, wherein the information of the repeated vertices is represented by at least one of the following:

The index value of the vertex in the first mesh geometry information is identified as whether there are duplicate vertices at the position corresponding to the vertex and/or the number of duplicate vertices;

The pixel value of the position where there are repeated vertices in the target three-dimensional grid in the placeholder map corresponding to the first grid geometry information is determined based on the number of repeated vertices;

Different bits are respectively used to represent the index of the first vertex corresponding to the repeated vertex and the number of repeated vertices.
The method of claim 1, wherein encoding the first mesh geometry information includes:

The encoding end divides the first grid geometric information into three-dimensional slices;

The encoding end performs two-dimensional projection on the divided three-dimensional slice to obtain the two-dimensional slice;

The encoding end packages the two-dimensional slices to obtain two-dimensional image information;

The encoding end obtains placeholder images and geometric images based on the two-dimensional image information;

The encoding end encodes the placeholder image and the geometric image respectively.
A decoding method including:

The decoding end decomposes the code stream corresponding to the acquired target three-dimensional grid, and obtains the information of repeated vertices and the first grid geometry information, where the first grid geometry information is grid geometry information excluding repeated vertices;

The decoding end obtains the target three-dimensional mesh based on the information of the repeated vertices and the first mesh geometry information;

Wherein, the repeated vertices are vertices other than the first vertex among the plurality of vertices with the same position coordinates, and the first vertex is one of the plurality of vertices with the same position coordinates.
The method according to claim 5, wherein the information of the repeated vertices includes:

the index of the first vertex corresponding to the repeated vertex;

The number of repeated vertices.
The method according to claim 5, wherein the information of the repeated vertices is represented by at least one of the following:

The index value of the vertex in the first mesh geometry information is identified as whether there are duplicate vertices at the position corresponding to the vertex and/or the number of duplicate vertices;

The pixel value of the position where there are repeated vertices in the target three-dimensional grid in the placeholder map corresponding to the first grid geometry information is determined based on the number of repeated vertices;

Different bits are respectively used to represent the index of the first vertex corresponding to the repeated vertex and the number of repeated vertices.
The method of claim 5, wherein obtaining the first mesh geometry information includes:

The decoding end obtains the target sub-code stream according to the obtained code stream corresponding to the target three-dimensional grid. The target sub-code stream includes: a slice information sub-stream, a placeholder map sub-stream and a geometric map sub-stream;

The decoding end obtains placeholder images and geometric images according to the target sub-stream;

The decoding end obtains the first grid geometry information based on the placeholder map and the geometric map.
The method according to claim 8, wherein said obtaining the first grid geometry information according to the placeholder map and the geometric map includes:

The decoding end obtains two-dimensional image information based on the placeholder map and the geometric map;

The decoding end obtains a two-dimensional slice according to the two-dimensional image information;

The decoding end performs three-dimensional back-projection on the two-dimensional slice according to the slice information corresponding to the slice information sub-stream to obtain the three-dimensional slice;

The decoder acquires the first grid geometry information based on the three-dimensional slice.
An encoding device comprising:

A screening module, used to screen repeated vertices in the target three-dimensional grid, and obtain first grid geometric information and information of repeated vertices, where the first grid geometric information is grid geometric information excluding repeated vertices;

An encoding module, configured to encode the first mesh geometric information and the information of the repeated vertices respectively;

Wherein, the repeated vertices are vertices other than the first vertex among the plurality of vertices with the same position coordinates, and the first vertex is one of the plurality of vertices with the same position coordinates.
The device according to claim 10, wherein the information of the repeated vertices includes:

the index of the first vertex corresponding to the repeated vertex;

The number of repeated vertices.
The device according to claim 10, wherein the information of the repeated vertices is represented by at least one of the following:

The index value of the vertex in the first mesh geometry information is identified as whether there are duplicate vertices at the position corresponding to the vertex and/or the number of duplicate vertices;

The pixel value of the position where there are repeated vertices in the target three-dimensional grid in the placeholder map corresponding to the first grid geometry information is determined based on the number of repeated vertices;

Different bits are respectively used to represent the index of the first vertex corresponding to the repeated vertex and the number of repeated vertices.
The device according to claim 10, wherein the encoding module includes:

A dividing unit, used to divide the first grid geometric information into three-dimensional slices;

The first acquisition unit is used to perform two-dimensional projection on the divided three-dimensional slices to obtain the two-dimensional slices;

a second acquisition unit, used to package the two-dimensional slices and acquire two-dimensional image information;

A third acquisition unit, configured to acquire placeholder images and geometric images according to the two-dimensional image information;

The first coding unit is used to code the placeholder map and the geometric map respectively.
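To make the relationship between a projected slice, the placeholder map and the geometric map concrete, here is a deliberately simplified sketch. It assumes an orthographic projection along the z axis, integer coordinates that already fit the image, and a single slice; `project_patch` is a hypothetical name, and the slice division and packing steps of the claim are not modelled.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]


def project_patch(
    points: List[Point], width: int, height: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Project points along z into a placeholder map and a geometric map.

    placeholder[y][x] is 1 where a vertex projects to pixel (x, y), else 0.
    geometry[y][x] stores the depth (z) of that vertex, 0 where unoccupied.
    """
    placeholder = [[0] * width for _ in range(height)]
    geometry = [[0] * width for _ in range(height)]
    for x, y, z in points:
        placeholder[y][x] = 1
        geometry[y][x] = z
    return placeholder, geometry


placeholder, geometry = project_patch([(0, 0, 5), (1, 0, 6), (1, 1, 7)], 2, 2)
```

The two resulting images would then be passed to separate encoders, as the first coding unit describes.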
A coding device, including a processor and a memory, wherein the memory stores a program or instructions executable on the processor, and when the program or instructions are executed by the processor, the steps of the encoding method according to any one of claims 1 to 4 are implemented.
A decoding device including:

a first acquisition module, configured to decompose the acquired code stream corresponding to the target three-dimensional grid to obtain information of repeated vertices and first grid geometry information, where the first grid geometry information is grid geometry information that does not include the repeated vertices;

a second acquisition module, configured to acquire the target three-dimensional grid according to the information of the repeated vertices and the first grid geometry information;

wherein the repeated vertices are the vertices other than a first vertex among a plurality of vertices having the same position coordinates, and the first vertex is one of the plurality of vertices having the same position coordinates.
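A minimal sketch of how the second acquisition module could restore the repeated vertices is given below. It assumes the repeated-vertex information has been decoded into `(first vertex index, count)` pairs and that the restored duplicates are simply appended after the unique vertices; both the name `restore_repeated_vertices` and that placement are assumptions, since the claim does not fix the vertex order.

```python
from typing import List, Tuple

Vertex = Tuple[int, int, int]


def restore_repeated_vertices(
    unique: List[Vertex], repeated_info: List[Tuple[int, int]]
) -> List[Vertex]:
    """Re-create repeated vertices from decoded (first index, count) pairs."""
    restored = list(unique)  # start from the geometry without repeats
    for first_index, count in repeated_info:
        # Each repeated vertex shares the position of its first vertex.
        restored.extend([unique[first_index]] * count)
    return restored


decoded_unique = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
decoded_info = [(0, 2), (1, 1)]
restored = restore_repeated_vertices(decoded_unique, decoded_info)
```

If the original vertex order matters, for example because connectivity refers to vertices by index, additional ordering information would have to be signalled; this sketch does not cover that.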
The device according to claim 15, wherein the information of the repeated vertices includes:

the index of the first vertex corresponding to the repeated vertex;

The number of repeated vertices.
The device according to claim 15, wherein the information of the repeated vertices is represented by at least one of the following:

an index value of a vertex in the first grid geometry information is marked to indicate whether there are repeated vertices at the position corresponding to the vertex and/or the number of the repeated vertices;

in the placeholder map corresponding to the first grid geometry information, the pixel value at a position where the target three-dimensional grid has repeated vertices is determined based on the number of repeated vertices;

Different bits are respectively used to represent the index of the first vertex corresponding to the repeated vertex and the number of repeated vertices.
The device according to claim 15, wherein, for decomposing the acquired code stream to obtain the first grid geometry information, the first acquisition module includes:

A fifth acquisition unit, used to acquire a target sub-code stream according to the acquired code stream corresponding to the target three-dimensional grid, where the target sub-code stream includes: a slice information sub-code stream, a placeholder map sub-code stream and a geometric map sub-code stream;

A sixth acquisition unit, used to acquire a placeholder map and a geometric map according to the target sub-code stream;

A seventh acquisition unit is used to acquire the first grid geometry information according to the placeholder map and the geometric map.
The device according to claim 18, wherein the seventh acquisition unit is used for:

Obtain two-dimensional image information according to the placeholder map and the geometric map;

According to the two-dimensional image information, obtain a two-dimensional slice;

Perform three-dimensional back-projection on the two-dimensional slice according to the slice information corresponding to the slice information sub-code stream to obtain a three-dimensional slice;

According to the three-dimensional slice, the first grid geometry information is obtained.
A decoding device, including a processor and a memory, wherein the memory stores a program or instructions executable on the processor, and when the program or instructions are executed by the processor, the steps of the decoding method according to any one of claims 5 to 9 are implemented.
A readable storage medium on which a program or instructions are stored, wherein when the program or instructions are executed by a processor, the steps of the encoding method according to any one of claims 1 to 4, or the steps of the decoding method according to any one of claims 5 to 9, are implemented.