WO2023231872A1 - Encoding method, decoding method, apparatus and device - Google Patents

Encoding method, decoding method, apparatus and device

Info

Publication number
WO2023231872A1
Authority
WO
WIPO (PCT)
Prior art keywords
code stream
information
grid
target
dimensional grid
Prior art date
Application number
PCT/CN2023/096097
Other languages
English (en)
French (fr)
Inventor
邹文杰
张伟
杨付正
吕卓逸
Original Assignee
维沃移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 维沃移动通信有限公司
Publication of WO2023231872A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding

Definitions

  • This application belongs to the field of coding and decoding technology, and specifically relates to an encoding method, a decoding method, a device and equipment.
  • The three-dimensional mesh has been the most popular representation of three-dimensional models for many years, and it plays an important role in many applications. Its expression is simple, so it is widely integrated, with hardware algorithms, into the graphics processing units of computers, tablets and smartphones, specifically for rendering three-dimensional meshes.
  • Texture coordinates, also known as UV coordinates, are information that describes the texture at the vertices of a three-dimensional mesh.
  • the amount of UV coordinate data accounts for a large proportion of the three-dimensional grid. Encoding the UV coordinates, as related technical solutions do, consumes a large amount of code rate, resulting in low efficiency of three-dimensional grid encoding.
  • Embodiments of the present application provide an encoding method, a decoding method, a device and equipment, which can solve the problem of low efficiency of three-dimensional grid encoding in related technical solutions.
  • the first aspect provides an encoding method, including:
  • the encoding end encodes the basic grid corresponding to the target three-dimensional grid according to the first identification information to obtain the first code stream.
  • the basic grid includes reconstructed texture coordinate information corresponding to the target three-dimensional grid.
  • the first identification information is used to indicate whether to encode the reconstructed texture coordinate information;
  • the encoding end obtains the second code stream according to the grid difference information.
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded.
  • the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded;
  • the encoding end obtains a third code stream based on the reconstructed texture map information, which is obtained based on the first code stream and the second code stream;
  • the encoding end generates a target code stream based on the first code stream, the second code stream, and the third code stream.
  • the second aspect provides a decoding method, including:
  • the decoding end decomposes the obtained target code stream to obtain the first code stream, the second code stream and the third code stream.
  • the first code stream is obtained based on the basic grid corresponding to the target three-dimensional grid,
  • the second code stream is obtained based on grid difference information.
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded.
  • the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded,
  • the third code stream is obtained based on the reconstructed texture map information;
  • when the decoding end determines that the first code stream includes reconstructed texture coordinate information, it reconstructs the target three-dimensional grid according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream;
  • when the decoding end determines that the first code stream does not include reconstructed texture coordinate information,
  • the reconstructed texture coordinate information is generated, and the target three-dimensional grid is reconstructed based on the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream,
  • the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  • a third aspect provides an encoding device, applied to the encoding end, including:
  • a first encoding module configured to encode the basic grid corresponding to the target three-dimensional grid according to the first identification information, and obtain the first code stream.
  • the basic grid includes reconstructed texture coordinate information corresponding to the target three-dimensional grid.
  • the first identification information is used to characterize whether to encode the reconstructed texture coordinate information;
  • the first acquisition module is used to acquire the second code stream according to the grid difference information.
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded.
  • the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded;
  • a second acquisition module configured to acquire a third code stream based on the reconstructed texture map information, which is obtained based on the first code stream and the second code stream;
  • a first generation module configured to generate a target code stream according to the first code stream, the second code stream and the third code stream.
  • a fourth aspect provides a decoding device, applied to the decoding end, including:
  • the sixth acquisition module is used to decompose the acquired target code stream to obtain the first code stream, the second code stream and the third code stream.
  • the first code stream is obtained based on the basic grid corresponding to the target three-dimensional grid.
  • the second code stream is obtained based on grid difference information.
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded.
  • the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded, and the third code stream is obtained based on the reconstructed texture map information;
  • a reconstruction module configured to: when the decoding end determines that the first code stream includes reconstructed texture coordinate information, reconstruct the target three-dimensional grid according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream; and/or, when the decoding end determines that the first code stream does not include reconstructed texture coordinate information, generate reconstructed texture coordinate information, and reconstruct the target three-dimensional grid using the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  • a fifth aspect provides an encoding device, including a processor and a memory.
  • the memory stores a program or instructions that can be run on the processor.
  • when the program or instructions are executed by the processor, the steps of the method described in the first aspect are implemented.
  • a sixth aspect provides an encoding device including a processor and a communication interface, wherein the processor is configured to encode the basic grid corresponding to the target three-dimensional grid according to the first identification information to obtain the first code stream,
  • the basic grid includes reconstructed texture coordinate information corresponding to the target three-dimensional grid, and the first identification information is used to indicate whether to encode the reconstructed texture coordinate information; a second code stream is obtained according to the grid difference information,
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded,
  • the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded; a third code stream is obtained according to the reconstructed texture map information, the reconstructed texture map information being obtained according to the first code stream and the second code stream; and a target code stream is generated according to the first code stream, the second code stream and the third code stream.
  • a seventh aspect provides a decoding device, including a processor and a memory.
  • the memory stores a program or instructions that can be run on the processor.
  • when the program or instructions are executed by the processor, the steps of the method described in the second aspect are implemented.
  • an eighth aspect provides a decoding device including a processor and a communication interface, wherein the processor is used to decompose the acquired target code stream to obtain a first code stream, a second code stream and a third code stream. The first code stream is obtained based on the basic grid corresponding to the target three-dimensional grid, and the second code stream is obtained based on grid difference information.
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded, the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded, and the third code stream is obtained based on the reconstructed texture map information;
  • when it is determined that the first code stream includes reconstructed texture coordinate information, the target three-dimensional grid is reconstructed according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream;
  • when it is determined that the first code stream does not include reconstructed texture coordinate information,
  • reconstructed texture coordinate information is generated, and the target three-dimensional grid is reconstructed according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  • a ninth aspect provides a coding and decoding system, including: an encoding device and a decoding device.
  • the encoding device can be used to perform the steps of the encoding method as described in the first aspect.
  • the decoding device can be used to perform the steps of the decoding method as described in the second aspect.
  • In a tenth aspect, a readable storage medium is provided. Programs or instructions are stored on the readable storage medium. When the programs or instructions are executed by a processor, the steps of the method described in the first aspect or the steps of the method described in the second aspect are implemented.
  • In an eleventh aspect, a chip is provided, including a processor and a communication interface.
  • the communication interface is coupled to the processor.
  • the processor is used to run programs or instructions to implement the method described in the first aspect, or to implement the method described in the second aspect.
  • In a twelfth aspect, a computer program/program product is provided; the computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement the steps of the method described in the first aspect or the method described in the second aspect.
  • In the embodiment of the present application, the encoding end encodes the basic grid corresponding to the target three-dimensional grid according to the first identification information,
  • where the basic grid includes the reconstructed texture coordinate information corresponding to the target three-dimensional grid, and obtains the first code stream; it obtains the second code stream according to the grid difference information, obtains the third code stream according to the reconstructed texture map information, and generates a target code stream according to the first code stream, the second code stream and the third code stream. Since the amount of reconstructed texture coordinate data accounts for a large proportion of the three-dimensional grid, in the embodiment of the present application one can choose, according to the first identification information, not to encode the reconstructed texture coordinate information in the basic grid, which can greatly save the code rate and improve the coding efficiency.
  • Figure 1 is a schematic flow chart of the encoding method according to the embodiment of the present application.
  • Figure 2 is a coding framework diagram of a three-dimensional grid in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of the preprocessing module in the embodiment of the present application.
  • Figure 4 is a schematic diagram of the process of merging vertices during mesh simplification in the embodiment of the present application.
  • Figure 5 is a schematic diagram of the midpoint subdivision method according to the embodiment of the present application.
  • Figure 6 is a schematic diagram of the displacement calculation method in the embodiment of the present application.
  • Figure 7 is a schematic diagram of the five operating modes defined in EB.
  • Figure 8 is a schematic diagram of parallelogram prediction of geometric coordinates.
  • Figure 9 is a schematic flow chart of the decoding method according to the embodiment of the present application.
  • Figure 10 is a schematic diagram of the three-dimensional grid decoding framework in the embodiment of the present application.
  • Figure 11 is a schematic module diagram of an encoding device according to an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of an encoding device according to an embodiment of the present application.
  • Figure 13 is a schematic module diagram of a decoding device according to an embodiment of the present application.
  • Figure 14 is a schematic structural diagram of a decoding device according to an embodiment of the present application.
  • Figure 15 is a schematic structural diagram of a communication device according to an embodiment of the present application.
  • The terms first, second, etc. in the description and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. The objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited;
  • for example, the first object can be one or multiple.
  • "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.
  • LTE: Long Term Evolution
  • LTE-A: Long Term Evolution Advanced
  • CDMA: Code Division Multiple Access
  • TDMA: Time Division Multiple Access
  • FDMA: Frequency Division Multiple Access
  • OFDMA: Orthogonal Frequency Division Multiple Access
  • SC-FDMA: Single-carrier Frequency Division Multiple Access
  • NR: New Radio
  • this embodiment of the present application provides an encoding method, including:
  • Step 101 The encoding end encodes the basic grid corresponding to the target three-dimensional grid according to the first identification information, and obtains the first code stream.
  • the basic grid includes the reconstructed texture coordinate information corresponding to the target three-dimensional grid, and
  • the first identification information is used to indicate whether to encode the reconstructed texture coordinate information.
  • the first identification information is used to determine whether to encode the reconstructed texture coordinate information corresponding to the target three-dimensional grid. For example, when the first identification information is 1, it indicates that the reconstructed texture coordinate information needs to be encoded. When the first identification information is 0, it indicates that there is no need to encode the reconstructed texture coordinate information.
  • the above-mentioned reconstructed texture coordinate information includes reconstructed texture coordinates corresponding to each vertex, that is, UV coordinates.
  • the above-mentioned UV coordinates are used to represent the texture color value of the corresponding vertex.
  • the target three-dimensional grid mentioned in this application can be understood as the three-dimensional grid corresponding to any video frame.
  • the basic grid also includes geometric information and connection relationship information corresponding to the target three-dimensional grid.
  • any grid coding method can be used to encode the geometric information, the connection relationship information and, if the first identification information determines that encoding is required, the reconstructed texture coordinate information in the above-mentioned basic grid; after the sub-streams are merged, the basic grid code stream, namely the above-mentioned first code stream, is obtained.
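  • As a minimal illustration of this flag-conditional encoding (the byte layout, the BaseMesh structure and all helper names below are hypothetical, not taken from the application), the first code stream could be assembled as follows:

```python
from dataclasses import dataclass
import struct

@dataclass
class BaseMesh:
    vertices: list  # geometric information: [(x, y, z), ...]
    faces: list     # connection relationship: [(i, j, k), ...] vertex indices
    uv: list        # reconstructed texture coordinates: [(u, v), ...]

def pack_floats(rows):
    return b"".join(struct.pack("<%df" % len(r), *r) for r in rows)

def encode_base_mesh(mesh: BaseMesh, encode_uv_flag: bool) -> bytes:
    # first identification information: 1 = encode UV coordinates, 0 = skip them
    header = struct.pack("<B", 1 if encode_uv_flag else 0)
    geometry = pack_floats(mesh.vertices)
    connectivity = b"".join(struct.pack("<3I", *f) for f in mesh.faces)
    uv = pack_floats(mesh.uv) if encode_uv_flag else b""
    return header + geometry + connectivity + uv  # merged: the first code stream
```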
  • Step 102 The encoding end obtains the second code stream according to the grid difference information.
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded.
  • the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded.
  • the mesh difference information is used to characterize the difference information between the refined base mesh and the three-dimensional mesh to be encoded.
  • Step 103 The encoding end obtains a third code stream based on the reconstructed texture map information, which is obtained based on the first code stream and the second code stream.
  • the reconstructed texture map information is encoded to obtain the third code stream.
  • the above reconstructed texture map information is encoded by a video encoder.
  • Step 104 The encoding end generates a target code stream based on the first code stream, the second code stream and the third code stream.
  • the first code stream, the second code stream and the third code stream are mixed to generate a target code stream.
  • the encoding method in the embodiment of the present application is suitable for lossy mode encoding.
  • the encoding end encodes the basic grid corresponding to the target three-dimensional grid according to the first identification information.
  • the basic grid includes the reconstructed texture coordinate information corresponding to the target three-dimensional grid, and obtains the first code stream.
  • obtains the second code stream according to the grid difference information, obtains the third code stream according to the reconstructed texture map information, and generates the target code stream according to the first code stream, the second code stream and the third code stream. Since the amount of reconstructed texture coordinate data accounts for a large proportion of the three-dimensional grid, in the embodiment of the present application one can choose, according to the first identification information, not to encode the reconstructed texture coordinate information in the basic grid, which can greatly save the code rate and improve the coding efficiency.
  • before the encoding end encodes the basic grid corresponding to the target three-dimensional grid according to the first identification information to obtain the first code stream, the method also includes:
  • the three-dimensional grid to be encoded is simplified to obtain the target three-dimensional grid
  • or, the three-dimensional grid to be encoded is determined as the target three-dimensional grid.
  • the three-dimensional grid to be coded is preprocessed.
  • the preprocessing can be a simplification process.
  • the geometry and connection relationships can be simplified, that is, the number of mesh vertices and edges is reduced while maintaining the grid structure as much as possible, thereby reducing the data volume of the three-dimensional mesh.
  • the encoding end generates a target code stream based on the first code stream, the second code stream and the third code stream, including:
  • a target code stream is generated according to the encoded first identification information, the first code stream, the second code stream and the third code stream.
  • the above-mentioned first identification information can be carried in the target code stream, so that the decoding end can determine whether it is necessary to generate reconstructed texture coordinate information based on the first identification information.
  • the encoding end encodes the basic grid corresponding to the target three-dimensional grid according to the first identification information, and obtains the first code stream, including:
  • when the first identification information indicates encoding of the reconstructed texture coordinate information corresponding to the target three-dimensional grid, the geometric information, the connection relationship information and the reconstructed texture coordinate information are encoded to obtain the first code stream;
  • otherwise, only the geometric information and the connection relationship information are encoded to obtain the first code stream.
  • the user can set the above-mentioned first identification information according to actual needs, that is, the user can choose whether to encode the above-mentioned reconstructed texture coordinate information.
  • before the encoding end obtains the third code stream based on the reconstructed texture map information, the method also includes:
  • the reconstructed texture map information is generated according to a texture map generation algorithm;
  • the reconstructed texture coordinate information is generated according to a texture coordinate resampling algorithm, based on the geometric information and connection relationship information of the basic grid corresponding to the target three-dimensional grid.
  • the method of generating the reconstructed texture coordinate information is the same as that in related technologies, and will not be described again here.
  • the encoding end obtains the second code stream based on the grid difference information, including:
  • the grid difference information is updated to obtain updated grid difference information
  • the updated grid difference information is encoded to obtain the second code stream.
  • since the encoding end performs lossy compression on the basic grid, in order to improve the accuracy of the grid difference information, it is necessary to update the grid difference information based on the grid reconstructed by decoding the basic grid code stream. This enables the grid difference information to more accurately represent the difference between the basic grid and the original grid (the grid to be encoded).
  • after updating, the grid difference information is transformed, for example by a wavelet transform.
  • the transformed displacement information is then quantized and arranged into the pixel values of an image according to certain rules, such as z-scan order, and the image is video encoded.
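  • A sketch of this quantize-and-arrange step is given below; the Morton (z-scan) packing, image size and integer rounding are illustrative assumptions, not details taken from the application:

```python
import numpy as np

def morton_xy(i: int) -> tuple:
    """De-interleave the bits of i into (x, y): the z-scan position of index i."""
    x = y = 0
    for b in range(16):
        x |= ((i >> (2 * b)) & 1) << b
        y |= ((i >> (2 * b + 1)) & 1) << b
    return x, y

def pack_displacements(coeffs, side: int, qp: float) -> np.ndarray:
    """Quantize 1D transformed displacement coefficients and arrange them
    into a side x side image along the z-scan curve (side must be a power
    of two with side * side >= len(coeffs))."""
    img = np.zeros((side, side), dtype=np.int32)
    for i, c in enumerate(coeffs):
        x, y = morton_xy(i)
        img[y, x] = int(round(c / qp))  # uniform quantization
    return img  # this image is then passed to a video encoder
```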
  • the encoding end generates a target code stream based on the first code stream, the second code stream and the third code stream, including:
  • the target code stream is obtained according to the first code stream, the second code stream, the third code stream and a fourth code stream, where the fourth code stream is obtained based on the patch (fragmentation) information.
  • the preprocessed grid (called the basic grid), the displacement information used to represent the difference between the basic grid and the original grid, and the reconstructed texture map attribute information are encoded respectively.
  • Coding proceeds as follows. 1) In lossy mode, the three-dimensional mesh is preprocessed; for example, the geometry and connection relationships can be simplified, that is, the number of mesh vertices and edges is reduced while maintaining the mesh structure as much as possible, thereby reducing the amount of data in the three-dimensional mesh. 2) For the simplified mesh, a UV coordinate resampling algorithm is used to regenerate the UV coordinates;
  • the simplified geometric information, the connection relationships and the newly generated UV coordinates together are called the basic grid. 3) Any static grid encoding method is used to encode the geometric information, the connection relationships and the newly generated UV coordinates, and the basic grid code stream is obtained after merging the sub-streams; it should be noted that whether the UV coordinates of the basic grid are encoded is determined by the identification flag. 4) In the preprocessing module, the geometric information and UV coordinates of the basic grid are refined by interpolation (refinement), and the displacement vector between each interpolation point and its nearest neighbor point on the original mesh is calculated; in the displacement information coding module, the refinement interpolation parameters and the displacement vectors are encoded to obtain the displacement information code stream.
  • the three-dimensional grid coding framework in the embodiment of this application mainly includes a grid preprocessing module, a basic grid coding module, a video-based displacement information coding module, etc.
  • the three-dimensional grid encoding framework diagram is shown in Figure 2.
  • the input is a three-dimensional grid containing a texture map, that is, the three-dimensional grid to be encoded.
  • the preprocessing module is shown in Figure 3.
  • the fragmented information forms patch information.
  • the simplified mesh is surface parameterized, that is, new UV coordinates (reconstructed texture coordinate information) are generated. During this process, some geometric information will also change.
  • the basic mesh (including geometric information, connection relationships and UV coordinates) is obtained and provided as one output path.
  • the basic mesh geometric information and UV coordinates are refined by interpolation, and the offset vector between each interpolation point and its projection point, along the patch normal vector, on the original mesh is calculated and output as displacement information.
  • the preprocessing module outputs basic mesh and displacement information.
  • the basic grid output by preprocessing is then quantized, and then the geometric information, connection relationships and UV coordinates are encoded respectively. It is worth noting that the encoding of the basic mesh here can be replaced by any three-dimensional mesh encoding method. In this module, UV coordinates can be optionally encoded.
  • when the UV coordinates are not encoded, the decoding end needs to use the same UV coordinate generation method as the encoding end to reconstruct the UV coordinates.
  • the code streams of each part of the basic grid are jointly used as the output of the basic grid encoding module, that is, the basic grid sub-stream.
  • Video encoding is performed on the image to obtain the displacement information sub-stream.
  • the encoded basic grid needs to be decoded and reconstructed to obtain the reconstructed basic grid.
  • the encoded displacement information is decoded and inversely quantized to obtain the decoded and inversely quantized displacement information.
  • the reconstructed base mesh and the decoded inverse quantized displacement information are then used to reconstruct the mesh.
  • a texture map generation algorithm is used to generate a new texture map using the reconstructed mesh and a video encoder is used to encode the newly generated texture map. Finally, the patch information sub-stream, basic grid sub-stream and displacement information sub-stream are mixed to obtain the encoded output code stream.
  • the grid simplification operation is first performed.
  • the focus of mesh simplification lies in the simplified operations and the corresponding error measures.
  • the mesh simplification operation here can be edge-based simplification. As shown in Figure 4, the number of patches and vertices can be reduced by merging two vertices of an edge.
  • the mesh can also be simplified through point-based mesh simplification methods.
  • in the mesh simplification process, it is necessary to define a simplification error measure.
  • the sum of the equation coefficients of all adjacent faces of a vertex can be selected as the error measure of the vertex, and the error measure of the corresponding edge is the sum of the error measures of the two vertices on the edge.
  • the mesh can be divided into one or more local meshes, and the vertex error of the initial mesh in the slice is first calculated to obtain the error of each edge. Then all the edges in the piece are arranged according to a certain rule according to the error, such as from small to large.
  • Each simplification can merge edges according to certain rules, such as selecting the edge with the smallest error for merging.
  • the merged vertex position is calculated, the errors of all edges related to the merged vertex are updated, and the edge ordering is updated. The faces of the mesh are iteratively simplified in this way until the desired number is reached.
  • the specific process includes:
  • the vertex error can be defined as the sum of the quadrics of the planes of all adjacent faces of the vertex. Each adjacent face defines a plane, which can be expressed by Formula 1: $D = \mathbf{n}^T \mathbf{v} + d$,
  • where D is the distance from any vertex to the plane,
  • n is the unit normal vector of the plane,
  • v is the position vector of the vertex, and d is the offset of the plane.
  • the squared distance expands into the quadric form $Q(\mathbf{v}) = D^2 = \mathbf{v}^T \mathbf{A}\,\mathbf{v} + 2\,\mathbf{b}^T \mathbf{v} + c$, where Q is the vertex error and
  • $\mathbf{A} = \mathbf{n}\mathbf{n}^T$, $\mathbf{b} = d\,\mathbf{n}$ and $c = d^2$ are the coefficients corresponding to the symbols in Formula 1.
  • a major step in the process of merging vertices is determining the location of the merged vertices.
  • the vertex position can be selected to make the error as small as possible. For example, setting the partial derivatives of the quadric error (Formula 3) to zero gives the optimal merged position $\bar{\mathbf{v}} = -\mathbf{A}^{-1}\mathbf{b}$.
  • the connection relationship between the vertices needs to be updated. For example, during the process of merging vertices, it is possible to determine that the merged vertex corresponds to the two vertices before merging. You only need to replace all the indexes of the two vertices before merging that appear in the face with the index of the merged vertex, and then delete the faces with duplicate indexes to achieve the purpose of updating the connection relationship.
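  • A minimal numpy sketch of this quadric error measure (the plane quadric of a face, and the merged-vertex position from the stationary point of the summed quadric); it illustrates the measure described above, not the application's implementation:

```python
import numpy as np

def face_quadric(p0, p1, p2):
    """Quadric (A, b, c) of a triangle's plane: the error of a vertex v is
    v @ A @ v + 2 * b @ v + c, its squared distance to the plane."""
    n = np.cross(p1 - p0, p2 - p0)
    n = n / np.linalg.norm(n)      # unit normal vector of the plane
    d = -n @ p0                    # plane offset, so that n.v + d = 0
    return np.outer(n, n), d * n, d * d

def merge_position(A, b):
    """Position minimizing the summed quadric (partial derivatives = 0);
    the pseudo-inverse handles (near-)singular A."""
    return -np.linalg.pinv(A) @ b

# Vertex error = sum of the quadrics of all adjacent faces; the error of an
# edge is the sum of the errors of its two vertices.
```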
  • the three-dimensional grid may also carry attribute information, and the attribute information may also need to be simplified.
  • attribute information such as texture coordinates, colors, normal vectors, etc.
  • the vertex coordinates can be extended to higher dimensions to calculate the vertex error with attribute information.
  • for example, if the vertex coordinates are (x, y, z)
  • and the texture coordinates are (u, v),
  • the expanded vertex is (x, y, z, u, v).
  • for a triangle T = (p, q, r) in the extended coordinate space,
  • two orthonormal vectors are computed, namely:
  • $e_1 = \frac{q - p}{\lVert q - p \rVert}$ and $e_2 = \frac{(r - p) - ((r - p)\cdot e_1)\,e_1}{\lVert (r - p) - ((r - p)\cdot e_1)\,e_1 \rVert}$, where e_1 and e_2 are two vectors on the plane where T is located and "·" represents the dot product of vectors; this defines a coordinate frame on the high-dimensional plane, with p as the origin.
  • the edge part of an image can attract people's attention more, thus affecting people's evaluation of the quality of the image.
  • the same is true for three-dimensional meshes. People tend to notice the boundaries more easily. Therefore, whether boundaries are maintained is also a factor that affects the quality of mesh simplification.
  • the boundaries of the mesh are generally the boundaries of geometric shapes and textures. When an edge belongs to only one face, the edge is a geometric boundary. When the same vertex has two or more texture coordinates, the vertex is the boundary of the texture coordinates. None of the above boundaries should be merged during mesh simplification. Therefore, during each simplification, you can first determine whether the vertex on the edge is a boundary point. If it is a boundary point, skip it and proceed directly to the next iteration.
  • the grid parameterization method includes:
  • Input the original three-dimensional mesh to be processed (which may or may not contain UV coordinates);
  • the reconstructed texture coordinate information can be obtained using the ISO-charts algorithm.
  • this algorithm uses spectral analysis to realize stretch-driven three-dimensional mesh parameterization, and UV-unwraps, slices and packs the three-dimensional mesh into the two-dimensional texture domain. A stretch threshold is set.
  • the specific implementation process of this algorithm is as follows:
  • IsoMap: isometric feature mapping
  • MDS: multidimensional scaling
  • the matrix $B_N = -\frac{1}{2}\left(I - \frac{1}{N}\mathbf{1}\mathbf{1}^T\right) D_N^2 \left(I - \frac{1}{N}\mathbf{1}\mathbf{1}^T\right)$ is formed from the matrix $D_N^2$ of squared geodesic distances, where I is an N-dimensional identity matrix
  • and 1 is the all-ones vector of length N.
  • the eigenvalues $\lambda_i$ of $B_N$ and the corresponding eigenvectors constitute the spectral decomposition of the surface shape.
  • Eigenvectors corresponding to large eigenvalues represent global low-frequency features on the surface, and eigenvectors corresponding to small eigenvalues represent high-frequency details.
  • although N eigenvalues are needed to fully represent a surface with N vertices, a small fraction of them usually accounts for the majority of the energy. Therefore, only the n ≪ N largest eigenvalues and corresponding eigenvectors are calculated to produce n-dimensional embeddings of all points.
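  • The spectral step amounts to classical MDS on the geodesic distance matrix; a hedged numpy sketch (function and variable names are illustrative):

```python
import numpy as np

def spectral_embedding(D: np.ndarray, n: int) -> np.ndarray:
    """Embed an N x N geodesic distance matrix D into n dimensions by
    keeping the n largest eigenpairs of the double-centered matrix B_N."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N   # centering matrix I - (1/N) 1 1^T
    B = -0.5 * J @ (D ** 2) @ J           # B_N from the squared distances
    lam, vec = np.linalg.eigh(B)          # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][:n]       # n largest eigenvalues
    lam, vec = np.clip(lam[idx], 0, None), vec[:, idx]
    return vec * np.sqrt(lam)             # rows are n-dimensional embeddings
```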
  • GDD: geodesic distance distortion
  • d geo (i,j) is the geodesic distance between point i and point j.
  • since the Isomap algorithm calculates geodesic distances along a manifold, while the input three-dimensional grid may contain non-manifold configurations, corresponding preprocessing is performed to eliminate these non-manifold cases.
  • Distortion can be measured in many ways, including how well angles or areas are preserved, or how much a parametric distance stretches or shrinks on a surface.
  • the focus of this algorithm is distance distortion, in particular the definition of geometric stretch, which provides two measures of the local distance distortion on the surface: the average stretch L2 and the worst-case stretch L∞.
  • ⟨a, b, c⟩ represents the area of triangle abc. Since the mapping from the two-dimensional texture domain (s, t) to the three-dimensional surface is affine, its partial derivatives are constant over (s, t).
  • $\gamma_{\max}$ and $\gamma_{\min}$, the largest and smallest singular values of the affine mapping's Jacobian, represent the maximum and minimum lengths obtained when a unit-length vector is mapped from the two-dimensional texture domain to the three-dimensional surface, that is, the maximum and minimum local "stretch".
  • the two stretch measures on a triangle T are defined as $L^2(T) = \sqrt{(\gamma_{\max}^2 + \gamma_{\min}^2)/2}$ and $L^\infty(T) = \gamma_{\max}$; over the whole mesh, $L^2 = \sqrt{\sum_i L^2(T_i)^2\, A'(T_i) \big/ \sum_i A'(T_i)}$,
  • where $A'(T_i)$ is the surface area of triangle $T_i$ in three-dimensional space.
  • the distance threshold is 10 times the average edge length of the target three-dimensional grid.
  • the three-dimensional grid is divided into m parts by growing charts simultaneously around the representative points. Each triangle is assigned to the chart with the closest representative point to the triangle (the geodesic distance from the triangle to the representative point is calculated as the average of the geodesic distances from the three vertices of the triangle to the representative vertex).
  • Chart boundaries should meet two goals: 1) they should cross areas of high curvature without being too jagged, 2) they should minimize embedding distortion of their bordering charts.
  • This algorithm formulates the optimal boundary problem as a graph cutting problem. For simplicity, the binary case of bisecting the surface is discussed below. When subdividing into more than two charts, each pair of adjacent charts is considered in turn.
  • the angular term in Formula 26 corresponds to the first goal of cutting, without jaggedness, along edges with high dihedral angles; its calculation is shown in Formula 27:
  • Formula 27: $d_{\mathrm{ang}}(f_i, f_j) = 1 - \cos\alpha_{ij}$
  • ⁇ ij is the angle between the normals of triangles f i and f j
  • avg (d ang ) is the average angular distance between adjacent triangles.
  • GDD_A(f_i) and GDD_B(f_i) are the GDDs of triangle f_i induced by embedding it in chart A or chart B respectively, and avg(d_distort) is the average of d_distort(f_i, f_j) over all adjacent triangle pairs.
  • this definition of c_distort(f_i, f_j) favors boundary edges whose adjacent triangles balance the GDD between the embeddings determined by chart A and chart B; in other words, the cut should avoid placing a triangle on the wrong side and creating unnecessary distortion.
  • the weight parameter ⁇ in Formula 26 is a trade-off between the above two objectives.
  • the Iso-charts algorithm uses landmark Isomap, an extended version of Isomap.
  • the landmark Isomap algorithm is also used to calculate the embedding coordinates of the mid-region vertices during boundary optimization to further reduce embedding distortion.
  • Input basic grid (containing attribute information);
  • any mesh refinement scheme can be used to refine the basic mesh.
  • a feasible refinement scheme is the midpoint subdivision scheme, which subdivides each triangle into 4 subtriangles in each subdivision iteration, as shown in Figure 5, introducing a new vertex in the middle of each edge. The subdivision process is applied independently to geometry and texture coordinates, because the connectivity of geometry and texture coordinates often differs. The subdivision scheme calculates the position Pos(v_12) of the newly introduced vertex v_12 at the center of the edge (v_1, v_2) as $\mathrm{Pos}(v_{12}) = \frac{\mathrm{Pos}(v_1) + \mathrm{Pos}(v_2)}{2}$,
  • where Pos(v_1) and Pos(v_2) are the positions of vertices v_1 and v_2.
  • the normal vector is interpolated as $N(v_{12}) = \frac{N(v_1) + N(v_2)}{\lVert N(v_1) + N(v_2) \rVert_2}$, where N(v_12), N(v_1) and N(v_2) are the normal vectors corresponding to the vertices v_12, v_1 and v_2 respectively,
  • and $\lVert x \rVert_2$ is the 2-norm of vector x.
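  • A minimal sketch of one midpoint-subdivision iteration applied to the geometry (illustrative only; texture coordinates would be subdivided independently with their own connectivity, as noted above):

```python
import numpy as np

def midpoint_subdivide(vertices, faces):
    """Split each triangle into 4 subtriangles, inserting one new vertex
    at the midpoint of every edge: Pos(v12) = (Pos(v1) + Pos(v2)) / 2."""
    vertices = [np.asarray(v, dtype=float) for v in vertices]
    midpoint = {}   # edge (i, j) -> index of the inserted midpoint vertex
    new_faces = []

    def mid(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint:
            vertices.append((vertices[i] + vertices[j]) / 2)
            midpoint[key] = len(vertices) - 1
        return midpoint[key]

    for a, b, c in faces:
        ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_faces
```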
  • Input refined mesh and original mesh (including attribute information);
  • Figure 6 shows the basic idea of the preprocessing scheme using 2D curves.
  • the same concept is applied to the input 3D mesh to generate the base mesh and displacement fields.
  • the input 2D curve represented by a 2D polyline
  • the original curve is first downsampled to generate a basic curve/polyline, called the "simplified curve”.
  • a subdivision scheme is then applied to the simplified polyline to produce a "thinning or subdivision curve”.
  • the subdivided polyline is then deformed to obtain a better approximation of the original curve. That is, the displacement vector (shown by the arrow in Figure 6) is calculated for each vertex of the subdivided mesh, so that the shape of the displacement curve is as close as possible to the shape of the original curve.
  • These displacement vectors are the displacement information output by the module.
  • the encoding of the basic mesh can use the mesh encoder Draco from the related technology, which mainly includes five parts: quantization, connection relationship encoding, geometric information encoding, UV coordinate encoding and texture map encoding, which are explained separately below.
  • Input geometric information and UV coordinates of the basic mesh
  • the three-dimensional coordinates of the vertices of the input mesh are quantized to obtain the quantized geometric information.
  • the $f_1$ function in Formulas 32 to 34 ($x_q = f_1(x, QP_x)$, $y_q = f_1(y, QP_y)$, $z_q = f_1(z, QP_z)$) is a quantization function.
  • the input of the quantization function is the coordinates of a certain dimension and the quantization coefficient of that dimension, and the output is the quantized coordinate value.
  • the f 1 function can be calculated in many ways.
  • a more common calculation method is shown in Formulas 35 to 37: x_q = x / QP_x, y_q = y / QP_y, z_q = z / QP_z, that is, the original coordinate of each dimension is divided by the quantization coefficient of that dimension, where "/" is the division operator; the result of the division can be rounded in different ways, such as rounding to nearest, rounding down or rounding up.
  • the f 1 function can be implemented using bit operations, such as Formula 38 to Formula 40:
  • Formula 38: x_q = x >> log2(QP_x)
  • Formula 39: y_q = y >> log2(QP_y)
  • Formula 40: z_q = z >> log2(QP_z)
  • the quantization coefficients QP x , QP y and QP z can be set flexibly.
  • first, the quantization coefficients of different components are not necessarily equal; the correlation of the quantization parameters of different components can be used to establish a relationship between QP_x, QP_y and QP_z, and different quantization coefficients can be set for different components. Secondly, the quantization coefficients of different spatial regions
  • are not necessarily equal, and the quantization parameters can be set adaptively according to the sparsity of the vertex distribution in the local area.
  • the quantization of two-dimensional UV coordinates is similar to the quantization of three-dimensional coordinates, with one dimension fewer to quantize.
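  • A sketch of the two quantization variants of the f_1 function (the rounding mode and the power-of-two restriction of the bit-shift variant are assumptions made for illustration):

```python
def quantize(coord: float, qp: int, use_shift: bool = False) -> int:
    """Uniform quantization of one coordinate component, as in
    Formulas 35-40."""
    if use_shift:
        # bit-shift variant, valid when qp is a power of two (Formulas 38-40)
        return int(coord) >> (qp.bit_length() - 1)  # i.e. >> log2(qp)
    return round(coord / qp)                        # division variant (35-37)

# Per-component quantization coefficients QPx, QPy, QPz may differ:
x_q, y_q, z_q = quantize(103.7, 4), quantize(55.2, 4), quantize(-12.9, 8)
```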
  • Output encoded connection relationship sub-stream and vertex encoding sequence.
  • the EB algorithm obtains a string sequence consisting of five characters, C, L, E, R, and S, by traversing each triangle of the triangular mesh model, and then uses the Huffman encoding method to encode this string sequence.
  • the five operating modes defined in EB are shown in Figure 7.
  • C represents the topological situation in which the vertex v to be encoded is not on the boundary
  • L and R represent that the vertex v to be encoded is on the boundary and that, in addition to the current edge, the current triangle has another edge e on the boundary;
  • L and R respectively indicate that e lies to the left or to the right of the current edge. S represents the case where the vertex v to be encoded is on the boundary but neither of the other two edges is, so the traversal splits the boundary into two parts.
  • E indicates that the three sides of the triangle are on the boundary.
  • the algorithm encodes the grid in the form of a spiral.
  • a directed boundary composed of edges is always maintained, and this boundary divides the grid into traversed parts and untraversed parts.
  • for each triangle, an operator describing the topological relationship between the triangle and the boundary is output, and the triangle is merged into the encoded part.
  • the specific traversal process is as follows: first select any triangle to form the initial boundary, and then select any edge as the current edge.
  • the Edgebreaker algorithm uses five operators C, L, E, R and S to record the topological relationship between the current triangle and the boundary. According to the direction of the arrows in different operators, select the next edge as the current edge and continue to determine the operation mode corresponding to the vertex to be encoded.
  • the string is entropy encoded.
  • using the EB algorithm also requires outputting the traversed vertex order to the geometric information and UV coordinate encoding module.
  • the final entropy coded pattern codeword is CCRRSLCRSERRELCRRRCRRRE.
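  • For illustration, a generic Huffman coder applied to such a CLERS operator string might look as follows (the entropy coder of an actual EB implementation may differ):

```python
from collections import Counter
import heapq

def huffman_codes(symbols: str) -> dict:
    """Build a Huffman code table for the symbols of a string."""
    heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for s in lo[2]: lo[2][s] = "0" + lo[2][s]   # extend codes on merge
        for s in hi[2]: hi[2][s] = "1" + hi[2][s]
        heapq.heappush(heap, [lo[0] + hi[0], lo[1], {**lo[2], **hi[2]}])
    return heap[0][2]

clers = "CCRRSLCRSERRELCRRRCRRRE"          # the example codeword above
codes = huffman_codes(clers)
bitstream = "".join(codes[c] for c in clers)
```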
  • Input quantized geometric information, connection relationships and the vertex order from connection relationship encoding
  • triangle S1 is a triangle whose geometric coordinates have already been encoded.
  • the traversal order of the vertices to be encoded is the same as the vertex order used when encoding the connection relationship.
  • when no connection relationship is encoded, the vertex traversal order is the same as the vertex order in the base mesh.
  • two or three parallelograms can also be used to predict the geometric coordinates to be encoded.
  • the specific encoding method is not emphasized here.
  • UV coordinate encoding (whether the UV coordinates are encoded is controlled by the identification flag)
  • Output encoded UV coordinate sub-stream.
  • UV coordinates can be encoded using parallelogram prediction as follows:
  • triangle S1 is a triangle whose UV coordinates have already been encoded.
  • the traversal order of the vertices to be encoded is the same as the vertex order used when encoding the connection relationship.
  • when no connection relationship is encoded, the vertex traversal order is the same as the vertex order in the base mesh.
  • two or three parallelograms can also be used to predict the UV coordinates to be encoded.
  • the specific encoding method is not emphasized here.
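  • A minimal sketch of parallelogram prediction, applicable to both geometric and UV coordinates (names are illustrative; a real coder also handles the initial triangle, whose coordinates are coded directly, and optional multi-parallelogram averaging):

```python
import numpy as np

def parallelogram_predict(v_a, v_b, v_c):
    """Predict the vertex opposite vertex c across edge (a, b) of an
    already-coded triangle (a, b, c) by completing the parallelogram;
    only the prediction residual is entropy coded."""
    return np.asarray(v_a) + np.asarray(v_b) - np.asarray(v_c)

# Encoder side: residual = actual - prediction
actual = np.array([0.4, 0.7])  # UV coordinates of the vertex to encode
pred = parallelogram_predict([0.1, 0.2], [0.5, 0.6], [0.3, 0.1])
residual = actual - pred       # entropy coded
# Decoder side: actual = prediction + decoded residual
```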
  • After completing the encoding of the basic mesh, the basic mesh code stream needs to be decoded to obtain the distorted geometric information and UV coordinates. The vertex offset vector values in the displacement information are then corrected based on the decoded geometric information and UV coordinates; the identification flag determines whether the decoded UV coordinates or the pre-encoding basic mesh UV coordinates are used (if the flag indicates that UV coordinates are not encoded, the UV coordinates of the basic mesh are used). Then, the updated displacement information is encoded.
  • a feasible displacement information encoding method is linear wavelet transform.
  • the transform is applied in a lifting manner, with a prediction step and an update step:
  • for a vertex v inserted at the midpoint of the edge between v_1 and v_2, the prediction step computes the wavelet coefficient as the deviation from the midpoint prediction, $d(v) = \mathrm{Signal}(v) - \frac{\mathrm{Signal}(v_1) + \mathrm{Signal}(v_2)}{2}$;
  • the update step then adjusts each coarse vertex by a weighted sum of the coefficients d(·) of v*, the set of its adjacent inserted vertices.
  • Signal(v), Signal(v_1) and Signal(v_2) are the geometry or attribute values of vertices v, v_1 and v_2 respectively.
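  • A sketch of one level of such a linear lifting wavelet (the update weight below is an assumption chosen for illustration, not a value taken from the application):

```python
def lifting_forward(signal, midpoint_edges, update_weight=0.125):
    """One forward lifting level. `signal` maps vertex -> value;
    `midpoint_edges` maps each inserted midpoint vertex v to its parent
    edge (v1, v2). Returns the wavelet coefficients and the updated
    coarse signal."""
    detail = {}
    for v, (v1, v2) in midpoint_edges.items():   # prediction step
        detail[v] = signal[v] - 0.5 * (signal[v1] + signal[v2])
    for v, (v1, v2) in midpoint_edges.items():   # update step
        signal[v1] += update_weight * detail[v]
        signal[v2] += update_weight * detail[v]
    return detail, signal
```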
  • the transformed displacement information can be arranged into a 2D image using the following method:
  • Method 1: traverse the coefficients from low frequency to high frequency.
  • the encoder can explicitly signal the arrangement scheme used in the bitstream.
  • any video encoder can be used to encode the image to obtain the displacement information sub-stream.
  • before encoding the texture map, the displacement information sub-stream needs to be decoded and inversely quantized to obtain the distorted displacement information. This operation ensures the consistency of the information used at the encoding and decoding ends.
  • the reconstructed mesh is jointly generated using the reconstructed base mesh and the distorted displacement information. A new texture map is then generated using the reconstructed mesh and the original texture map.
  • for new texture maps, a video encoder can usually be used directly to encode the texture maps frame by frame, for example using High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC) or other encoders, to form the attribute sub-stream.
  • HEVC: High Efficiency Video Coding
  • VVC: Versatile Video Coding
  • the video encoder here can choose any video encoder.
  • finally, the sub-streams are mixed to form the output mesh coding code stream.
  • the encoding end encodes the basic grid corresponding to the target three-dimensional grid according to the first identification information.
  • the basic grid includes the reconstructed texture coordinate information corresponding to the target three-dimensional grid, and obtains the first code stream.
  • obtains the second code stream according to the grid difference information, obtains the third code stream according to the reconstructed texture map information, and generates the target code stream according to the first code stream, the second code stream and the third code stream. Since the amount of reconstructed texture coordinate data accounts for a large proportion of the three-dimensional grid, in the embodiment of the present application one can choose, according to the first identification information, not to encode the reconstructed texture coordinate information in the basic grid, which can greatly save the code rate and improve the coding efficiency.
  • this embodiment of the present application also provides a decoding method, including:
  • Step 901 The decoder decomposes the acquired target code stream to obtain the first code stream, the second code stream and the third code stream.
  • the first code stream is obtained based on the basic grid corresponding to the target three-dimensional grid.
  • the second code stream is obtained based on grid difference information.
  • the grid difference information is used to characterize the difference information between the basic grid and the three-dimensional grid to be encoded.
  • the target three-dimensional grid is obtained based on the three-dimensional grid to be encoded, and
  • the third code stream is obtained based on the reconstructed texture map information.
  • Step 902 When the decoding end determines that the first code stream includes reconstructed texture coordinate information, the target three-dimensional grid is reconstructed according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream;
  • Step 903 When the decoding end determines that the first code stream does not include reconstructed texture coordinate information, reconstructed texture coordinate information is generated, and the target three-dimensional grid is reconstructed using the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  • the encoding end can choose not to encode the reconstructed texture coordinate information in the basic grid based on the first identification information.
  • the decoding end can generate the reconstructed texture coordinate information based on the decoded information; in the lossy mode, this can greatly save the code rate and improve the coding efficiency.
  • the method in the embodiment of this application also includes:
  • the decoding end decomposes the acquired target code stream to obtain first identification information, and the first identification information is used to characterize whether the encoding end encodes the reconstructed texture coordinate information;
  • according to the first identification information, it is determined whether the first code stream includes reconstructed texture coordinate information.
  • the encoding end encodes the first identification information used to indicate whether to encode the reconstructed texture coordinate information, so that the decoding end can determine whether the reconstructed texture coordinate information needs to be generated based on the first identification information.
  • after the decoding end decomposes the acquired target code stream to obtain the first code stream, the second code stream and the third code stream, the method also includes:
  • according to the first decoding result, it is determined whether the first code stream includes reconstructed texture coordinate information.
  • the encoding end may not encode the above-mentioned first identification information.
  • the decoding end may determine whether the reconstructed texture coordinate information is included according to the first decoding result.
  • the first decoding result also includes:
  • the geometric information and connection relationship information corresponding to the target three-dimensional grid.
  • generating the reconstructed texture coordinate information includes:
  • the reconstructed texture coordinate information is generated according to a texture coordinate resampling algorithm.
  • the decoding end uses the same UV coordinate generation method as the encoding end to reconstruct the UV coordinates to obtain reconstructed texture coordinate information.
  • the decoding end decomposes the obtained target code stream to obtain the first code stream, the second code stream and the third code stream, including:
  • the decoding end decomposes the obtained target code stream to obtain the first code stream, the second code stream, the third code stream and the fourth code stream.
  • the fourth code stream is determined based on the fragmentation (patch) information of the target three-dimensional grid; and
  • reconstructing the target three-dimensional grid based on the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream includes: reconstructing the target three-dimensional grid according to the first decoding result, the second decoding result, the third decoding result and the fourth decoding result corresponding to the fourth code stream; or, reconstructing the target three-dimensional grid based on the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream includes: reconstructing the target three-dimensional grid according to the first decoding result, the second decoding result, the third decoding result, the fourth decoding result corresponding to the fourth code stream and the generated reconstructed texture coordinate information.
  • the decoding framework of the three-dimensional grid is shown in Figure 10.
  • the target code stream is decomposed into a patch information sub-stream, a geometric information sub-stream, a connection relationship sub-stream, a UV coordinate sub-stream (if any), a texture map sub-stream and a displacement information sub-stream.
  • if the UV coordinate sub-stream is absent, the same UV coordinate generation algorithm as at the encoding end regenerates the UV coordinates.
  • the three-dimensional grid is reconstructed using each channel of decoded information.
  • the texture map sub-stream and the displacement information sub-stream are decoded using a video decoder.
  • the decoder corresponding to the encoding method of the encoding end is used to decode the geometric information, connection relationship and UV coordinate sub-stream.
  • the decoding of various information is introduced below.
  • Input connection relationship sub-stream to be decoded.
  • Output the connection relationship of the three-dimensional mesh and the decoded vertex order.
  • the connection relationship sub-stream is entropy decoded to obtain the operator pattern string.
  • the connection relationship is reconstructed according to the patterns in the string, in encoding order, and the traversal order of the vertices is output to the geometric information and UV coordinate decoding module.
  • Input geometric information sub-stream, decoded displacement information and decoding order of connection relationship
  • Output geometric information of the 3D mesh.
  • the decoding process of grid geometric coordinates is the inverse process of the encoding process: first entropy decodes the coordinate prediction residual. Then according to the decoded triangle, the predicted coordinates of the point to be decoded are predicted according to the parallelogram rule. Add the predicted coordinates to the residual value decoded by entropy to obtain the geometric coordinate position to be decoded.
  • the vertex traversal order here is the same as the vertex order used when encoding the connection relationship; when no connection relationship is encoded, the vertex traversal order is the same as the vertex order in the base mesh. Note that the geometric coordinates of the initial triangle do not use predictive encoding; their geometric coordinate values are encoded directly.
  • the decoded displacement information needs to be used to correct the decoded geometric information.
  • the correction method is to use the displacement value in the displacement information to displace the corresponding vertex along the normal vector direction. Finally, the corrected geometric information is obtained.
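  • For illustration only, a minimal Python sketch of the two operations just described, parallelogram prediction of a point from a decoded triangle and correction of decoded positions along the normal vector; the array layout is an assumption of the sketch.

```python
import numpy as np

def parallelogram_predict(a, b, c):
    # For current edge (a, b), with c the already-decoded vertex of the
    # prediction triangle, the parallelogram rule predicts a + b - c.
    return a + b - c

def decode_vertex(a, b, c, residual):
    # Decoded position = prediction + entropy-decoded residual.
    return parallelogram_predict(a, b, c) + residual

def apply_displacement(positions, normals, displacements):
    # Displace each vertex along its unit normal by the decoded
    # displacement value, yielding the corrected geometry.
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    return positions + displacements[:, None] * n
```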
  • 3) UV coordinate decoding and reconstruction (whether to decode UV coordinates is determined by the first identification)
  • Input: the UV coordinate code stream to be decoded, the decoded and corrected geometry information and the decoding order of the connectivity;
  • Output: the reconstructed UV coordinate information of the three-dimensional mesh.
  • If the code stream contains a UV coordinate sub-stream, the decoding process of the mesh UV coordinates is the inverse of the encoding process: the coordinate prediction residual is first entropy-decoded; then, from the already-decoded triangle, the predicted coordinates of the point to be decoded are computed according to the parallelogram rule; adding the entropy-decoded residual value to the predicted coordinates yields the UV coordinate position to be decoded. Note that the UV coordinates of the initial triangle are not predictively encoded; their UV coordinate values are encoded directly. After the decoding end decodes the UV coordinates of this triangle, it serves as the initial triangle from which the UV coordinates of the other triangle vertices are traversed and decoded. In addition, two or three parallelograms may be used to predict the UV coordinates to be decoded; the specific prediction method is not emphasized here.
  • If the code stream does not contain a UV coordinate sub-stream, the same UV coordinate generation algorithm as at the encoding end is used to generate the UV coordinates from the decoded geometry information and connectivity.
  • After the UV coordinates are decoded or reconstructed, the decoded displacement information needs to be used to correct them. The correction method is to displace the corresponding vertex along the normal vector direction by the displacement value in the displacement information, finally obtaining the corrected UV coordinates.
  • 4) Texture map decoding
  • Input: the texture map sub-stream; Output: the texture map. The texture map is decoded frame by frame directly with a video decoder. The file format of the texture map is not emphasized here; the format can be jpg, png, etc.
  • In the embodiments of this application, the encoding end can choose, based on the first identification information, not to encode the reconstructed texture coordinate information in the base mesh, in which case the decoding end can generate the reconstructed texture coordinate information from the already-decoded information. In lossy mode, this can greatly save bit rate and improve coding efficiency.
  • The encoding method provided by the embodiments of this application may be executed by an encoding apparatus. In the embodiments of this application, the encoding apparatus executing the encoding method is taken as an example to describe the encoding apparatus provided by the embodiments of this application.
  • As shown in Figure 11, an embodiment of this application also provides an encoding apparatus 1100, applied to an encoding end, including:
  • a first encoding module 1101, used to encode, according to the first identification information, the base mesh corresponding to the target three-dimensional mesh to obtain the first code stream, where the base mesh includes the reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded;
  • a first acquisition module 1102, used to acquire the second code stream according to the mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded;
  • a second acquisition module 1103, configured to acquire the third code stream according to the reconstructed texture map information, which is obtained according to the first code stream and the second code stream;
  • a first generation module 1104, configured to generate the target code stream according to the first code stream, the second code stream and the third code stream.
  • the first generation module includes:
  • the first acquisition sub-module is used to encode the first identification information and obtain the encoded first identification information
  • the first generation sub-module is used to generate a target code stream according to the encoded first identification information, the first code stream and the second code stream.
  • Optionally, the base mesh also includes geometry information and connectivity information corresponding to the target three-dimensional mesh.
  • Optionally, the first encoding module is used for:
  • when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is encoded, encoding the geometry information, connectivity information and reconstructed texture coordinate information to obtain the first code stream;
  • and/or, when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is not encoded, encoding the geometry information and connectivity information to obtain the first code stream.
  • Optionally, the apparatus of the embodiments of this application also includes:
  • a third acquisition module, configured to decode and dequantize the first code stream to obtain the reconstructed base mesh before the second acquisition module acquires the third code stream according to the reconstructed texture map information;
  • a fourth acquisition module, used to decode and dequantize the second code stream to obtain target mesh difference information;
  • a second generation module, configured to generate the reconstructed texture map information according to a texture map generation algorithm based on the reconstructed base mesh and the target mesh difference information.
  • Optionally, the first acquisition module includes:
  • a second acquisition sub-module, used to decode the first code stream and obtain the reconstructed mesh corresponding to the first code stream;
  • an update sub-module, configured to update the mesh difference information according to the reconstructed mesh to obtain updated mesh difference information;
  • a first encoding sub-module, used to encode the updated mesh difference information to obtain the second code stream.
  • Optionally, the apparatus of the embodiments of this application also includes:
  • a fifth acquisition module, used, before the first encoding module encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, to: in the case of the lossy encoding mode, simplify the three-dimensional mesh to be encoded to obtain the target three-dimensional mesh; and in the case of the lossless encoding mode, determine the three-dimensional mesh to be encoded as the target three-dimensional mesh.
  • Optionally, the first generation module includes:
  • a third acquisition sub-module, used to acquire the fourth code stream according to the patch information of the target three-dimensional mesh;
  • a fourth acquisition sub-module, used to obtain the target code stream according to the first code stream, the second code stream, the third code stream and the fourth code stream.
  • In the embodiments of this application, the encoding end encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, obtains the second code stream according to the mesh difference information, and generates the target code stream according to the first code stream and the second code stream. Since reconstructed texture coordinate data accounts for a large proportion of a three-dimensional mesh, in the embodiments of this application the reconstructed texture coordinate information in the base mesh can, according to the first identification information, be left unencoded, which can greatly save bit rate and improve coding efficiency.
  • This apparatus embodiment corresponds to the encoding method embodiment shown in Figure 1 above. Each implementation process and implementation manner of the encoding end in the above method embodiment can be applied to this apparatus embodiment and can achieve the same technical effect.
  • Specifically, an embodiment of this application also provides an encoding device. As shown in Figure 12, the encoding device 1200 includes: a processor 1201, a network interface 1202 and a memory 1203, where the network interface 1202 is, for example, a common public radio interface (CPRI).
  • Specifically, the encoding device 1200 of the embodiment of this application also includes: instructions or programs stored in the memory 1203 and executable on the processor 1201. The processor 1201 calls the instructions or programs in the memory 1203 to execute the methods executed by the modules shown in Figure 11 and achieves the same technical effect; to avoid repetition, details are not repeated here.
  • The decoding method provided by the embodiments of this application may be executed by a decoding apparatus. In the embodiments of this application, the decoding apparatus executing the decoding method is taken as an example to describe the decoding apparatus provided by the embodiments of this application.
  • As shown in Figure 13, an embodiment of this application also provides a decoding apparatus 1300, applied to a decoding end, including:
  • a sixth acquisition module 1301, used to decompose the acquired target code stream to obtain the first code stream, the second code stream and the third code stream, where the first code stream is obtained based on the base mesh corresponding to the target three-dimensional mesh, the second code stream is obtained based on the mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to the reconstructed texture map information;
  • a reconstruction module 1302, configured to reconstruct the target three-dimensional mesh, when the decoding end determines that the first code stream includes reconstructed texture coordinate information, according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream; and/or, used to generate reconstructed texture coordinate information when the decoding end determines that the first code stream does not include reconstructed texture coordinate information, and to reconstruct the target three-dimensional mesh according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  • Optionally, the apparatus of the embodiments of this application also includes:
  • a seventh acquisition module, used to decompose the acquired target code stream to obtain first identification information, where the first identification information is used to characterize whether the encoding end encodes the reconstructed texture coordinate information;
  • a first determination module, configured to determine, according to the first identification information, whether the first code stream includes reconstructed texture coordinate information.
  • Optionally, the apparatus of the embodiments of this application also includes:
  • an eighth acquisition module, used to decode the first code stream to obtain the first decoding result after the sixth acquisition module decomposes the acquired target code stream to obtain the first code stream, the second code stream and the third code stream;
  • a second determination module, configured to determine, according to the first decoding result, whether the first code stream includes reconstructed texture coordinate information.
  • Optionally, the first decoding result also includes: the geometry information and connectivity information corresponding to the target three-dimensional mesh.
  • Optionally, the reconstruction module is configured to generate the reconstructed texture coordinate information according to a texture coordinate resampling algorithm based on the geometry information and connectivity information.
  • Optionally, the sixth acquisition module is used to decompose the acquired target code stream to obtain the first code stream, the second code stream, the third code stream and the fourth code stream, where the fourth code stream is determined based on the patch information of the target three-dimensional mesh; and
  • the reconstruction module is configured to reconstruct the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result and the fourth decoding result corresponding to the fourth code stream; or to reconstruct the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result, the fourth decoding result corresponding to the fourth code stream and the generated reconstructed texture coordinate information.
  • In the embodiments of this application, the encoding end can choose, based on the first identification information, not to encode the reconstructed texture coordinate information in the base mesh, in which case the decoding end can generate the reconstructed texture coordinate information from the already-decoded information. In lossy mode, this can greatly save bit rate and improve coding efficiency.
  • It should be noted that this apparatus embodiment corresponds to the method embodiment shown in Figure 9 above. All implementations of the decoding end in the above method embodiment are applicable to this apparatus embodiment and can achieve the same technical effects, which are not repeated here.
  • An embodiment of this application also provides a decoding device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor. When the program or instruction is executed by the processor, each process of the above decoding method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • An embodiment of this application also provides an encoding device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor. When the program or instruction is executed by the processor, each process of the above encoding method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • An embodiment of this application also provides a readable storage medium on which a program or instruction is stored. When the program or instruction is executed by a processor, each process of the above encoding method or decoding method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • The processor is the processor in the decoding device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, etc.
  • An embodiment of this application also provides an encoding device, including a processor and a communication interface, where the processor is used to encode the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, where the base mesh includes the reconstructed texture coordinate information corresponding to the target three-dimensional mesh and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded; to obtain the second code stream according to the mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded; to obtain the third code stream according to the reconstructed texture map information, which is obtained according to the first code stream and the second code stream; and to generate the target code stream according to the first code stream, the second code stream and the third code stream.
  • This encoding device embodiment corresponds to the above encoding method embodiment. Each implementation process and implementation manner of the above method embodiment can be applied to this encoding device embodiment and can achieve the same technical effect.
  • An embodiment of this application also provides a decoding device, including a processor and a communication interface, where the processor is used to decompose the acquired target code stream to obtain a first code stream, a second code stream and a third code stream, where the first code stream is obtained based on the base mesh corresponding to the target three-dimensional mesh, the second code stream is obtained based on the mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to the reconstructed texture map information;
  • when it is determined that the first code stream includes reconstructed texture coordinate information, the target three-dimensional mesh is reconstructed according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream; when it is determined that the first code stream does not include reconstructed texture coordinate information, reconstructed texture coordinate information is generated, and the target three-dimensional mesh is reconstructed according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  • This decoding device embodiment corresponds to the above decoding method embodiment. Each implementation process and implementation manner of the above method embodiment can be applied to this decoding device embodiment and can achieve the same technical effect.
  • Specifically, an embodiment of this application also provides a decoding device. The structure of the decoding device is shown in Figure 14; the decoding device 1400 includes: a processor 1401, a network interface 1402 and a memory 1403, where the network interface 1402 is, for example, a common public radio interface (CPRI).
  • Specifically, the decoding device 1400 of the embodiment of this application also includes: instructions or programs stored in the memory 1403 and executable on the processor 1401. The processor 1401 calls the instructions or programs in the memory 1403 to execute the methods executed by the modules shown in Figure 13 and achieves the same technical effect; to avoid repetition, details are not repeated here.
  • Optionally, as shown in Figure 15, an embodiment of this application also provides a communication device 1500, including a processor 1501 and a memory 1502, where the memory 1502 stores programs or instructions executable on the processor 1501. For example, when the communication device 1500 is an encoding device, each step of the above encoding method embodiment is implemented when the program or instruction is executed by the processor 1501, and the same technical effect can be achieved. When the communication device 1500 is a decoding device, each step of the above decoding method embodiment is implemented when the program or instruction is executed by the processor 1501, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • An embodiment of this application further provides a chip, where the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement each process of the above encoding method or decoding method embodiment, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • It should be understood that the chips mentioned in the embodiments of this application may also be called system-level chips, system chips, chip systems or system-on-chip chips, etc.
  • An embodiment of this application further provides a computer program/program product, where the computer program/program product is stored in a storage medium and is executed by at least one processor to implement each process of the above encoding method or decoding method embodiment, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • An embodiment of this application also provides a communication system, at least including: an encoding device and a decoding device. The encoding device may be the encoding device shown in Figure 12 and may be used to perform the steps of the encoding method shown in Figure 1; the decoding device may be the decoding device shown in Figure 14 and may be used to perform the steps of the decoding method shown in Figure 9, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
  • Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform; of course, they can also be implemented by hardware, but in many cases the former is the better implementation.
  • Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the related art, can be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk or an optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the various embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application discloses an encoding method, a decoding method, an apparatus and a device, belonging to the field of encoding and decoding technology. The encoding method of the embodiments of this application includes: an encoding end encodes, according to first identification information, a base mesh corresponding to a target three-dimensional mesh to obtain a first code stream, where the base mesh includes reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded; the encoding end obtains a second code stream according to mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded; the encoding end obtains a third code stream according to reconstructed texture map information, where the reconstructed texture map information is obtained according to the first code stream and the second code stream; and the encoding end generates a target code stream according to the first code stream, the second code stream and the third code stream.

Description

Encoding method, decoding method, apparatus and device
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 202210613984.5 filed in China on May 31, 2022, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
This application belongs to the field of encoding and decoding technology, and specifically relates to an encoding method, a decoding method, an apparatus and a device.
BACKGROUND
The three-dimensional mesh (Mesh) can be regarded as the most popular representation of three-dimensional models over the past many years, and it plays an important role in many applications. Its representation is simple, so it is widely integrated with hardware algorithms into the graphics processing units of computers, tablets and smartphones, specifically for rendering three-dimensional meshes.
Texture coordinates, also known as UV coordinates, are information describing the texture of the vertices of a three-dimensional mesh. UV coordinate data accounts for a large proportion of a three-dimensional mesh, and encoding the UV coordinates in related technical solutions consumes a large amount of bit rate, resulting in low efficiency of three-dimensional mesh encoding.
SUMMARY
Embodiments of this application provide an encoding method, a decoding method, an apparatus and a device, which can solve the problem of low three-dimensional mesh encoding efficiency in related technical solutions.
In a first aspect, an encoding method is provided, including:
an encoding end encodes, according to first identification information, a base mesh corresponding to a target three-dimensional mesh to obtain a first code stream, where the base mesh includes reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded;
the encoding end obtains a second code stream according to mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded;
the encoding end obtains a third code stream according to reconstructed texture map information, where the reconstructed texture map information is obtained according to the first code stream and the second code stream;
the encoding end generates a target code stream according to the first code stream, the second code stream and the third code stream.
In a second aspect, a decoding method is provided, including:
a decoding end decomposes an acquired target code stream to obtain a first code stream, a second code stream and a third code stream, where the first code stream is obtained based on a base mesh corresponding to a target three-dimensional mesh, the second code stream is obtained based on mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to reconstructed texture map information;
when the decoding end determines that the first code stream includes reconstructed texture coordinate information, the target three-dimensional mesh is reconstructed according to a first decoding result corresponding to the first code stream, a second decoding result corresponding to the second code stream and a third decoding result corresponding to the third code stream;
when the decoding end determines that the first code stream does not include reconstructed texture coordinate information, reconstructed texture coordinate information is generated, and the target three-dimensional mesh is reconstructed according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
In a third aspect, an encoding apparatus is provided, applied to an encoding end, including:
a first encoding module, used to encode, according to first identification information, a base mesh corresponding to a target three-dimensional mesh to obtain a first code stream, where the base mesh includes reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded;
a first acquisition module, used to obtain a second code stream according to mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded;
a second acquisition module, used to obtain a third code stream according to reconstructed texture map information, where the reconstructed texture map information is obtained according to the first code stream and the second code stream;
a first generation module, used to generate a target code stream according to the first code stream, the second code stream and the third code stream.
In a fourth aspect, a decoding apparatus is provided, applied to a decoding end, including:
a sixth acquisition module, used to decompose an acquired target code stream to obtain a first code stream, a second code stream and a third code stream, where the first code stream is obtained based on a base mesh corresponding to a target three-dimensional mesh, the second code stream is obtained based on mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to reconstructed texture map information;
a reconstruction module, used to reconstruct the target three-dimensional mesh, when the decoding end determines that the first code stream includes reconstructed texture coordinate information, according to a first decoding result corresponding to the first code stream, a second decoding result corresponding to the second code stream and a third decoding result corresponding to the third code stream; and/or, used to generate reconstructed texture coordinate information when the decoding end determines that the first code stream does not include reconstructed texture coordinate information, and reconstruct the target three-dimensional mesh according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
In a fifth aspect, an encoding device is provided, including a processor and a memory, where the memory stores a program or instruction executable on the processor, and the program or instruction, when executed by the processor, implements the steps of the method according to the first aspect.
In a sixth aspect, an encoding device is provided, including a processor and a communication interface, where the processor is used to encode, according to first identification information, a base mesh corresponding to a target three-dimensional mesh to obtain a first code stream, where the base mesh includes reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded; obtain a second code stream according to mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded; obtain a third code stream according to reconstructed texture map information, where the reconstructed texture map information is obtained according to the first code stream and the second code stream; and generate a target code stream according to the first code stream, the second code stream and the third code stream.
In a seventh aspect, a decoding device is provided, including a processor and a memory, where the memory stores a program or instruction executable on the processor, and the program or instruction, when executed by the processor, implements the steps of the method according to the second aspect.
In an eighth aspect, a decoding device is provided, including a processor and a communication interface, where the processor is used to decompose an acquired target code stream to obtain a first code stream, a second code stream and a third code stream, where the first code stream is obtained based on a base mesh corresponding to a target three-dimensional mesh, the second code stream is obtained based on mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to reconstructed texture map information;
when it is determined that the first code stream includes reconstructed texture coordinate information, the target three-dimensional mesh is reconstructed according to a first decoding result corresponding to the first code stream, a second decoding result corresponding to the second code stream and a third decoding result corresponding to the third code stream;
when it is determined that the first code stream does not include reconstructed texture coordinate information, reconstructed texture coordinate information is generated, and the target three-dimensional mesh is reconstructed according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
In a ninth aspect, a codec system is provided, including: an encoding device and a decoding device, where the encoding device can be used to perform the steps of the encoding method according to the first aspect, and the decoding device can be used to perform the steps of the decoding method according to the second aspect.
In a tenth aspect, a readable storage medium is provided, on which a program or instruction is stored, and the program or instruction, when executed by a processor, implements the steps of the method according to the first aspect, or implements the steps of the method according to the second aspect.
In an eleventh aspect, a chip is provided, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is used to run a program or instruction to implement the method according to the first aspect or the method according to the second aspect.
In a twelfth aspect, a computer program/program product is provided, stored in a storage medium, and executed by at least one processor to implement the steps of the method according to the first aspect, or the steps of the method according to the second aspect.
In the embodiments of this application, the encoding end encodes, according to the first identification information, the base mesh corresponding to the target three-dimensional mesh, where the base mesh includes the reconstructed texture coordinate information corresponding to the target three-dimensional mesh, to obtain the first code stream; obtains the second code stream according to the mesh difference information; obtains the third code stream according to the reconstructed texture map information; and generates the target code stream according to the first code stream, the second code stream and the third code stream. Since reconstructed texture coordinate data accounts for a large proportion of a three-dimensional mesh, in the embodiments of this application the reconstructed texture coordinate information in the base mesh can, according to the first identification information, be left unencoded, which can greatly save bit rate and improve coding efficiency.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic flowchart of an encoding method according to an embodiment of this application;
Figure 2 is a framework diagram of three-dimensional mesh encoding in an embodiment of this application;
Figure 3 is a schematic diagram of the preprocessing module in an embodiment of this application;
Figure 4 is a schematic diagram of the vertex merging process during mesh simplification in an embodiment of this application;
Figure 5 is a schematic diagram of the midpoint subdivision method of an embodiment of this application;
Figure 6 is a schematic diagram of the displacement calculation method in an embodiment of this application;
Figure 7 is a schematic diagram of the five operation modes defined in EB;
Figure 8 is a schematic diagram of parallelogram prediction of geometric coordinates;
Figure 9 is a schematic flowchart of a decoding method according to an embodiment of this application;
Figure 10 is a schematic diagram of the three-dimensional mesh decoding framework in an embodiment of this application;
Figure 11 is a module schematic diagram of an encoding apparatus according to an embodiment of this application;
Figure 12 is a structural schematic diagram of an encoding device according to an embodiment of this application;
Figure 13 is a module schematic diagram of a decoding apparatus according to an embodiment of this application;
Figure 14 is a structural schematic diagram of a decoding device according to an embodiment of this application;
Figure 15 is a structural schematic diagram of a communication device according to an embodiment of this application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of this application will be clearly described below with reference to the accompanying drawings in the embodiments of this application. Obviously, the described embodiments are part of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art fall within the scope of protection of this application.
The terms "first", "second", etc. in the specification and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of this application can be implemented in orders other than those illustrated or described here; the objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited, for example, there may be one or more first objects. In addition, "and/or" in the specification and claims indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
It is worth pointing out that the technology described in the embodiments of this application is not limited to Long Term Evolution (LTE)/LTE-Advanced (LTE-A) systems, and can also be used in other wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single-carrier Frequency Division Multiple Access (SC-FDMA) and other systems. The terms "system" and "network" in the embodiments of this application are often used interchangeably, and the described technology can be used both for the systems and radio technologies mentioned above and for other systems and radio technologies. The following description describes a New Radio (NR) system for example purposes and uses NR terminology in most of what follows, but these technologies can also be applied to applications other than NR system applications, such as 6th Generation (6G) communication systems.
The encoding method and decoding method provided by the embodiments of this application are described in detail below through some embodiments and their application scenarios with reference to the accompanying drawings.
As shown in Figure 1, an embodiment of this application provides an encoding method, including:
Step 101: An encoding end encodes, according to first identification information, a base mesh corresponding to a target three-dimensional mesh to obtain a first code stream, where the base mesh includes reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded.
In the embodiments of this application, the first identification information determines whether the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is encoded. For example, when the first identification information is 1, it indicates that the reconstructed texture coordinate information needs to be encoded; when the first identification information is 0, it indicates that the reconstructed texture coordinate information does not need to be encoded.
The above reconstructed texture coordinate information includes the reconstructed texture coordinates, i.e., UV coordinates, corresponding to each vertex; the UV coordinates are used to characterize the texture color value of the corresponding vertex.
It should be noted that the target three-dimensional mesh in this application can be understood as the three-dimensional mesh corresponding to any video frame.
Optionally, the base mesh also includes geometry information and connectivity information corresponding to the target three-dimensional mesh.
Optionally, in the embodiments of this application, any mesh encoding method can be used to encode the geometry information, connectivity information and reconstructed texture coordinate information (if it is determined according to the first identification information that encoding is needed) of the above base mesh, and the base mesh code stream, i.e., the above first code stream, is obtained after merging.
Step 102: The encoding end obtains a second code stream according to mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded.
Optionally, the mesh difference information is used to characterize the difference information between the refined base mesh and the three-dimensional mesh to be encoded.
Optionally, refinement interpolation is performed on the geometry information and UV coordinates of the base mesh, and the displacement vectors between the interpolated points and the nearest-neighbor points of the original mesh (the three-dimensional mesh to be encoded) are then calculated; the above mesh difference information is obtained through these displacement vectors.
Step 103: The encoding end obtains a third code stream according to reconstructed texture map information, where the reconstructed texture map information is obtained according to the first code stream and the second code stream.
Here, the reconstructed texture map information is encoded to obtain the third code stream. Optionally, the reconstructed texture map information is encoded by a video encoder.
Step 104: The encoding end generates a target code stream according to the first code stream, the second code stream and the third code stream.
In this step, after the first code stream, the second code stream and the third code stream are obtained, the first code stream, the second code stream and the third code stream are mixed to generate the target code stream, as shown in the sketch below.
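For illustration only, the following Python sketch shows one possible mixing of the three code streams; the one-byte flag carrying the first identification information and the length-prefixed layout are assumptions of the sketch, not a syntax defined by this application.

```python
import struct

def mux_target_stream(first: bytes, second: bytes, third: bytes,
                      encode_uv_flag: bool) -> bytes:
    """Concatenate the sub-streams into one target code stream.

    A 1-byte flag (the first identification information) is written first,
    then each code stream as <4-byte big-endian length><payload>.
    """
    out = bytearray(struct.pack(">B", 1 if encode_uv_flag else 0))
    for stream in (first, second, third):
        out += struct.pack(">I", len(stream)) + stream
    return bytes(out)
```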
It should be noted that the encoding method of the embodiments of this application is applicable to lossy-mode encoding.
In the embodiments of this application, the encoding end encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information, where the base mesh includes the reconstructed texture coordinate information corresponding to the target three-dimensional mesh, to obtain the first code stream; obtains the second code stream according to the mesh difference information; obtains the third code stream according to the reconstructed texture map information; and generates the target code stream according to the first code stream, the second code stream and the third code stream. Since reconstructed texture coordinate data accounts for a large proportion of a three-dimensional mesh, in the embodiments of this application the reconstructed texture coordinate information in the base mesh can, according to the first identification information, be left unencoded, which can greatly save bit rate and improve coding efficiency.
Optionally, before the encoding end encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, the method further includes:
in the case of the lossy encoding mode, simplifying the three-dimensional mesh to be encoded to obtain the target three-dimensional mesh;
in the case of the lossless encoding mode, determining the three-dimensional mesh to be encoded as the target three-dimensional mesh.
In the embodiments of this application, in the lossy encoding mode the three-dimensional mesh to be encoded is preprocessed. The preprocessing may be simplification; for example, the geometry and connectivity may be simplified, that is, the numbers of mesh vertices and edges are reduced while preserving the mesh structure as much as possible, thereby reducing the data volume of the three-dimensional mesh.
Optionally, the encoding end generating the target code stream according to the first code stream and the second code stream includes:
encoding the first identification information to obtain encoded first identification information;
generating the target code stream according to the encoded first identification information, the first code stream and the second code stream.
In the embodiments of this application, the above first identification information can be carried in the target code stream, so that the decoding end can determine, according to the first identification information, whether reconstructed texture coordinate information needs to be generated.
Optionally, the encoding end encoding the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream includes:
when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is encoded, encoding the geometry information, connectivity information and reconstructed texture coordinate information to obtain the first code stream;
and/or, when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is not encoded, encoding the geometry information and connectivity information to obtain the first code stream.
In the embodiments of this application, the above first identification information can be set by the user according to actual needs, that is, the user chooses whether to encode the above reconstructed texture coordinate information.
Optionally, before the encoding end obtains the third code stream according to the reconstructed texture map information, the method further includes:
decoding and dequantizing the first code stream to obtain a reconstructed base mesh;
decoding and dequantizing the second code stream to obtain target mesh difference information;
generating the reconstructed texture map information according to a texture map generation algorithm based on the reconstructed base mesh and the target mesh difference information. Optionally, in the embodiments of this application, the reconstructed texture coordinate information is generated according to a texture coordinate resampling algorithm based on the geometry information and connectivity information of the base mesh corresponding to the target three-dimensional mesh.
The generation method of the reconstructed texture coordinate information is the same as the generation method in the related art, and is not repeated here.
Optionally, the encoding end obtaining the second code stream according to the mesh difference information includes:
decoding the first code stream to obtain the reconstructed mesh corresponding to the first code stream;
updating the mesh difference information according to the reconstructed mesh to obtain updated mesh difference information;
encoding the updated mesh difference information to obtain the second code stream.
In the embodiments of this application, since the encoding end performs lossy compression on the base mesh, in order to improve the accuracy of the mesh difference information, the mesh difference information needs to be updated based on the reconstructed mesh obtained by decoding the base mesh code stream, so that the mesh difference information can more accurately represent the difference between the base mesh and the original mesh (the mesh to be encoded).
In addition, after the mesh difference information is updated, it is transformed, for example by a wavelet transform. The transformed displacement information is then quantized, and the transformed mesh difference information is arranged into the pixel values of an image according to a certain rule, such as the z-scan order. The image is then video-encoded.
Optionally, the encoding end generating the target code stream according to the first code stream, the second code stream and the third code stream includes:
obtaining a fourth code stream according to the patch information of the target three-dimensional mesh;
obtaining the target code stream according to the first code stream, the second code stream, the third code stream and the fourth code stream.
In the embodiments of this application, the encoding end encodes, respectively, the mesh obtained by preprocessing (called the base mesh), the displacement information representing the difference between the base mesh and the original mesh, and the reconstructed texture map attribute information: 1) In the lossy mode, the three-dimensional mesh is preprocessed; for example, the geometry and connectivity may be simplified, that is, the numbers of mesh vertices and edges are reduced while preserving the mesh structure as much as possible, thereby reducing the data volume of the three-dimensional mesh. 2) For the simplified mesh, the UV coordinate resampling algorithm is used to regenerate the UV coordinates. In this application, the simplified geometry information and connectivity, together with the UV coordinates newly generated based on the simplified mesh, are called the base mesh. 3) Any static mesh encoding method is used to encode the geometry information, connectivity and newly generated UV coordinates of the base mesh, and the base mesh code stream is obtained after merging the code streams. Note that whether the UV coordinates of the base mesh are encoded is determined by the identification. 4) In the preprocessing module, refinement interpolation is performed on the geometry information and UV coordinates of the base mesh, and the displacement vectors between the interpolated points and the nearest-neighbor points of the original mesh are calculated. The displacement information encoding module encodes the refinement interpolation algorithm parameters and the displacement vectors to obtain the displacement information code stream. 5) The encoded base mesh is decoded and reconstructed to obtain the reconstructed base mesh. 6) The encoded displacement information is decoded and dequantized to obtain the decoded and dequantized displacement information. 7) The mesh is reconstructed using the reconstructed base mesh and the decoded, dequantized displacement information. 8) A new texture map is generated from the reconstructed mesh with the texture map generation algorithm, and the newly generated texture map is encoded with a video encoder. 9) The obtained sub-streams are mixed into the output code stream of the encoder.
The three-dimensional mesh encoding framework of the embodiments of this application mainly includes a mesh preprocessing module, a base mesh encoding module, a video-based displacement information encoding module, etc. The three-dimensional mesh encoding framework is shown in Figure 2. First, the input three-dimensional mesh with a texture map (i.e., the three-dimensional mesh to be encoded) is preprocessed. The preprocessing module is shown in Figure 3. In the preprocessing module, whether to partition the three-dimensional mesh into patches can be selected; the partition information forms the patch information. The three-dimensional mesh is then simplified by sampling. The simplified mesh then undergoes surface parameterization, i.e., new UV coordinates (the reconstructed texture coordinate information) are generated; in this process part of the geometry information also changes. After surface parameterization, the base mesh (including geometry information, connectivity and UV coordinates) is obtained and serves as one output. In addition, refinement interpolation is performed on the geometry information and UV coordinates of the base mesh, and the offset vector between each interpolated point and its projection onto the original mesh along the face normal vector is calculated and output as the displacement information. Thus the preprocessing module outputs the base mesh and the displacement information. As shown in Figure 2, the base mesh output by preprocessing is then quantized, and the geometry information, connectivity and UV coordinates are encoded separately. It is worth noting that the encoding of the base mesh here can be replaced by any three-dimensional mesh encoding method. In this module, the UV coordinates can be selectively encoded. If the connectivity is chosen not to be encoded, the decoding end needs to reconstruct the UV coordinates using the same UV coordinate generation method as the encoding end. The code streams of the parts of the base mesh together serve as the output of the base mesh encoding module, i.e., the base mesh sub-stream. The image is video-encoded to obtain the displacement information sub-stream. In addition, the encoded base mesh needs to be decoded and reconstructed to obtain the reconstructed base mesh, and the encoded displacement information is decoded and dequantized to obtain the decoded and dequantized displacement information; the mesh is then reconstructed using the reconstructed base mesh and the decoded, dequantized displacement information. A new texture map is generated from the reconstructed mesh with the texture map generation algorithm, and the newly generated texture map is encoded with a video encoder. Finally, the patch information sub-stream, the base mesh sub-stream and the displacement information sub-stream are mixed to obtain the encoded output code stream.
The specific implementation of the simplification process is described below.
The input original mesh, i.e., the three-dimensional mesh to be encoded, first undergoes mesh simplification. The key points of mesh simplification are the simplification operation and the corresponding error metric. The mesh simplification operation here can be edge-based simplification: as shown in Figure 4, the number of faces and vertices can be reduced by merging the two vertices of an edge. In addition, the mesh can also be simplified by point-based and other mesh simplification methods.
In the mesh simplification process, the simplification error metric needs to be defined. For example, the sum of the equation coefficients of all faces adjacent to a vertex can be taken as the error metric of that vertex, and the error metric of an edge is then the sum of the error metrics of the two vertices on the edge. After the simplification operation and the error metric are determined, the mesh can be simplified. For example, the mesh can be divided into one or more pieces of local mesh; the vertex errors of the initial mesh in a piece are first calculated to obtain the error of each edge, and all edges in the piece are then arranged by error according to some rule, such as from small to large. In each simplification, edges can be merged according to some rule, such as selecting the edge with the smallest error to merge, while the merged vertex position is calculated, the errors of all edges related to the merged vertex are updated, and the arrangement order of the edges is updated. The faces of the mesh are simplified to a certain expected number by iteration.
The specific process includes:
1. Calculation of the vertex error
The vertex error can be defined as the sum of the coefficients of the equations of all planes adjacent to the vertex. For example, each adjacent face defines a plane, which can be expressed by Equation 1:
Equation 1: $D^2=(\mathbf{n}^T\mathbf{v}+d)^2=\mathbf{v}^T(\mathbf{n}\mathbf{n}^T)\mathbf{v}+2d\mathbf{n}^T\mathbf{v}+d^2$
where D is the distance from any vertex to the plane, n is the unit normal vector of the plane, v is the position vector of the vertex, and d is a constant. Expressed in quadric form, this is Equation 2: $Q=(A,\mathbf{b},c)=(\mathbf{n}\mathbf{n}^T,\,d\mathbf{n},\,d^2)$;
where Q is the vertex error and A, b, c are the coefficients representing the corresponding symbols in Equation 1.
From Equation 2, Equation 3 is obtained: $Q(\mathbf{v})=\mathbf{v}^TA\mathbf{v}+2\mathbf{b}^T\mathbf{v}+c$;
Since the vertex error is the sum of the equation coefficients of all planes adjacent to the vertex, we can let $Q_1(\mathbf{v})+Q_2(\mathbf{v})=(Q_1+Q_2)(\mathbf{v})=(A_1+A_2,\,\mathbf{b}_1+\mathbf{b}_2,\,c_1+c_2)(\mathbf{v})$ (Equation 4), from which the error produced by the merge is obtained as the value of the combined quadric at the merged vertex, where Q(v) is the vertex error, v is the corresponding vertex, Q1(v) is the equation of plane 1 adjacent to v, Q2(v) is the equation of plane 2 adjacent to v, and A1, A2, b1, b2, c1, c2 are the respective corresponding coefficients. Of course, if there are multiple adjacent planes, the corresponding plane error equations can be further added to Equation 4.
2. Merging vertices
A main step of the vertex merging process is to determine the position of the merged vertex. According to the error Equation 3, the vertex position that makes the error as small as possible can be selected. For example, by taking the partial derivatives of Equation 3 and setting them to zero, the minimizing position $\hat{\mathbf{v}}=-A^{-1}\mathbf{b}$ is obtained.
From the above, the point that minimizes the error can be found only when the matrix A is invertible. Therefore, the merged vertex position can be chosen here in several ways. If the quality of the mesh simplification is considered, the vertex position that minimizes the error is selected when matrix A is invertible; when matrix A is not invertible, the point with the smallest error among points on the edge, including the two endpoints, can be selected. If the complexity of mesh simplification is considered, the midpoint of the edge or one of the two endpoints can be directly selected as the merged vertex position. If the efficiency of quantization after mesh simplification is considered, the merged vertex position also needs to be adjusted: since high-precision information needs to be encoded separately after quantization, adjusting part of the merged vertex positions to multiples of the corresponding quantization parameter ensures that the original positions can be recovered without additional information during dequantization, which reduces the amount of data consumed by high-precision geometry information.
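For illustration only, the following Python sketch implements the quadric error metric of Equations 1 to 4 and the merged-position choices just described (the optimal position when A is invertible, the edge midpoint otherwise); it is a simplified sketch rather than the complete simplification pipeline.

```python
import numpy as np

def plane_quadric(n, d):
    # Quadric (A, b, c) of plane n.v + d = 0, per Equations 1-2.
    n = n / np.linalg.norm(n)
    return np.outer(n, n), d * n, d * d

def quadric_error(Q, v):
    A, b, c = Q
    return float(v @ A @ v + 2.0 * b @ v + c)   # Equation 3

def add_quadrics(Q1, Q2):
    # Quadrics add componentwise, per Equation 4.
    return tuple(x + y for x, y in zip(Q1, Q2))

def merged_position(Q, v1, v2):
    A, b, _ = Q
    # Optimal position -A^{-1} b if A is invertible,
    # otherwise fall back to the edge midpoint.
    try:
        return np.linalg.solve(A, -b)
    except np.linalg.LinAlgError:
        return 0.5 * (v1 + v2)
```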
After determining how to select the merged vertex position, the vertex merging process can begin. For example, the errors of all edges in the initial mesh can first be calculated and arranged by error according to some rule, such as from small to large. In each iteration, an edge whose error satisfies some rule, such as the edge with the smallest error, is selected. The two endpoints of the edge are removed from the mesh vertices and the merged vertex is added to the set of mesh vertices. All or part of the vertices adjacent to the two pre-merge vertices are taken as the vertices adjacent to the merged vertex, and the error metrics of all points connected to the merged vertex are then updated, giving the errors of the newly produced edges. The arrangement order of the edges is then updated from the global view of the piece. The above process is repeated until the number of faces required by lossy encoding is reached.
3. Updating the connectivity
After the vertices are merged, since some vertices are deleted from the vertex set and many new vertices are added, the connectivity between vertices needs to be updated. For example, during the merging process the two pre-merge vertices corresponding to each merged vertex can be determined. It is only necessary to replace the indices of the two pre-merge vertices appearing in the faces with the index of the merged vertex and then delete the faces with repeated indices to update the connectivity.
The above is the main process of mesh simplification. Meanwhile, a three-dimensional mesh may also carry attribute information, and the attribute information may also need to be simplified. For meshes with attribute information, such as texture coordinates, colors and normal vectors, the vertex coordinates can be extended to a higher dimension to calculate vertex errors with attribute information. Taking texture coordinates as an example, let the vertex coordinates be (x, y, z) and the texture coordinates be (u, v); the extended vertex is then (x, y, z, u, v). Let the extended triangle be T = (p, q, r). To determine the error metric in the high-dimensional space, two orthonormal vectors are first computed, namely:
Equation 6: $\mathbf{e}_1=\dfrac{\mathbf{q}-\mathbf{p}}{\|\mathbf{q}-\mathbf{p}\|}$; Equation 7: $\mathbf{e}_2=\dfrac{\mathbf{r}-\mathbf{p}-((\mathbf{r}-\mathbf{p})\cdot\mathbf{e}_1)\mathbf{e}_1}{\|\mathbf{r}-\mathbf{p}-((\mathbf{r}-\mathbf{p})\cdot\mathbf{e}_1)\mathbf{e}_1\|}$
where e1 and e2 are two vectors in the plane of T, and "·" denotes the dot product of vectors. This defines a coordinate frame on the high-dimensional plane, with p as the origin. Consider an arbitrary point v and let u = p − v. From Equation 8: $\|\mathbf{u}\|^2=(\mathbf{u}\cdot\mathbf{e}_1)^2+\cdots+(\mathbf{u}\cdot\mathbf{e}_n)^2$
we obtain Equation 9: $(\mathbf{u}\cdot\mathbf{e}_3)^2+\cdots+(\mathbf{u}\cdot\mathbf{e}_n)^2=\|\mathbf{u}\|^2-(\mathbf{u}\cdot\mathbf{e}_1)^2-(\mathbf{u}\cdot\mathbf{e}_2)^2$
Since e1 and e2 are two vectors in the plane of T, the left side of Equation 9 is the square of the distance from the vertex to the plane of T, that is, Equation 10: $D^2=\|\mathbf{u}\|^2-(\mathbf{u}\cdot\mathbf{e}_1)^2-(\mathbf{u}\cdot\mathbf{e}_2)^2$
Expanding and combining yields an equation similar to Equation 3, where:
Equation 11: $A=I-\mathbf{e}_1\mathbf{e}_1^T-\mathbf{e}_2\mathbf{e}_2^T$;
Equation 12: $\mathbf{b}=(\mathbf{p}\cdot\mathbf{e}_1)\mathbf{e}_1+(\mathbf{p}\cdot\mathbf{e}_2)\mathbf{e}_2-\mathbf{p}$;
Equation 13: $c=\mathbf{p}\cdot\mathbf{p}-(\mathbf{p}\cdot\mathbf{e}_1)^2-(\mathbf{p}\cdot\mathbf{e}_2)^2$
After the above error metric is obtained, the subsequent steps are the same as for the three-dimensional information, thereby realizing the simplification of a mesh with attribute information.
Generally speaking, the edge parts of an image attract people's attention more and thus affect people's evaluation of the quality of the image. The same is true of three-dimensional meshes: people tend to notice the boundary parts more easily. Therefore, whether boundaries are preserved is also a factor affecting quality in mesh simplification. The boundaries of a mesh are generally geometric boundaries and texture boundaries. When an edge belongs to only one face, the edge is a geometric boundary; when the same vertex has two or more texture coordinates, the vertex is a boundary of the texture coordinates. During mesh simplification, none of the above boundaries should be merged. Therefore, at each simplification it can first be determined whether a vertex on the edge is a boundary point; if it is a boundary point, the edge is skipped and the next iteration proceeds directly.
An optional mesh parameterization method is described in detail below.
The mesh parameterization method includes:
(1) Regeneration of the UV coordinates
Input: the original three-dimensional mesh to be processed (with or without UV coordinates);
Output: the regenerated UV coordinates.
In this method, the Iso-charts algorithm can be used to obtain the reconstructed texture coordinate information. The algorithm uses spectral analysis to realize stretch-driven parameterization of the three-dimensional mesh, UV-unwrapping the three-dimensional mesh, partitioning it into charts and packing them into the two-dimensional texture domain, with a stretch threshold set. The specific implementation process of the algorithm is as follows:
a) compute the surface spectral analysis to provide an initial parameterization;
b) perform iterations of stretch optimization;
c) if the stretch of this derived parameterization is less than the threshold, stop;
d) perform surface spectral clustering to divide the surface into charts;
e) optimize the chart boundaries using the graph cut algorithm;
f) iteratively split the charts until the stretch criterion is satisfied.
The four main parts above, namely surface spectral analysis, stretch optimization, surface spectral clustering and boundary optimization, are introduced separately below.
1. Surface spectral analysis
Surface spectral analysis parameterizes the target three-dimensional mesh based on the isometric feature mapping (IsoMap) dimensionality-reduction method. Given a set of high-dimensional points, IsoMap computes the geodesic distance along the manifold as a sequence of hops between neighboring points. The multidimensional scaling (MDS) algorithm is then applied to these geodesic distances to find a set of points embedded in a low-dimensional space with similar pairwise distances. Given a surface with N points, the computation proceeds as follows:
a) Compute the symmetric matrix $D_N$ of squared geodesic distances between the surface points.
b) Double-center and normalize $D_N$ to obtain $B_N$, computed as Equation 14: $B_N=-\tfrac{1}{2}\left(I-\tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right)D_N\left(I-\tfrac{1}{N}\mathbf{1}\mathbf{1}^T\right)$, where I is the N-dimensional identity matrix and 1 is the vector of ones of length N.
c) Compute the eigenvalues $\lambda_i$ of $B_N$ and the corresponding eigenvectors $\vec{\phi}^{\,i}$.
d) For each point i of the original surface, its embedding in the new space is the N-dimensional vector $\vec{v}_i$, whose j-th element is computed as Equation 15: $\vec{v}_i^{\,j}=\sqrt{\lambda_j}\,\vec{\phi}_i^{\,j}$.
The eigenvalues $\lambda_i$ of $B_N$ and the corresponding eigenvectors constitute the spectral decomposition of the surface shape. Eigenvectors corresponding to large eigenvalues represent global low-frequency features on the surface, and eigenvectors corresponding to small eigenvalues represent high-frequency details. The high-energy, low-frequency components serve as the basis for chartification and parameterization.
Although N eigenvalues are needed to fully represent a surface with N vertices, a small fraction of them usually accounts for most of the energy. Therefore, only the n << N largest eigenvalues and the corresponding eigenvectors are computed to produce an n-dimensional embedding of all points.
In addition, since the mapping from the high-dimensional space to the low-dimensional space is not isometric, this parameterization causes distortion. For each vertex i, its geodesic distance distortion (GDD) under the embedding is defined as $GDD(i)=\sqrt{\tfrac{1}{N}\sum_{j}\left(\|\vec{v}_i-\vec{v}_j\|-d_{geo}(i,j)\right)^2}$, where $\vec{v}_i$ is the n-dimensional embedding coordinate of vertex i and $d_{geo}(i,j)$ is the geodesic distance between points i and j.
When n = 2, surface spectral analysis produces a surface parameterization that minimizes the sum of the squared GDDs of all vertices.
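For illustration only, the following numpy sketch carries out steps a) to d) above: classical MDS double-centering per Equation 14, followed by reading the embedding off the largest eigenpairs per Equation 15. Dense matrices are an assumption of the sketch.

```python
import numpy as np

def spectral_embedding(D_sq, n_dims=2):
    """Classical MDS embedding from squared geodesic distances (Eqs. 14-15).

    D_sq: (N, N) symmetric matrix of squared geodesic distances.
    Returns the (N, n_dims) embedding coordinates.
    """
    N = D_sq.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N       # centering matrix
    B = -0.5 * J @ D_sq @ J                   # Equation 14
    eigvals, eigvecs = np.linalg.eigh(B)      # ascending order
    idx = np.argsort(eigvals)[::-1][:n_dims]  # largest eigenvalues first
    lam = np.clip(eigvals[idx], 0.0, None)    # guard tiny negatives
    return eigvecs[:, idx] * np.sqrt(lam)     # Equation 15
```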
Note that although the Isomap algorithm computes geodesic distances along the manifold, for cases where the input three-dimensional mesh contains non-manifold parts, this scheme performs corresponding preprocessing to eliminate these non-manifold parts.
2. Stretch optimization
Since the mapping from three-dimensional space to two-dimensional space is not isometric, parameterization causes distortion; to reduce the distortion, stretch optimization is performed. Distortion can be measured in many ways, including the preservation of angles or areas, or how much parametric distance stretches or shrinks on the surface. This algorithm focuses on distance distortion, in particular a definition of geometric stretch that defines two measures of the local distance stretch on the surface: the average stretch $L^2$ and the worst-case stretch $L^\infty$.
Assume a triangle T with two-dimensional texture coordinates $p_1, p_2, p_3$, where $p_i=(s_i,t_i)$, and corresponding three-dimensional coordinates $q_1, q_2, q_3$. The affine mapping $S(p)=S(s,t)=q$ is computed as Equation 17:
$S(p)=\left(\langle p,p_2,p_3\rangle q_1+\langle p,p_3,p_1\rangle q_2+\langle p,p_1,p_2\rangle q_3\right)/\langle p_1,p_2,p_3\rangle$
where $\langle a,b,c\rangle$ denotes the area of triangle abc. Since the mapping is affine, its partial derivatives are constant over (s, t) and are computed as Equations 18 and 19:
$S_s=\partial S/\partial s=\left(q_1(t_2-t_3)+q_2(t_3-t_1)+q_3(t_1-t_2)\right)/(2A)$
$S_t=\partial S/\partial t=\left(q_1(s_3-s_2)+q_2(s_1-s_3)+q_3(s_2-s_1)\right)/(2A)$
where $A=\langle p_1,p_2,p_3\rangle=\left((s_2-s_1)(t_3-t_1)-(s_3-s_1)(t_2-t_1)\right)/2$.
The larger and smaller singular values of the Jacobian matrix $[S_s,S_t]$ are then computed as Equations 20 and 21:
$\gamma_{max}=\sqrt{\tfrac{1}{2}\left((a+c)+\sqrt{(a-c)^2+4b^2}\right)}$; $\gamma_{min}=\sqrt{\tfrac{1}{2}\left((a+c)-\sqrt{(a-c)^2+4b^2}\right)}$
where $a=S_s\cdot S_s$, $b=S_s\cdot S_t$, $c=S_t\cdot S_t$. The singular values $\gamma_{max}$ and $\gamma_{min}$ represent the maximum and minimum lengths obtained when unit-length vectors are mapped from the two-dimensional texture domain to the three-dimensional surface, i.e., the largest and smallest local "stretch". The two stretch measures on triangle T are defined as Equations 22 and 23:
$L^2(T)=\sqrt{(\gamma_{max}^2+\gamma_{min}^2)/2}$; $L^\infty(T)=\gamma_{max}$
Over the entire three-dimensional mesh $M=\{T_i\}$, the stretch measures are defined as Equations 24 and 25:
$L^2(M)=\sqrt{\dfrac{\sum_{T_i}L^2(T_i)^2\,A'(T_i)}{\sum_{T_i}A'(T_i)}}$; $L^\infty(M)=\max_{T_i}L^\infty(T_i)$
where $A'(T_i)$ is the surface area of triangle $T_i$ in three-dimensional space.
Since $L^\infty$ depends only on a single worst-case point in the domain, the $L^\infty$ stretch is difficult for any method to control, but a few iterations of $L^2$ stretch minimization can significantly improve the result.
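For illustration only, the following Python sketch evaluates the per-triangle stretch measures from Equations 17 to 23; degenerate triangles with zero parametric area are not handled.

```python
import numpy as np

def triangle_stretch(p, q):
    """L2 and Linf stretch of one triangle (Equations 17-23).

    p: (3, 2) texture coordinates (s, t); q: (3, 3) 3D positions.
    """
    (s1, t1), (s2, t2), (s3, t3) = p
    A = ((s2 - s1) * (t3 - t1) - (s3 - s1) * (t2 - t1)) / 2.0
    Ss = (q[0] * (t2 - t3) + q[1] * (t3 - t1) + q[2] * (t1 - t2)) / (2 * A)
    St = (q[0] * (s3 - s2) + q[1] * (s1 - s3) + q[2] * (s2 - s1)) / (2 * A)
    a, b, c = Ss @ Ss, Ss @ St, St @ St
    root = np.sqrt((a - c) ** 2 + 4 * b ** 2)
    g_max = np.sqrt(((a + c) + root) / 2)                # Equation 20
    g_min = np.sqrt(max(((a + c) - root) / 2, 0.0))      # Equation 21
    return np.sqrt((g_max**2 + g_min**2) / 2), g_max     # L2(T), Linf(T)
```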
3. Surface spectral clustering
If the parameterization produced by spectral analysis fails to meet the stretch threshold, it is divided into smaller charts. Since the global features of the model correspond to larger eigenvalues, they are used for the division. Several representative vertices are computed using the results of the spectral analysis, and charts are then grown around these representative points simultaneously; this method is called surface spectral clustering. The specific algorithm process is as follows:
a) Sort the eigenvalues from the spectral analysis and the corresponding eigenvectors from large to small, i.e., λ1 ≥ λ2 ≥ ... ≥ λN.
b) Obtain the first n eigenvalues and eigenvectors that maximize $\lambda_n/\lambda_{n+1}$ (n ≤ 10).
c) For each vertex i in the target three-dimensional mesh, compute its n-dimensional embedding coordinates.
d) For each of the n embedding coordinates, find the two points with the largest and smallest coordinates and set them as the 2n representative points.
e) Remove representative points whose distance is less than a distance threshold, producing m ≤ 2n representative points; optionally, the distance threshold is 10 times the average edge length of the target three-dimensional mesh.
f) Using the geodesic distances computed in the surface spectral analysis, divide the three-dimensional mesh into m parts by simultaneously growing charts around the representative points. Each triangle is assigned to the chart whose representative point is closest to the triangle (the geodesic distance from a triangle to a representative point is computed as the average of the geodesic distances from the triangle's three vertices to the representative vertex).
4. Boundary optimization
After multiple charts are obtained, the graph cut algorithm is used to optimize the boundaries between the charts. Chart boundaries should satisfy two goals: 1) they should pass through high-curvature regions without being too jagged, and 2) they should minimize the embedding distortion of the charts they bound. This algorithm formulates the optimal boundary problem as a graph cutting problem. For simplicity, the binary case of splitting a surface in two is discussed below. When subdividing into more than two charts, each pair of adjacent charts is considered in turn.
Assume that an optimal boundary is sought between chart A and chart B, with the initial partition generated using surface spectral clustering. An intermediate region C is then generated by expanding a region to both sides of the initial partition boundary; the size of the intermediate region is proportional to the total area of the unstripped patch. An undirected flow network graph is now built from C using an extension of the graph cut method, in which the definition of the "capacity" between two adjacent triangles $f_i$ and $f_j$ is modified as shown in Equation 26:
Equation 26: $c(f_i,f_j)=\alpha\,c_{ang}(f_i,f_j)+(1-\alpha)\,c_{distort}(f_i,f_j)$
The first term in Equation 26 corresponds to the first goal of making a non-jagged cut along edges with a high dihedral angle; its computation (Equation 27) is based on the angular distance $d_{ang}(f_i,f_j)=1-\cos\alpha_{ij}$, where $\alpha_{ij}$ is the angle between the normals of triangles $f_i$ and $f_j$ and $avg(d_{ang})$ is the average angular distance between adjacent triangles, and is defined so that edges with larger dihedral angles receive a lower capacity.
The second term in Equation 26 measures the embedding distortion and is computed from Equations 28 and 29:
Equation 29: $d_{distort}(f_i,f_j)=|GDD_A(f_i)-GDD_B(f_i)|+|GDD_A(f_j)-GDD_B(f_j)|$
where $GDD_A(f_i)$ and $GDD_B(f_i)$ are the GDDs of triangle $f_i$ embedded under chart A or chart B respectively, and $avg(d_{distort})$ in Equation 28 is the average of $d_{distort}(f_i,f_j)$ over all pairs of adjacent triangles, which normalizes $d_{distort}$ to give $c_{distort}$. This definition of $c_{distort}(f_i,f_j)$ favors boundary edges whose adjacent triangles balance the GDD between the embeddings determined by chart A and chart B. In other words, the cut should avoid placing triangles on the wrong side, which would produce unnecessary distortion.
The weight parameter α in Equation 26 is a trade-off between the above two goals.
A naive implementation of this stretch-driven chartification and parameterization algorithm is expensive, especially as the number of model vertices grows. Therefore, to speed up the computation, the Iso-charts algorithm adopts landmark Isomap, an extension of Isomap, in practical applications. Meanwhile, the landmark Isomap algorithm is also used to compute the embedding coordinates of the intermediate-region vertices during boundary optimization, to further reduce the embedding distortion.
Finally, the charts generated by the above process are packed into the two-dimensional texture domain using the chart packing algorithm used in the MCGIM algorithm. A three-dimensional mesh with regenerated UV coordinates is finally obtained.
(2) Mesh refinement
Input: the base mesh (containing attribute information);
Output: the refined mesh.
In the embodiments of this application, any mesh refinement scheme can be used to refine the base mesh. One feasible refinement scheme is the midpoint subdivision scheme, which subdivides each triangle into 4 sub-triangles in each subdivision iteration, as shown in Figure 5. A new vertex is introduced in the middle of each edge. The subdivision process is applied independently to the geometry and the texture coordinates, because the connectivity of the geometry and of the texture coordinates usually differs. The subdivision scheme computes the position Pos(v12) of the newly introduced vertex v12 at the center of edge (v1, v2) as Equation 30:
$Pos(v_{12})=\tfrac{1}{2}\left(Pos(v_1)+Pos(v_2)\right)$
In Equation 30, Pos(v1) and Pos(v2) are the positions of vertices v1 and v2. The same process is used to compute the texture coordinates of the newly created vertices. For normal vectors, an additional normalization step is performed, as Equation 31:
$N(v_{12})=\dfrac{N(v_1)+N(v_2)}{\left\|N(v_1)+N(v_2)\right\|}$
where N(v12), N(v1) and N(v2) are the normal vectors corresponding to vertices v12, v1 and v2 respectively, and ||x|| is the 2-norm of vector x.
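For illustration only, the following Python sketch performs one midpoint-subdivision iteration per Equation 30; the simple list-based mesh representation is an assumption of the sketch.

```python
import numpy as np

def midpoint_subdivide(vertices, faces):
    """One midpoint-subdivision iteration (Equation 30).

    Splits every triangle into 4 by inserting one vertex per edge.
    """
    verts = list(map(np.asarray, vertices))
    midpoint_of = {}          # edge (i, j) -> new vertex index
    new_faces = []

    def mid(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_of:
            verts.append(0.5 * (verts[i] + verts[j]))   # Equation 30
            midpoint_of[key] = len(verts) - 1
        return midpoint_of[key]

    for i, j, k in faces:
        a, b, c = mid(i, j), mid(j, k), mid(k, i)
        new_faces += [(i, a, c), (a, j, b), (c, b, k), (a, b, c)]
    return verts, new_faces
```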
(3) Displacement information calculation
Input: the refined mesh and the original mesh (containing attribute information);
Output: the displacement information.
Figure 6 illustrates the basic idea of the preprocessing scheme using a 2D curve; the same concept is applied to the input 3D mesh to generate the base mesh and the displacement field. In Figure 6, the input 2D curve (represented by a 2D polyline), called the "original curve", is first downsampled to generate a base curve/polyline, called the "simplified curve". A subdivision scheme is then applied to the simplified polyline to generate a "refined curve or subdivided curve". The subdivided polyline is subsequently deformed to obtain a better approximation of the original curve; that is, a displacement vector (shown by the arrows in Figure 6) is computed for each vertex of the subdivided mesh, so that the shape of the displaced curve is as close as possible to the shape of the original curve. These displacement vectors are the displacement information output by this module.
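For illustration only, the following sketch computes a simplified displacement field using the nearest-neighbor rule mentioned earlier for the preprocessing module, approximating the original surface by its vertex set with scipy's cKDTree; the face-normal projection variant described above would replace the nearest-neighbor query.

```python
import numpy as np
from scipy.spatial import cKDTree

def displacement_field(refined_pts, original_pts):
    """Simplified displacement computation: for each refined vertex,
    the vector to its nearest neighbor on the original mesh (here
    approximated by the original vertex set)."""
    tree = cKDTree(original_pts)
    _, idx = tree.query(refined_pts)
    return original_pts[idx] - refined_pts   # one vector per refined vertex
```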
In the embodiments of this application, the base mesh can be encoded using the mesh encoder Draco from the related art, which mainly involves five parts: quantization, connectivity encoding, geometry information encoding, UV coordinate encoding and texture map encoding, described separately below.
(1) Quantization
Input: the geometry information and UV coordinates of the base mesh;
Output: the quantized geometry information and UV coordinates.
First, the three-dimensional vertex coordinates of the input mesh are quantized to obtain the quantized geometry information.
Let the three-dimensional coordinates of a vertex be (x, y, z) and the quantization coefficients be (QPx, QPy, QPz); the quantized geometry information (xq, yq, zq) is computed as:
Equation 32: $x_q=f_1(x,QP_x)$
Equation 33: $y_q=f_1(y,QP_y)$
Equation 34: $z_q=f_1(z,QP_z)$
where the function f1 in Equations 32 to 34 is the quantization function; its inputs are the coordinate of one dimension and the quantization coefficient of that dimension, and its output is the quantized coordinate value.
The function f1 can be computed in several ways. A common one, shown in Equations 35 to 37, divides the original coordinate of each dimension by the quantization coefficient of that dimension, where "/" is the division operator and the result of the division can be rounded in different ways, such as rounding to nearest, rounding down or rounding up:
Equation 35: $x_q=x/QP_x$
Equation 36: $y_q=y/QP_y$
Equation 37: $z_q=z/QP_z$
When the quantization coefficient is an integer power of 2, f1 can be implemented with bit operations, as in Equations 38 to 40:
Equation 38: $x_q=x\gg\log_2 QP_x$
Equation 39: $y_q=y\gg\log_2 QP_y$
Equation 40: $z_q=z\gg\log_2 QP_z$
It is worth noting that whichever computation f1 adopts, the quantization coefficients QPx, QPy and QPz can be set flexibly. First, the quantization coefficients of different components are not necessarily equal; the correlation of the quantization parameters of different components can be exploited to establish a relation between QPx, QPy and QPz and set different quantization coefficients for different components. Second, the quantization coefficients of different spatial regions are not necessarily equal either; the quantization parameters can be set adaptively according to the sparsity of the vertex distribution in local regions.
The quantization of the two-dimensional UV coordinates is similar to that of the three-dimensional coordinates, with the quantization of one dimension removed.
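For illustration only, the following Python sketch implements the quantization function f1 in both the division form (Equations 35 to 37, here with rounding to nearest as one of the stated choices) and the bit-shift form (Equations 38 to 40), together with the corresponding dequantization.

```python
def quantize(coord: float, qp: float) -> int:
    # Equations 35-37: divide by the per-axis quantization coefficient;
    # rounding to nearest is one of the rounding choices stated above.
    return round(coord / qp)

def quantize_pow2(coord: int, log2_qp: int) -> int:
    # Equations 38-40: when QP is a power of two, a right shift suffices.
    return coord >> log2_qp

def dequantize(q: int, qp: float) -> float:
    # Inverse operation used at the decoding end.
    return q * qp
```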
(2) Connectivity encoding
Input: the connectivity of the base mesh;
Output: the encoded connectivity sub-stream and the vertex encoding order.
One usable connectivity encoding method is the Edgebreaker (EB) compression algorithm. After traversing each triangle of the triangle mesh model, the EB algorithm obtains a string sequence composed of the 5 characters C, L, E, R and S, and then encodes this string sequence with the Huffman coding method. The five operation modes defined in EB are shown in Figure 7. Here, C represents the topological case in which the vertex v to be encoded is not on the boundary; L and R represent the cases in which the vertex v to be encoded is on the boundary and the current triangle has, besides the current edge, another edge e on the boundary, with L and R representing e lying on different sides of the current edge; S splits the figure into two parts and requires extra offsets or other operations to record branch information; E represents a triangle whose 3 edges are all on the boundary.
The algorithm encodes the mesh in a spiral fashion. During the mesh traversal it always maintains a directed boundary composed of edges, which divides the mesh into a traversed part and an untraversed part. Each time a triangle is traversed, an operator describing the topological relation between that triangle and the boundary is output, and the polygon is placed into the encoded part. The specific traversal process is as follows: first select an arbitrary triangle to form the initial boundary, then select an arbitrary edge as the current edge. The Edgebreaker algorithm uses the 5 operators C, L, E, R and S to record the topological relation between the current triangle and the boundary. According to the direction of the arrow in the different operators, the next edge is selected as the current edge, and the operation mode corresponding to the vertex to be encoded continues to be determined. This procedure is repeated until all vertices have been traversed. At this point, the operator string of the traversal process is obtained and entropy-encoded. In addition, the EB algorithm also needs to output the traversed vertex order to the geometry information and UV coordinate encoding modules. According to the encoding rules of EB, the mode codeword finally entropy-encoded is CCRRSLCRSERRELCRRRCRRRE.
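For illustration only, the following Python sketch builds a Huffman code for an operator string and encodes the example mode codeword given above; tie-breaking and the binary container format are assumptions of the sketch.

```python
import heapq
from collections import Counter

def huffman_code(symbols: str) -> dict:
    """Build a Huffman code table for an operator string such as the
    EB mode string 'CCRRSLCRSERRELCRRRCRRRE'."""
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    uid = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, uid, merged))
        uid += 1
    return heap[0][2]

table = huffman_code("CCRRSLCRSERRELCRRRCRRRE")
bits = "".join(table[s] for s in "CCRRSLCRSERRELCRRRCRRRE")
```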
(3) Geometry information encoding
Input: the quantized geometry information, the connectivity and the connectivity-encoding vertex order;
Output: the encoded geometry information sub-stream.
In the embodiments of this application, the geometric coordinates can be encoded using the following parallelogram prediction method:
As shown in Figure 8, triangle S1 is a triangle whose geometric coordinates have already been encoded. When connectivity is encoded, the traversal order of the vertices to be encoded is the same as the vertex order of the connectivity encoding; when connectivity is not encoded, the vertex traversal order is the same as the vertex order in the base mesh. During the encoding traversal, an edge is selected as the currently traversed edge τ1, and the triangle formed by connecting another already-encoded vertex is used as half of a parallelogram to predict the three-dimensional geometric coordinates of the to-be-encoded vertex opposite the current edge; that is, point A2 in the figure serves as the predicted vertex. The coordinate difference between the predicted vertex and the true vertex (A3) is then computed and entropy-encoded to form the geometry information sub-stream, where S2 is the prediction triangle and S3 is the triangle to be encoded.
In addition, two or three parallelograms may also be used to predict the geometric coordinates to be encoded; the specific encoding method is not emphasized here.
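For illustration only, the following Python sketch produces the parallelogram-prediction residuals that the entropy coder would encode; the (a, b, opp, new) traversal representation is an assumption of the sketch.

```python
import numpy as np

def geometry_residuals(v, traversal):
    """Parallelogram-prediction residuals along an encoding traversal.

    v: (N, 3) vertex array. traversal: list of (a, b, opp, new) where
    (a, b) is the current edge tau1, opp is the already-encoded vertex of
    the prediction triangle S2, and new is the vertex of triangle S3 to
    be encoded. The residuals are what the entropy coder would encode.
    """
    out = []
    for a, b, opp, new in traversal:
        predicted = v[a] + v[b] - v[opp]       # point A2 in Figure 8
        out.append(v[new] - predicted)         # residual toward A3
    return np.asarray(out)
```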
(4) UV coordinate encoding (whether the UV coordinates are encoded is controlled by the identification)
Input: the UV coordinates of the base mesh, the connectivity and the connectivity-encoding vertex order;
Output: the encoded UV coordinate sub-stream.
The UV coordinates can be encoded using the following parallelogram prediction method:
Referring to Figure 8, triangle S1 is a triangle whose UV coordinates have already been encoded. When connectivity is encoded, the traversal order of the vertices to be encoded is the same as the vertex order of the connectivity encoding; when connectivity is not encoded, the vertex traversal order is the same as the vertex order in the base mesh. During the encoding traversal, an edge is selected as the currently traversed edge τ1, and the triangle formed by connecting another already-encoded vertex is used as half of a parallelogram to predict the UV coordinates of the to-be-encoded vertex opposite the current edge; that is, point A2 in the figure serves as the predicted vertex. The coordinate difference between the predicted vertex and the true vertex is then computed and entropy-encoded to form the UV coordinate sub-stream.
In addition, two or three parallelograms may also be used to predict the UV coordinates to be encoded; the specific encoding method is not emphasized here.
If the identification marks that the UV coordinates need not be encoded, this module is skipped and the UV coordinates are not encoded.
After the encoding of the base mesh is completed, the base mesh code stream needs to be decoded to obtain the distorted geometry information and UV coordinates. If the identification marks that the UV coordinates are not encoded, the UV coordinates of the base mesh are used to correct the vertex offsets. The vertex offset vector values in the displacement information are corrected according to the decoded geometry information and UV coordinates (whether the decoded UV coordinates or the pre-encoding base mesh UV coordinates are used is determined according to the identification). The updated displacement information is then encoded; one feasible displacement information encoding method is the linear wavelet transform.
1) Transform and update of the displacement information
Input: the displacement information;
Output: the transformed displacement information.
In the update process, the value of each vertex v is updated using a linear combination of the signals of the vertices in v*, where v* is the set of vertices adjacent to vertex v and Signal(v) is the geometry or attribute value of vertex v.
The wavelet transform process computes, for each vertex v inserted at the midpoint of the edge between v1 and v2, the prediction residual $Signal(v)\leftarrow Signal(v)-\tfrac{1}{2}\left(Signal(v_1)+Signal(v_2)\right)$, where Signal(v), Signal(v1) and Signal(v2) are the geometry or attribute values of vertices v, v1 and v2 respectively.
Note that the update process in the scheme can be skipped, i.e., the displacement information is not updated and is encoded directly.
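For illustration only, the following Python sketch applies one level of the linear wavelet: the prediction step subtracts the midpoint average as described above, while the 1/(2·|v*|) update weight is an assumption of the sketch, since the update coefficients are not fixed here and the update step may be skipped entirely.

```python
def forward_lifting(signal, inserted_edges, neighbors, do_update=True):
    """One level of the linear wavelet described above.

    signal: dict vertex -> value. inserted_edges: dict mapping each
    inserted vertex v to its parent edge (v1, v2). neighbors: dict
    mapping each coarse vertex to its set v* of adjacent inserted
    vertices. The 1/(2*|v*|) update weight is illustrative only.
    """
    for v, (v1, v2) in inserted_edges.items():   # prediction step
        signal[v] = signal[v] - 0.5 * (signal[v1] + signal[v2])
    if do_update:
        for v, vstar in neighbors.items():       # update step (assumed weights)
            if vstar:
                signal[v] = signal[v] + sum(signal[w] for w in vstar) / (2 * len(vstar))
    return signal
```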
2) Encoding of the transformed displacement information
Input: the transformed displacement information;
Output: the displacement information sub-stream.
After the transformed displacement information is quantized, the following methods can be used to arrange it into a 2D image:
Method 1: traverse the coefficients from low frequency to high frequency.
Method 2: for each coefficient, determine the index of the N×M pixel block (for example, N = M = 16) in which it should be stored following the raster order of the blocks; the position within the N×M pixel block is computed using the Morton order.
Other arrangement schemes can also be used, such as zigzag order, raster order, etc. The encoder can explicitly signal the arrangement scheme used in the bitstream.
After the information is arranged into the 2D image, the image can be encoded with any video encoder to obtain the displacement information sub-stream.
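For illustration only, the following Python sketch assigns coefficients to blocks in raster order and to positions within an N×M block in Morton order, as in Method 2.

```python
def morton_xy(index: int) -> tuple:
    """De-interleave a Morton (Z-order) index into (x, y) within a block;
    16 bits per axis supports blocks up to 65536 x 65536."""
    x = y = 0
    for bit in range(16):
        x |= ((index >> (2 * bit)) & 1) << bit
        y |= ((index >> (2 * bit + 1)) & 1) << bit
    return x, y

def place_coefficients(coeffs, n=16, m=16):
    """Assign each coefficient a (block_index, x, y) location: blocks in
    raster order, positions inside each block in Morton order."""
    out, per_block = [], n * m
    for i, c in enumerate(coeffs):
        block, within = divmod(i, per_block)
        x, y = morton_xy(within)
        out.append((block, x, y, c))
    return out
```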
Before the texture map is encoded, the displacement information sub-stream needs to be decoded and dequantized to obtain the distorted displacement information. This operation guarantees the consistency of the information used at the encoding and decoding ends. The reconstructed base mesh and the distorted displacement information are jointly used to generate the reconstructed mesh, and the reconstructed mesh and the original texture map are used to generate the new texture map.
(5) Texture map regeneration
Input: the reconstructed mesh and the original texture map;
Output: the newly generated texture map.
The algorithm steps for generating the new texture map from the reconstructed mesh and the original texture map are as follows (see the sketch after this list for the final padding step):
a) First compute the bounding box of the original three-dimensional mesh to obtain the maximum search distance.
b) Compute the boundary edges of the target three-dimensional mesh in texture space.
c) Divide the faces of the original three-dimensional mesh into a uniform grid.
d) Traverse all faces of the target three-dimensional mesh and rasterize the target texture map using the RGBA values corresponding to the original texture map.
e) Compute the bounding box of the current face in texture space, then sample the center point of each pixel within this bounding box, and determine the pixel positions corresponding to the current face by judging the inside/outside relation of the sampled points to the current face and whether external sampled points at the boundary of the current face influence the interior of the current face in texture space.
f) Within the maximum search distance, search the original three-dimensional mesh, divided into the uniform grid, for the respective nearest points to the three points of the current face of the target three-dimensional mesh, obtaining the nearest face; this face serves as the face of the original mesh corresponding to the current face, giving the texture coordinates of the current face in the original three-dimensional mesh.
g) From the corresponding texture coordinates, compute the pixel RGBA values at the corresponding position of the original texture map in the corresponding face of the original mesh, and assign them to the pixel positions of the current face of the target three-dimensional mesh in the target texture map.
h) Rasterization ends when all faces have been traversed.
i) Convert the alpha value of the pixels on the boundary edges to 255 to smooth the boundary; finally, to facilitate encoding and save bit rate, fill the generated target texture map using the pull-push filling algorithm (whether to fill is optional).
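For illustration only, the following Python sketch shows a simplified pull-push filling of the unrasterized texels mentioned in step i); production implementations differ in their filtering, so this is a sketch of the idea only.

```python
import numpy as np

def pull_push_fill(img, valid):
    """Simplified pull-push padding of undefined texels.

    img: (H, W, C) float image; valid: (H, W) bool mask of rasterized
    texels. Invalid texels are filled from coarser averaged levels.
    """
    h, w = valid.shape
    if h == 1 and w == 1:
        return img
    # Pull: average valid texels into a half-resolution level.
    h2, w2 = (h + 1) // 2, (w + 1) // 2
    sub = np.zeros((h2, w2, img.shape[2]), img.dtype)
    cnt = np.zeros((h2, w2), np.int32)
    for y in range(h):
        for x in range(w):
            if valid[y, x]:
                sub[y // 2, x // 2] += img[y, x]
                cnt[y // 2, x // 2] += 1
    sub_valid = cnt > 0
    sub[sub_valid] /= cnt[sub_valid][:, None]
    # Recurse so every coarse texel becomes defined.
    sub = pull_push_fill(sub, sub_valid)
    # Push: copy coarse values into still-invalid fine texels.
    out = img.copy()
    for y in range(h):
        for x in range(w):
            if not valid[y, x]:
                out[y, x] = sub[y // 2, x // 2]
    return out
```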
For the new texture map, a video encoder can usually be used directly to encode the frame-by-frame texture maps, for example encoders such as High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC), forming the attribute sub-stream. Any video encoder can be selected here.
Finally, the sub-streams are mixed to form the output mesh encoding code stream.
In the embodiments of this application, the encoding end encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information, where the base mesh includes the reconstructed texture coordinate information corresponding to the target three-dimensional mesh, to obtain the first code stream; obtains the second code stream according to the mesh difference information; obtains the third code stream according to the reconstructed texture map information; and generates the target code stream according to the first code stream, the second code stream and the third code stream. Since reconstructed texture coordinate data accounts for a large proportion of a three-dimensional mesh, in the embodiments of this application the reconstructed texture coordinate information in the base mesh can, according to the first identification information, be left unencoded, which can greatly save bit rate and improve coding efficiency.
As shown in Figure 9, an embodiment of this application also provides a decoding method, including:
Step 901: A decoding end decomposes an acquired target code stream to obtain a first code stream, a second code stream and a third code stream, where the first code stream is obtained based on a base mesh corresponding to a target three-dimensional mesh, the second code stream is obtained based on mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and a three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to reconstructed texture map information.
Step 902: When the decoding end determines that the first code stream includes reconstructed texture coordinate information, the target three-dimensional mesh is reconstructed according to a first decoding result corresponding to the first code stream, a second decoding result corresponding to the second code stream and a third decoding result corresponding to the third code stream.
Step 903: When the decoding end determines that the first code stream does not include reconstructed texture coordinate information, reconstructed texture coordinate information is generated, and the target three-dimensional mesh is reconstructed according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
In the embodiments of this application, the encoding end can choose, based on the first identification information, not to encode the reconstructed texture coordinate information in the base mesh; in this case the decoding end can generate the reconstructed texture coordinate information from the already-decoded information, which, in lossy mode, can greatly save bit rate and improve coding efficiency.
Optionally, the method of the embodiments of this application further includes:
the decoding end decomposes the acquired target code stream to obtain first identification information, where the first identification information is used to characterize whether the encoding end encodes the reconstructed texture coordinate information;
according to the first identification information, determining whether the first code stream includes reconstructed texture coordinate information.
In the embodiments of this application, the encoding end encodes the first identification information indicating whether the reconstructed texture coordinate information is encoded, so that the decoding end can determine, according to the first identification information, whether reconstructed texture coordinate information needs to be generated.
Optionally, after the decoding end decomposes the acquired target code stream to obtain the first code stream, the second code stream and the third code stream, the method further includes:
decoding the first code stream to obtain the first decoding result;
determining, according to the first decoding result, whether the first code stream includes reconstructed texture coordinate information.
In the embodiments of this application, the encoding end may also not encode the above first identification information; in this case the decoding end can determine, according to the first decoding result, whether reconstructed texture coordinate information is included.
Optionally, the first decoding result also includes: the geometry information and connectivity information corresponding to the target three-dimensional mesh.
Optionally, generating the reconstructed texture coordinate information includes: generating the reconstructed texture coordinate information according to a texture coordinate resampling algorithm based on the geometry information and connectivity information.
Here, the decoding end reconstructs the UV coordinates using the same UV coordinate generation method as the encoding end, obtaining the reconstructed texture coordinate information.
Optionally, the decoding end decomposing the acquired target code stream to obtain the first code stream, the second code stream and the third code stream includes: the decoding end decomposing the acquired target code stream to obtain the first code stream, the second code stream, the third code stream and a fourth code stream, where the fourth code stream is determined based on the patch information of the target three-dimensional mesh; and
reconstructing the target three-dimensional mesh according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream includes: reconstructing the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result and a fourth decoding result corresponding to the fourth code stream; or, reconstructing the target three-dimensional mesh according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream includes: reconstructing the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result, the fourth decoding result corresponding to the fourth code stream and the generated reconstructed texture coordinate information.
In the embodiments of this application, the decoding framework of the three-dimensional mesh is shown in Figure 10. First, the target code stream is decomposed into a patch information sub-stream, a geometry information sub-stream, a connectivity sub-stream, a UV coordinate sub-stream (if any), a texture map sub-stream and a displacement information sub-stream, and these sub-streams are decoded separately. If the code stream includes a UV coordinate sub-stream, the UV coordinates do not need to be regenerated; if the code stream does not include UV coordinates, the same UV coordinate generation algorithm as at the encoding end is needed to regenerate the UV coordinates. Finally, the three-dimensional mesh is reconstructed using the decoded information of each channel. The texture map sub-stream and the displacement information sub-stream are decoded using a video decoder, and the geometry information, connectivity and UV coordinate sub-streams are decoded using the decoder corresponding to the encoding method of the encoding end. The decoding of the various kinds of information is introduced below.
1) Connectivity decoding
Input: the connectivity sub-stream to be decoded;
Output: the connectivity of the three-dimensional mesh and the decoded vertex order.
The connectivity sub-stream is first decoded to obtain the mode string. The connectivity is reconstructed from the corresponding modes in the string in encoding order, and the traversal attributes of the vertices are output to the geometry information and UV coordinate decoding modules.
2) Geometry information decoding
Input: the geometry information sub-stream, the decoded displacement information and the decoding order of the connectivity;
Output: the geometry information of the three-dimensional mesh.
The decoding process of the mesh geometric coordinates is the inverse of the encoding process: the coordinate prediction residual is first entropy-decoded; then, from the already-decoded triangle, the predicted coordinates of the point to be decoded are computed according to the parallelogram rule; adding the entropy-decoded residual value to the predicted coordinates yields the geometric coordinate position to be decoded. The vertex traversal order here is the same as the vertex order used when encoding the connectivity; when connectivity is not encoded, the vertex traversal order is the same as the vertex order in the base mesh. Note that the geometric coordinates of the initial triangle are not predictively encoded; their geometric coordinate values are encoded directly. After the decoding end decodes the geometric coordinates of this triangle, it serves as the initial triangle from which the geometric coordinates of the other triangle vertices are traversed and decoded. In addition, two or three parallelograms may be used to predict the coordinates to be decoded; the specific prediction method is not emphasized.
After the geometry information is decoded, the decoded displacement information needs to be used to correct it. The correction method is to displace the corresponding vertex along the normal vector direction by the displacement value in the displacement information, finally obtaining the corrected geometry information.
3) UV coordinate decoding and reconstruction (whether to decode the UV coordinates is determined by the first identification)
Input: the UV coordinate code stream to be decoded, the decoded and corrected geometry information and the decoding order of the connectivity;
Output: the reconstructed UV coordinate information of the three-dimensional mesh.
If the code stream contains a UV coordinate sub-stream, the decoding process of the mesh UV coordinates is the inverse of the encoding process: the coordinate prediction residual is first entropy-decoded; then, from the already-decoded triangle, the predicted coordinates of the point to be decoded are computed according to the parallelogram rule; adding the entropy-decoded residual value to the predicted coordinates yields the UV coordinate position to be decoded. Note that the UV coordinates of the initial triangle are not predictively encoded; their UV coordinate values are encoded directly. After the decoding end decodes the UV coordinates of this triangle, it serves as the initial triangle from which the UV coordinates of the other triangle vertices are traversed and decoded. In addition, two or three parallelograms may be used to predict the UV coordinates to be decoded; the specific prediction method is not emphasized.
If the code stream does not contain a UV coordinate sub-stream, the same UV coordinate generation algorithm as at the encoding end is used to generate the UV coordinates from the decoded geometry information and connectivity.
After the UV coordinates are decoded or reconstructed, the decoded displacement information needs to be used to correct them. The correction method is to displace the corresponding vertex along the normal vector direction by the displacement value in the displacement information, finally obtaining the corrected UV coordinates.
4) Texture map decoding
Input: the texture map sub-stream;
Output: the texture map.
The texture map is decoded frame by frame directly with a video decoder; the file format of the texture map is not emphasized here, and the format can be jpg, png, etc.
In the embodiments of this application, the encoding end can choose, based on the first identification information, not to encode the reconstructed texture coordinate information in the base mesh; in this case the decoding end can generate the reconstructed texture coordinate information from the already-decoded information, which, in lossy mode, can greatly save bit rate and improve coding efficiency.
The encoding method provided by the embodiments of this application may be executed by an encoding apparatus. In the embodiments of this application, the encoding apparatus executing the encoding method is taken as an example to describe the encoding apparatus provided by the embodiments of this application.
As shown in Figure 11, an embodiment of this application also provides an encoding apparatus 1100, applied to an encoding end, including:
a first encoding module 1101, used to encode, according to the first identification information, the base mesh corresponding to the target three-dimensional mesh to obtain the first code stream, where the base mesh includes the reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded;
a first acquisition module 1102, used to obtain the second code stream according to the mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded;
a second acquisition module 1103, used to obtain the third code stream according to the reconstructed texture map information, which is obtained according to the first code stream and the second code stream;
a first generation module 1104, used to generate the target code stream according to the first code stream, the second code stream and the third code stream.
Optionally, the first generation module includes:
a first acquisition sub-module, used to encode the first identification information to obtain the encoded first identification information;
a first generation sub-module, used to generate the target code stream according to the encoded first identification information, the first code stream and the second code stream.
Optionally, the base mesh also includes the geometry information and connectivity information corresponding to the target three-dimensional mesh.
Optionally, the first encoding module is used for:
when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is encoded, encoding the geometry information, connectivity information and reconstructed texture coordinate information to obtain the first code stream;
and/or, when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is not encoded, encoding the geometry information and connectivity information to obtain the first code stream.
Optionally, the apparatus of the embodiments of this application also includes:
a third acquisition module, used to decode and dequantize the first code stream to obtain the reconstructed base mesh before the second acquisition module obtains the third code stream according to the reconstructed texture map information;
a fourth acquisition module, used to decode and dequantize the second code stream to obtain the target mesh difference information;
a second generation module, used to generate the reconstructed texture map information according to a texture map generation algorithm based on the reconstructed base mesh and the target mesh difference information.
Optionally, the first acquisition module includes:
a second acquisition sub-module, used to decode the first code stream to obtain the reconstructed mesh corresponding to the first code stream;
an update sub-module, used to update the mesh difference information according to the reconstructed mesh to obtain the updated mesh difference information;
a first encoding sub-module, used to encode the updated mesh difference information to obtain the second code stream.
Optionally, the apparatus of the embodiments of this application also includes:
a fifth acquisition module, used, before the first encoding module encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, to: in the lossy encoding mode, simplify the three-dimensional mesh to be encoded to obtain the target three-dimensional mesh; and in the lossless encoding mode, determine the three-dimensional mesh to be encoded as the target three-dimensional mesh.
Optionally, the first generation module includes:
a third acquisition sub-module, used to obtain the fourth code stream according to the patch information of the target three-dimensional mesh;
a fourth acquisition sub-module, used to obtain the target code stream according to the first code stream, the second code stream, the third code stream and the fourth code stream.
In the embodiments of this application, the encoding end encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, obtains the second code stream according to the mesh difference information, and generates the target code stream according to the first code stream and the second code stream. Since reconstructed texture coordinate data accounts for a large proportion of a three-dimensional mesh, in the embodiments of this application the reconstructed texture coordinate information in the base mesh can, according to the first identification information, be left unencoded, which can greatly save bit rate and improve coding efficiency.
This apparatus embodiment corresponds to the encoding method embodiment shown in Figure 1 above; each implementation process and implementation manner of the encoding end in the above method embodiment can be applied to this apparatus embodiment and can achieve the same technical effect.
Specifically, an embodiment of this application also provides an encoding device. As shown in Figure 12, the encoding device 1200 includes: a processor 1201, a network interface 1202 and a memory 1203, where the network interface 1202 is, for example, a common public radio interface (CPRI).
Specifically, the encoding device 1200 of the embodiment of this application also includes: instructions or programs stored in the memory 1203 and executable on the processor 1201. The processor 1201 calls the instructions or programs in the memory 1203 to execute the methods executed by the modules shown in Figure 11 and achieves the same technical effect; to avoid repetition, details are not repeated here.
The decoding method provided by the embodiments of this application may be executed by a decoding apparatus. In the embodiments of this application, the decoding apparatus executing the decoding method is taken as an example to describe the decoding apparatus provided by the embodiments of this application.
As shown in Figure 13, an embodiment of this application also provides a decoding apparatus 1300, applied to a decoding end, including:
a sixth acquisition module 1301, used to decompose the acquired target code stream to obtain the first code stream, the second code stream and the third code stream, where the first code stream is obtained based on the base mesh corresponding to the target three-dimensional mesh, the second code stream is obtained based on the mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to the reconstructed texture map information;
a reconstruction module 1302, used to reconstruct the target three-dimensional mesh, when the decoding end determines that the first code stream includes reconstructed texture coordinate information, according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream; and/or, used to generate reconstructed texture coordinate information when the decoding end determines that the first code stream does not include reconstructed texture coordinate information, and reconstruct the target three-dimensional mesh according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
Optionally, the apparatus of the embodiments of this application also includes:
a seventh acquisition module, used to decompose the acquired target code stream to obtain first identification information, where the first identification information is used to characterize whether the encoding end encodes the reconstructed texture coordinate information;
a first determination module, used to determine, according to the first identification information, whether the first code stream includes reconstructed texture coordinate information.
Optionally, the apparatus of the embodiments of this application also includes:
an eighth acquisition module, used to decode the first code stream to obtain the first decoding result after the sixth acquisition module decomposes the acquired target code stream to obtain the first code stream, the second code stream and the third code stream;
a second determination module, used to determine, according to the first decoding result, whether the first code stream includes reconstructed texture coordinate information.
Optionally, the first decoding result also includes: the geometry information and connectivity information corresponding to the target three-dimensional mesh.
Optionally, the reconstruction module is used to generate the reconstructed texture coordinate information according to a texture coordinate resampling algorithm based on the geometry information and connectivity information.
Optionally, the sixth acquisition module is used to decompose the acquired target code stream to obtain the first code stream, the second code stream, the third code stream and the fourth code stream, where the fourth code stream is determined based on the patch information of the target three-dimensional mesh; and
the reconstruction module is used to reconstruct the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result and the fourth decoding result corresponding to the fourth code stream; or to reconstruct the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result, the fourth decoding result corresponding to the fourth code stream and the generated reconstructed texture coordinate information.
In the embodiments of this application, the encoding end can choose, based on the first identification information, not to encode the reconstructed texture coordinate information in the base mesh; in this case the decoding end can generate the reconstructed texture coordinate information from the already-decoded information, which, in lossy mode, can greatly save bit rate and improve coding efficiency.
It should be noted that this apparatus embodiment corresponds to the method embodiment shown in Figure 9 above; all implementations of the decoding end in the above method embodiment are applicable to this apparatus embodiment and can achieve the same technical effects, which are not repeated here.
An embodiment of this application also provides a decoding device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor. When the program or instruction is executed by the processor, each process of the above decoding method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
An embodiment of this application also provides an encoding device, including a processor, a memory, and a program or instruction stored in the memory and executable on the processor. When the program or instruction is executed by the processor, each process of the above encoding method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
An embodiment of this application also provides a readable storage medium on which a program or instruction is stored. When the program or instruction is executed by a processor, each process of the above encoding method or decoding method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
The processor is the processor in the decoding device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disk, etc.
An embodiment of this application also provides an encoding device, including a processor and a communication interface, where the processor is used to encode the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, where the base mesh includes the reconstructed texture coordinate information corresponding to the target three-dimensional mesh and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded; obtain the second code stream according to the mesh difference information, where the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded; obtain the third code stream according to the reconstructed texture map information, which is obtained according to the first code stream and the second code stream; and generate the target code stream according to the first code stream, the second code stream and the third code stream.
This encoding device embodiment corresponds to the above encoding method embodiment; each implementation process and implementation manner of the above method embodiment can be applied to this encoding device embodiment and can achieve the same technical effect.
An embodiment of this application also provides a decoding device, including a processor and a communication interface, where the processor is used to decompose the acquired target code stream to obtain the first code stream, the second code stream and the third code stream, where the first code stream is obtained based on the base mesh corresponding to the target three-dimensional mesh, the second code stream is obtained based on the mesh difference information, the mesh difference information is used to characterize the difference information between the base mesh and the three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to the reconstructed texture map information; when the first code stream includes reconstructed texture coordinate information, the target three-dimensional mesh is reconstructed according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream; when it is determined that the first code stream does not include reconstructed texture coordinate information, reconstructed texture coordinate information is generated, and the target three-dimensional mesh is reconstructed according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
This decoding device embodiment corresponds to the above decoding method embodiment; each implementation process and implementation manner of the above method embodiment can be applied to this decoding device embodiment and can achieve the same technical effect.
Specifically, an embodiment of this application also provides a decoding device. The structure of the decoding device is shown in Figure 14; the decoding device 1400 includes: a processor 1401, a network interface 1402 and a memory 1403, where the network interface 1402 is, for example, a common public radio interface (CPRI). Specifically, the decoding device 1400 of the embodiment of this application also includes: instructions or programs stored in the memory 1403 and executable on the processor 1401. The processor 1401 calls the instructions or programs in the memory 1403 to execute the methods executed by the modules shown in Figure 13 and achieves the same technical effect; to avoid repetition, details are not repeated here.
Optionally, as shown in Figure 15, an embodiment of this application also provides a communication device 1500, including a processor 1501 and a memory 1502, where the memory 1502 stores programs or instructions executable on the processor 1501. For example, when the communication device 1500 is an encoding device, each step of the above encoding method embodiment is implemented when the program or instruction is executed by the processor 1501, and the same technical effect can be achieved. When the communication device 1500 is a decoding device, each step of the above decoding method embodiment is implemented when the program or instruction is executed by the processor 1501, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
An embodiment of this application also provides a chip, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement each process of the above encoding method or decoding method embodiment, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
It should be understood that the chips mentioned in the embodiments of this application may also be called system-level chips, system chips, chip systems or system-on-chip chips, etc.
An embodiment of this application also provides a computer program/program product, stored in a storage medium and executed by at least one processor to implement each process of the above encoding method or decoding method embodiment, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
An embodiment of this application also provides a communication system, at least including: an encoding device and a decoding device. The encoding device may be the encoding device shown in Figure 12 and may be used to perform the steps of the encoding method shown in Figure 1; the decoding device may be the decoding device shown in Figure 14 and may be used to perform the steps of the decoding method shown in Figure 9, and the same technical effect can be achieved; to avoid repetition, details are not repeated here.
It should be noted that, as used herein, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or apparatus including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the sentence "including a ..." does not exclude the existence of other identical elements in the process, method, article or apparatus including that element. In addition, it should be pointed out that the scope of the methods and apparatuses in the embodiments of this application is not limited to performing the functions in the order shown or discussed, and may also include performing the functions in a substantially simultaneous manner or in the reverse order according to the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general hardware platform; of course, they can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the related art, can be embodied in the form of a computer software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the various embodiments of this application.
The embodiments of this application have been described above with reference to the accompanying drawings, but this application is not limited to the above specific embodiments, which are only illustrative rather than restrictive. Under the inspiration of this application, those of ordinary skill in the art can make many other forms without departing from the purpose of this application and the scope protected by the claims, all of which fall within the protection of this application.

Claims (31)

  1. An encoding method, comprising:
    an encoding end encoding, according to first identification information, a base mesh corresponding to a target three-dimensional mesh to obtain a first code stream, wherein the base mesh comprises reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded;
    the encoding end obtaining a second code stream according to mesh difference information, wherein the mesh difference information is used to characterize difference information between the base mesh and a three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded;
    the encoding end obtaining a third code stream according to reconstructed texture map information, wherein the reconstructed texture map information is obtained according to the first code stream and the second code stream;
    the encoding end generating a target code stream according to the first code stream, the second code stream and the third code stream.
  2. The method according to claim 1, wherein the encoding end generating the target code stream according to the first code stream and the second code stream comprises:
    encoding the first identification information to obtain encoded first identification information;
    generating the target code stream according to the encoded first identification information, the first code stream and the second code stream.
  3. The method according to claim 1, wherein the base mesh further comprises geometry information and connectivity information corresponding to the target three-dimensional mesh.
  4. The method according to claim 3, wherein the encoding end encoding, according to the first identification information, the base mesh corresponding to the target three-dimensional mesh to obtain the first code stream comprises:
    when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is encoded, encoding the geometry information, the connectivity information and the reconstructed texture coordinate information to obtain the first code stream;
    and/or, when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is not encoded, encoding the geometry information and the connectivity information to obtain the first code stream.
  5. The method according to claim 3, wherein before the encoding end obtains the third code stream according to the reconstructed texture map information, the method further comprises:
    decoding and dequantizing the first code stream to obtain a reconstructed base mesh;
    decoding and dequantizing the second code stream to obtain target mesh difference information;
    generating the reconstructed texture map information according to a texture map generation algorithm based on the reconstructed base mesh and the target mesh difference information.
  6. The method according to claim 1, wherein the encoding end obtaining the second code stream according to the mesh difference information comprises:
    decoding the first code stream to obtain a reconstructed mesh corresponding to the first code stream;
    updating the mesh difference information according to the reconstructed mesh to obtain updated mesh difference information;
    encoding the updated mesh difference information to obtain the second code stream.
  7. The method according to claim 1, wherein before the encoding end encodes, according to the first identification information, the base mesh corresponding to the target three-dimensional mesh to obtain the first code stream, the method further comprises:
    in a lossy encoding mode, simplifying the three-dimensional mesh to be encoded to obtain the target three-dimensional mesh;
    in a lossless encoding mode, determining the three-dimensional mesh to be encoded as the target three-dimensional mesh.
  8. The method according to claim 1, wherein the encoding end generating the target code stream according to the first code stream, the second code stream and the third code stream comprises:
    obtaining a fourth code stream according to patch information of the target three-dimensional mesh;
    obtaining the target code stream according to the first code stream, the second code stream, the third code stream and the fourth code stream.
  9. A decoding method, comprising:
    a decoding end decomposing an acquired target code stream to obtain a first code stream, a second code stream and a third code stream, wherein the first code stream is obtained based on a base mesh corresponding to a target three-dimensional mesh, the second code stream is obtained based on mesh difference information, the mesh difference information is used to characterize difference information between the base mesh and a three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to reconstructed texture map information;
    when the decoding end determines that the first code stream comprises reconstructed texture coordinate information, reconstructing the target three-dimensional mesh according to a first decoding result corresponding to the first code stream, a second decoding result corresponding to the second code stream and a third decoding result corresponding to the third code stream;
    when the decoding end determines that the first code stream does not comprise reconstructed texture coordinate information, generating reconstructed texture coordinate information, and reconstructing the target three-dimensional mesh according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  10. The method according to claim 9, further comprising:
    the decoding end decomposing the acquired target code stream to obtain first identification information, wherein the first identification information is used to characterize whether an encoding end encodes the reconstructed texture coordinate information;
    determining, according to the first identification information, whether the first code stream comprises reconstructed texture coordinate information.
  11. The method according to claim 9, wherein after the decoding end decomposes the acquired target code stream to obtain the first code stream, the second code stream and the third code stream, the method further comprises:
    decoding the first code stream to obtain the first decoding result;
    determining, according to the first decoding result, whether the first code stream comprises reconstructed texture coordinate information.
  12. The method according to any one of claims 9 to 11, wherein the first decoding result further comprises:
    geometry information and connectivity information corresponding to the target three-dimensional mesh.
  13. The method according to claim 12, wherein the generating reconstructed texture coordinate information comprises:
    generating the reconstructed texture coordinate information according to a texture coordinate resampling algorithm based on the geometry information and connectivity information.
  14. The method according to claim 9, wherein the decoding end decomposing the acquired target code stream to obtain the first code stream, the second code stream and the third code stream comprises:
    the decoding end decomposing the acquired target code stream to obtain the first code stream, the second code stream, the third code stream and a fourth code stream, wherein the fourth code stream is determined based on patch information of the target three-dimensional mesh; and
    the reconstructing the target three-dimensional mesh according to the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream comprises: reconstructing the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result and a fourth decoding result corresponding to the fourth code stream; or, the reconstructing the target three-dimensional mesh according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream comprises: reconstructing the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result, the fourth decoding result corresponding to the fourth code stream and the generated reconstructed texture coordinate information.
  15. An encoding apparatus, applied to an encoding end, comprising:
    a first encoding module, used to encode, according to first identification information, a base mesh corresponding to a target three-dimensional mesh to obtain a first code stream, wherein the base mesh comprises reconstructed texture coordinate information corresponding to the target three-dimensional mesh, and the first identification information is used to characterize whether the reconstructed texture coordinate information is encoded;
    a first acquisition module, used to obtain a second code stream according to mesh difference information, wherein the mesh difference information is used to characterize difference information between the base mesh and a three-dimensional mesh to be encoded, and the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded;
    a second acquisition module, used to obtain a third code stream according to reconstructed texture map information, wherein the reconstructed texture map information is obtained according to the first code stream and the second code stream;
    a first generation module, used to generate a target code stream according to the first code stream, the second code stream and the third code stream.
  16. The apparatus according to claim 15, wherein the first generation module comprises:
    a first acquisition sub-module, used to encode the first identification information to obtain encoded first identification information;
    a first generation sub-module, used to generate the target code stream according to the encoded first identification information, the first code stream and the second code stream.
  17. The apparatus according to claim 15, wherein the base mesh further comprises geometry information and connectivity information corresponding to the target three-dimensional mesh.
  18. The apparatus according to claim 17, wherein the first encoding module is used for:
    when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is encoded, encoding the geometry information, the connectivity information and the reconstructed texture coordinate information to obtain the first code stream;
    and/or, when the first identification information characterizes that the reconstructed texture coordinate information corresponding to the target three-dimensional mesh is not encoded, encoding the geometry information and the connectivity information to obtain the first code stream.
  19. The apparatus according to claim 17, further comprising:
    a third acquisition module, used to decode and dequantize the first code stream to obtain a reconstructed base mesh before the second acquisition module obtains the third code stream according to the reconstructed texture map information;
    a fourth acquisition module, used to decode and dequantize the second code stream to obtain target mesh difference information;
    a second generation module, used to generate the reconstructed texture map information according to a texture map generation algorithm based on the reconstructed base mesh and the target mesh difference information.
  20. The apparatus according to claim 15, wherein the first acquisition module comprises:
    a second acquisition sub-module, used to decode the first code stream to obtain a reconstructed mesh corresponding to the first code stream;
    an update sub-module, used to update the mesh difference information according to the reconstructed mesh to obtain updated mesh difference information;
    a first encoding sub-module, used to encode the updated mesh difference information to obtain the second code stream.
  21. The apparatus according to claim 15, further comprising:
    a fifth acquisition module, used, before the first encoding module encodes the base mesh corresponding to the target three-dimensional mesh according to the first identification information to obtain the first code stream, to: in a lossy encoding mode, simplify the three-dimensional mesh to be encoded to obtain the target three-dimensional mesh; and in a lossless encoding mode, determine the three-dimensional mesh to be encoded as the target three-dimensional mesh.
  22. The apparatus according to claim 15, wherein the first generation module comprises:
    a third acquisition sub-module, used to obtain a fourth code stream according to patch information of the target three-dimensional mesh;
    a fourth acquisition sub-module, used to obtain the target code stream according to the first code stream, the second code stream, the third code stream and the fourth code stream.
  23. A decoding apparatus, applied to a decoding end, comprising:
    a sixth acquisition module, used to decompose an acquired target code stream to obtain a first code stream, a second code stream and a third code stream, wherein the first code stream is obtained based on a base mesh corresponding to a target three-dimensional mesh, the second code stream is obtained based on mesh difference information, the mesh difference information is used to characterize difference information between the base mesh and a three-dimensional mesh to be encoded, the target three-dimensional mesh is obtained based on the three-dimensional mesh to be encoded, and the third code stream is obtained according to reconstructed texture map information;
    a reconstruction module, used to reconstruct the target three-dimensional mesh, when the decoding end determines that the first code stream comprises reconstructed texture coordinate information, according to a first decoding result corresponding to the first code stream, a second decoding result corresponding to the second code stream and a third decoding result corresponding to the third code stream; and/or, used to generate reconstructed texture coordinate information when the decoding end determines that the first code stream does not comprise reconstructed texture coordinate information, and reconstruct the target three-dimensional mesh according to the generated reconstructed texture coordinate information, the first decoding result corresponding to the first code stream, the second decoding result corresponding to the second code stream and the third decoding result corresponding to the third code stream.
  24. The apparatus according to claim 23, further comprising:
    a seventh acquisition module, used to decompose the acquired target code stream to obtain first identification information, wherein the first identification information is used to characterize whether an encoding end encodes the reconstructed texture coordinate information;
    a first determination module, used to determine, according to the first identification information, whether the first code stream comprises reconstructed texture coordinate information.
  25. The apparatus according to claim 23, further comprising:
    an eighth acquisition module, used to decode the first code stream to obtain the first decoding result after the sixth acquisition module decomposes the acquired target code stream to obtain the first code stream, the second code stream and the third code stream;
    a second determination module, used to determine, according to the first decoding result, whether the first code stream comprises reconstructed texture coordinate information.
  26. The apparatus according to any one of claims 23 to 25, wherein the first decoding result further comprises:
    geometry information and connectivity information corresponding to the target three-dimensional mesh.
  27. The apparatus according to claim 26, wherein the reconstruction module is used to generate the reconstructed texture coordinate information according to a texture coordinate resampling algorithm based on the geometry information and connectivity information.
  28. The apparatus according to claim 23, wherein the sixth acquisition module is used to decompose the acquired target code stream to obtain the first code stream, the second code stream, the third code stream and a fourth code stream, wherein the fourth code stream is determined based on patch information of the target three-dimensional mesh; and
    the reconstruction module is used to reconstruct the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result and a fourth decoding result corresponding to the fourth code stream; or to reconstruct the target three-dimensional mesh according to the first decoding result, the second decoding result, the third decoding result, the fourth decoding result corresponding to the fourth code stream and the generated reconstructed texture coordinate information.
  29. An encoding device, comprising a processor and a memory, wherein the memory stores a program or instruction executable on the processor, and the program or instruction, when executed by the processor, implements the steps of the encoding method according to any one of claims 1 to 8.
  30. A decoding device, comprising a processor and a memory, wherein the memory stores a program or instruction executable on the processor, and the program or instruction, when executed by the processor, implements the steps of the decoding method according to any one of claims 9 to 14.
  31. A readable storage medium, on which a program or instruction is stored, wherein the program or instruction, when executed by a processor, implements the steps of the encoding method according to any one of claims 1 to 8, or implements the steps of the decoding method according to any one of claims 9 to 14.
PCT/CN2023/096097 2022-05-31 2023-05-24 Encoding method, decoding method, apparatus and device WO2023231872A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210613984.5A CN117197263A (zh) 2022-05-31 Encoding method, decoding method, apparatus and device
CN202210613984.5 2022-05-31

Publications (1)

Publication Number Publication Date
WO2023231872A1 true WO2023231872A1 (zh) 2023-12-07

Family

ID=88983732

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/096097 WO2023231872A1 (zh) Encoding method, decoding method, apparatus and device

Country Status (2)

Country Link
CN (1) CN117197263A (zh)
WO (1) WO2023231872A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20060083111A (ko) * 2005-01-14 2006-07-20 Electronics and Telecommunications Research Institute Method for encoding and decoding texture coordinates of three-dimensional mesh information for effective texture mapping
CN101626509A (zh) * 2009-08-10 2010-01-13 Beijing University of Technology Three-dimensional mesh encoding and decoding method and encoding and decoding apparatus
CN104243958A (zh) * 2014-09-29 2014-12-24 Lenovo (Beijing) Co., Ltd. Encoding and decoding methods and encoding and decoding apparatuses for three-dimensional mesh data
US20180253867A1 (en) * 2017-03-06 2018-09-06 Canon Kabushiki Kaisha Encoding and decoding of texture mapping data in textured 3d mesh models
US20210090301A1 (en) * 2019-09-24 2021-03-25 Apple Inc. Three-Dimensional Mesh Compression Using a Video Encoder


Also Published As

Publication number Publication date
CN117197263A (zh) 2023-12-08

Similar Documents

Publication Publication Date Title
Krivokuća et al. A volumetric approach to point cloud compression–part ii: Geometry compression
Chou et al. A volumetric approach to point cloud compression—Part I: Attribute compression
US11412259B2 (en) Transform method, inverse transform method, coder, decoder and storage medium
Huang et al. Octree-Based Progressive Geometry Coding of Point Clouds.
Peng et al. Technologies for 3D mesh compression: A survey
US9734595B2 (en) Method and apparatus for near-lossless compression and decompression of 3D meshes and point clouds
WO2021062772A1 (zh) 预测方法、编码器、解码器及计算机存储介质
WO2013029232A1 (en) Multi-resolution 3d textured mesh coding
Krivokuća et al. A volumetric approach to point cloud compression
WO2021062771A1 (zh) 颜色分量预测方法、编码器、解码器及计算机存储介质
WO2023231872A1 (zh) Encoding method, decoding method, apparatus and device
US20220392114A1 (en) Method and apparatus for calculating distance based weighted average for point cloud coding
WO2024193373A1 (zh) Encoding processing method, decoding processing method and related device
Marvie et al. Coding of dynamic 3D meshes
WO2023193709A1 (zh) Encoding and decoding method, apparatus and device
WO2024212981A1 (zh) Three-dimensional mesh sequence encoding and decoding method and apparatus
WO2023155778A1 (zh) Encoding method, apparatus and device
WO2023179706A1 (zh) Encoding method, decoding method and terminal
WO2023155045A1 (zh) Prediction method and apparatus, encoder, decoder and codec system
WO2023197990A1 (zh) Encoding method, decoding method and terminal
WO2024174092A1 (zh) Encoding and decoding method, code stream, encoder, decoder and storage medium
WO2023133710A1 (zh) Encoding method, decoding method, encoder, decoder and codec system
WO2024213067A1 (en) Decoding method, encoding method, bitstream, decoder, encoder and storage medium
WO2024193487A1 (zh) Three-dimensional mesh displacement information encoding method, decoding method, apparatus and terminal
WO2024065408A1 (zh) Encoding and decoding method, code stream, encoder, decoder and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23815056

Country of ref document: EP

Kind code of ref document: A1