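WO2022191132A1 - Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device - Google Patents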

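Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device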

Info

Publication number
WO2022191132A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
unit
attribute information
data
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/009732
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
真人 大川
賀敬 井口
敏康 杉尾
孝啓 西
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Intellectual Property Corp of America
Original Assignee
Panasonic Intellectual Property Corp of America
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Intellectual Property Corp of America
Priority to JP2023505546A (patent JP7785744B2, ja)
Priority to CN202280019188.7A (patent CN116917943A, zh)
Publication of WO2022191132A1 (patent WO2022191132A1, ja)
Priority to US18/242,729 (patent US12432382B2, en)
Anticipated expiration (legal status)
Priority to US19/320,638 (patent US20260006246A1, en)
Priority to JP2025227277A (patent JP2026031682A, ja)
Ceased (legal status, current)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/463 Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present disclosure relates to a three-dimensional data encoding method, a three-dimensional data decoding method, a three-dimensional data encoding device, and a three-dimensional data decoding device.
  • 3D data is expected to spread across a wide range of fields, such as computer vision for the autonomous operation of automobiles or robots, map information, monitoring, infrastructure inspection, and video distribution.
  • Three-dimensional data is acquired in various ways, for example with a range sensor such as a range finder, a stereo camera, or a combination of multiple monocular cameras.
  • One of the three-dimensional data representation methods is called a point cloud, which expresses the shape of a three-dimensional structure using a group of points in a three-dimensional space.
  • A point cloud stores the position and color of each point.
  • Point clouds are expected to become mainstream as a method of expressing three-dimensional data, but they involve a very large amount of data. Therefore, when storing or transmitting 3D data, it becomes essential to compress the data by encoding, as with 2D moving images (for example, MPEG-4 AVC or HEVC standardized by MPEG).
  • point cloud compression is partially supported by a public library (Point Cloud Library) that performs point cloud-related processing.
  • Patent Document 1 Japanese Patent Document 1
  • An object of the present disclosure is to provide a three-dimensional data encoding method, a three-dimensional data decoding method, a three-dimensional data encoding device, or a three-dimensional data decoding device that can improve encoding efficiency.
  • A three-dimensional data encoding method transforms attribute information of a three-dimensional point of at least one frame among a plurality of frames constituting a sequence, and generates a bitstream by encoding the transformed attribute information. The bitstream further includes at least one first parameter of the transform, provided for the sequence, and at least one second parameter of the transform, provided for each of the at least one frame.
  • A three-dimensional data decoding method generates decoded attribute information by decoding a bitstream, and generates attribute information of a three-dimensional point of at least one frame among a plurality of frames constituting a sequence by inversely transforming the decoded attribute information. The bitstream further includes at least one first parameter of the inverse transform, provided for the sequence, and at least one second parameter of the inverse transform, provided for each of the at least one frame.
  • the present disclosure can provide a three-dimensional data encoding method, a three-dimensional data decoding method, a three-dimensional data encoding device, or a three-dimensional data decoding device that can improve encoding efficiency.
  • FIG. 1 is a diagram showing the configuration of a three-dimensional data encoding/decoding system according to Embodiment 1.
  • FIG. 2 is a diagram showing a configuration example of point cloud data according to the first embodiment.
  • FIG. 3 is a diagram showing a configuration example of a data file describing point cloud data information according to the first embodiment.
  • FIG. 4 is a diagram showing types of point cloud data according to the first embodiment.
  • FIG. 5 is a diagram showing a configuration of a first encoding unit according to Embodiment 1.
  • FIG. 6 is a block diagram of a first encoding unit according to Embodiment 1.
  • FIG. 7 is a diagram showing a configuration of a first decoding unit according to Embodiment 1.
  • FIG. 8 is a block diagram of a first decoding unit according to Embodiment 1.
  • FIG. 9 is a block diagram of a three-dimensional data encoding device according to Embodiment 1.
  • FIG. 10 is a diagram showing an example of position information according to Embodiment 1.
  • FIG. 11 is a diagram showing an example of an octree representation of position information according to Embodiment 1.
  • FIG. 12 is a block diagram of a three-dimensional data decoding device according to Embodiment 1.
  • FIG. 13 is a block diagram of an attribute information encoding unit according to Embodiment 1.
  • FIG. 14 is a block diagram of an attribute information decoding unit according to Embodiment 1.
  • FIG. 15 is a block diagram showing a configuration of an attribute information encoding unit according to Embodiment 1.
  • FIG. 16 is a block diagram of an attribute information encoding unit according to Embodiment 1.
  • FIG. 17 is a block diagram showing a configuration of an attribute information decoding unit according to Embodiment 1.
  • FIG. 18 is a block diagram of an attribute information decoding unit according to Embodiment 1.
  • FIG. 19 is a diagram showing a configuration of a second encoding section according to Embodiment 1.
  • FIG. 20 is a block diagram of a second encoding unit according to Embodiment 1.
  • FIG. 21 is a diagram showing a configuration of a second decoding unit according to Embodiment 1.
  • FIG. 22 is a block diagram of a second decoding unit according to Embodiment 1.
  • FIG. 23 is a diagram showing a protocol stack related to PCC-encoded data according to Embodiment 1.
  • FIG. 24 is a diagram showing configurations of an encoding unit and a multiplexing unit according to Embodiment 2.
  • FIG. 25 is a diagram showing a configuration example of encoded data according to Embodiment 2.
  • FIG. 26 is a diagram illustrating a configuration example of encoded data and a NAL unit according to Embodiment 2.
  • FIG. 27 is a diagram illustrating an example of semantics of pcc_nal_unit_type according to the second embodiment.
  • FIG. 28 is a diagram illustrating an example of the transmission order of NAL units according to Embodiment 2.
  • FIG. 29 is a flowchart of processing by the three-dimensional data encoding device according to Embodiment 2.
  • FIG. 30 is a flowchart of processing by the three-dimensional data decoding device according to Embodiment 2.
  • FIG. 31 is a flowchart of multiplexing processing according to Embodiment 2.
  • FIG. 32 is a flowchart of demultiplexing processing according to Embodiment 2.
  • FIG. 33 is a block diagram showing the configuration of a three-dimensional data encoding device according to Embodiment 3.
  • FIG. 34 is a block diagram showing a configuration of a three-dimensional data decoding device according to Embodiment 3.
  • FIG. 35 is a diagram showing a first example of SPS syntax according to Embodiment 3.
  • FIG. 36 is a diagram showing a configuration example of a bitstream according to Embodiment 3.
  • FIG. 37 is a diagram showing a second example of SPS syntax according to the third embodiment.
  • FIG. 38 is a diagram showing a first example of syntax of conversion information according to Embodiment 3.
  • FIG. 39 is a diagram showing a second example of syntax of conversion information according to Embodiment 3.
  • FIG. 40 is a flowchart showing a processing procedure of the 3D data encoding device according to Embodiment 3.
  • FIG. 41 is a flow chart showing a processing procedure of the three-dimensional data decoding device according to Embodiment 3.
  • FIG. 42 is a block diagram for explaining another example of processing of the 3D data encoding device according to Embodiment 3.
  • FIG. 43 is a block diagram for explaining another example of processing of the three-dimensional data decoding device according to Embodiment 3.
  • FIG. 44 is a diagram illustrating an example of SEI syntax according to the fourth embodiment.
  • FIG. 45 is a diagram illustrating a syntax example of ply_format_info( ) according to Embodiment 4.
  • FIG. 46 is a diagram illustrating a syntax example of las_format_info( ) according to Embodiment 4.
  • FIG. 47 is a diagram illustrating a syntax example of public_header_block( ) according to Embodiment 4.
  • FIG. 48 is a diagram illustrating a syntax example of variable_length_records( ) according to Embodiment 4.
  • FIG. 49 is a diagram illustrating a syntax example of point_data_records( ) according to Embodiment 4.
  • FIG. 50 is a diagram illustrating a syntax example of extended_variable_length_records( ) according to Embodiment 4.
  • FIG. 51 is a block diagram of a three-dimensional data encoding device according to the first example of Embodiment 4.
  • FIG. 52 is a block diagram of a three-dimensional data decoding device according to the first example of Embodiment 4.
  • FIG. 53 is a diagram illustrating an example of processing by a conversion unit according to the first example of the fourth embodiment.
  • FIG. 54 is a diagram illustrating an example of processing by a conversion unit according to the first example of the fourth embodiment.
  • FIG. 55 is a diagram illustrating an example of processing by the inverse transforming unit according to the first example of Embodiment 4.
  • FIG. 56 is a diagram illustrating an example of processing by the inverse transforming unit according to the first example of Embodiment 4.
  • FIG. 57 is a diagram showing a syntax example of SPS according to Embodiment 4.
  • FIG. 58 is a diagram illustrating a syntax example of attribute_parameter(i) according to Embodiment 4.
  • FIG. 59 is a flowchart of three-dimensional data encoding processing according to Embodiment 4.
  • FIG. 60 is a flowchart of three-dimensional data decoding processing according to Embodiment 4.
  • FIG. 61 is a block diagram of a three-dimensional data encoding device according to a second example of Embodiment 4.
  • FIG. 62 is a diagram illustrating an example of processing by a conversion unit according to the second example of the fourth embodiment.
  • FIG. 63 is a diagram illustrating an example of processing by an inverse transforming unit according to the second example of Embodiment 4.
  • FIG. 64 is a block diagram showing the configuration of a three-dimensional data encoding apparatus according to the third example of Embodiment 4.
  • FIG. 65 is a diagram depicting an example of processing by a conversion unit according to the third example of the fourth embodiment.
  • FIG. 66 is a diagram illustrating an example of processing by a conversion unit according to the third example of the fourth embodiment.
  • FIG. 67 is a diagram illustrating an example of processing by a conversion unit according to the third example of the fourth embodiment.
  • FIG. 68 is a diagram illustrating an example of processing by an inverse transforming unit according to the third example of Embodiment 4.
  • FIG. 69 is a diagram illustrating an example of processing by an inverse transforming unit according to the third example of Embodiment 4.
  • FIG. 70 is a diagram illustrating an example of processing by an inverse transforming unit according to the third example of Embodiment 4.
  • FIG. 71 is a block diagram of a three-dimensional data encoding device according to a fourth example of Embodiment 4.
  • FIG. 72 is a block diagram of a three-dimensional data decoding device according to a fourth example of Embodiment 4.
  • FIG. 73 is a diagram illustrating an example of processing by a conversion unit according to the fifth example of the fourth embodiment.
  • FIG. 74 is a diagram showing a syntax example of SPS and SEI according to Embodiment 4.
  • FIG. 75 is a diagram showing a storage example of conversion information according to Embodiment 4.
  • FIG. 76 is a flowchart of three-dimensional data encoding processing according to Embodiment 4.
  • FIG. 77 is a flowchart of three-dimensional data decoding processing according to Embodiment 4.
  • A three-dimensional data encoding method transforms attribute information of a three-dimensional point of at least one frame among a plurality of frames constituting a sequence, and generates a bitstream by encoding the transformed attribute information. The bitstream further includes at least one first parameter of the transform, provided for the sequence, and at least one second parameter of the transform, provided for each of the at least one frame.
  • the three-dimensional data encoding method can, for example, selectively control the switching of transformation parameters both in units of sequences and in units of frames. As a result, appropriate conversion processing can be performed, and coding efficiency can be improved.
  • For example, the attribute information of the frame to be processed may be transformed using the second parameter.
  • For example, the attribute information of the frame to be processed may be transformed using the first parameter.
  • In the transform, at least one of (i) multiplication or division by a first value and (ii) addition or subtraction of a second value may be performed on the attribute information, and each of the at least one first parameter and the at least one second parameter may indicate at least one of the first value or the second value.
  • the bitstream includes multiple types of information of the three-dimensional point including the attribute information, and the bitstream further includes first information indicating whether each of the multiple types of information is to be compressed.
  • the multiple types of information may include multiple types of attribute information of the three-dimensional point.
  • the plurality of types of information may include positional information of the three-dimensional point.
  • the bitstream includes first control information for the sequence, the first control information includes information indicating a list of the plurality of types of information, and the first information is included in the first control information and includes information indicating the number of the information to be compressed in the list.
  • the bitstream may further include second information indicating the format type of the point cloud data including the attribute information.
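  • The signalling described above can be sketched as follows. This is a minimal illustration, not the syntax defined in the disclosure: the class `SequenceControlInfo` and its field names are hypothetical, and reading "the number of the information to be compressed in the list" as a list index is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SequenceControlInfo:
    """Hypothetical sequence-level control information (illustrative only)."""
    # List of the types of information carried for each 3D point.
    info_types: List[str]
    # Assumed encoding of the "first information": indices into info_types
    # of the entries that are to be compressed.
    compressed_indices: List[int] = field(default_factory=list)
    # Assumed encoding of the "second information": a format-type label.
    format_type: str = "ply"

    def is_compressed(self, name: str) -> bool:
        """Return True if the named information type is marked for compression."""
        return self.info_types.index(name) in self.compressed_indices


sps = SequenceControlInfo(
    info_types=["position", "color", "reflectance"],
    compressed_indices=[0, 1],
)
```

  • With this sketch, `sps.is_compressed("color")` is true and `sps.is_compressed("reflectance")` is false, so a decoder could tell which entries need decompression before use.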
  • A three-dimensional data decoding method generates decoded attribute information by decoding a bitstream, and generates attribute information of a three-dimensional point of at least one frame among a plurality of frames constituting a sequence by inversely transforming the decoded attribute information. The bitstream further includes at least one first parameter of the inverse transform, provided for the sequence, and at least one second parameter of the inverse transform, provided for each of the at least one frame.
  • the three-dimensional data decoding method can decode attribute information from a bitstream with improved coding efficiency.
  • the decoding attribute information of the frame to be processed may be inversely transformed using the second parameter.
  • the first parameter may be used to inverse transform the decoding attribute information of the frame to be processed.
  • In the inverse transform, at least one of (i) multiplication or division by a first value and (ii) addition or subtraction of a second value may be performed on the decoded attribute information, and each of the at least one first parameter and the at least one second parameter may indicate at least one of the first value or the second value.
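  • The scale-and-offset transform and its inverse can be sketched as a round trip. This is an illustration under stated assumptions, not the method of the disclosure: the names `TransformParams`, `select_params`, `forward_transform`, and `inverse_transform` are hypothetical, the rule that a frame-level parameter overrides the sequence-level one is assumed, and the actual entropy coding between the two steps is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TransformParams:
    scale: float   # plays the role of the "first value" (multiply/divide)
    offset: float  # plays the role of the "second value" (add/subtract)


def select_params(sequence_params: TransformParams,
                  frame_params: Optional[TransformParams] = None) -> TransformParams:
    """Assumed rule: use the frame-level parameter when one is provided."""
    return frame_params if frame_params is not None else sequence_params


def forward_transform(values: List[float], p: TransformParams) -> List[float]:
    """Encoder side: subtract the offset, then divide by the scale."""
    return [(v - p.offset) / p.scale for v in values]


def inverse_transform(values: List[float], p: TransformParams) -> List[float]:
    """Decoder side: multiply by the scale, then add the offset."""
    return [v * p.scale + p.offset for v in values]


sequence_params = TransformParams(scale=2.0, offset=10.0)  # for the whole sequence
frame1_params = TransformParams(scale=4.0, offset=0.0)     # for frame 1 only

frame0 = [10.0, 14.0]
frame1 = [8.0, 16.0]

p0 = select_params(sequence_params)                 # frame 0 falls back to the sequence
p1 = select_params(sequence_params, frame1_params)  # frame 1 uses its own parameter
coded0 = forward_transform(frame0, p0)
coded1 = forward_transform(frame1, p1)
```

  • Applying `inverse_transform(coded0, p0)` and `inverse_transform(coded1, p1)` recovers the original values in this example because the chosen numbers divide exactly; with quantization in between, the round trip would generally be lossy.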
  • the bitstream includes multiple types of information of the three-dimensional point including the attribute information, and the bitstream further includes first information indicating whether each of the multiple types of information is to be compressed.
  • the multiple types of information may include multiple types of attribute information of the three-dimensional point.
  • the plurality of types of information may include positional information of the three-dimensional point.
  • the bitstream includes first control information for the sequence, the first control information includes information indicating a list of the plurality of types of information, and the first information is included in the first control information and includes information indicating the number of the information to be compressed in the list.
  • the bitstream may further include second information indicating the format type of the point cloud data including the attribute information.
  • A three-dimensional data encoding device includes a processor and a memory. Using the memory, the processor transforms attribute information of a three-dimensional point of at least one frame among a plurality of frames constituting a sequence, and generates a bitstream by encoding the transformed attribute information. The bitstream further includes at least one first parameter of the transform, provided for the sequence, and at least one second parameter of the transform, provided for each of the at least one frame.
  • the 3D data encoding device can, for example, selectively control the switching of transformation parameters both in units of sequences and in units of frames. As a result, appropriate conversion processing can be performed, and coding efficiency can be improved.
  • A three-dimensional data decoding device includes a processor and a memory. Using the memory, the processor generates decoded attribute information by decoding a bitstream, and generates attribute information of a three-dimensional point of at least one frame among a plurality of frames constituting a sequence by inversely transforming the decoded attribute information. The bitstream further includes at least one first parameter of the inverse transform, provided for the sequence, and at least one second parameter of the inverse transform, provided for each of the at least one frame.
  • the 3D data decoding device can decode the attribute information from the bitstream with improved coding efficiency.
  • The following describes a three-dimensional data encoding method and a three-dimensional data encoding device that provide a function of transmitting and receiving necessary information according to the application in encoded data of a three-dimensional point cloud, a three-dimensional data decoding method and a three-dimensional data decoding device for decoding the encoded data, a three-dimensional data multiplexing method for multiplexing the encoded data, and a three-dimensional data transmission method for transmitting the encoded data.
  • a first encoding method and a second encoding method are being studied as encoding methods (encoding schemes) for point cloud data.
  • the method of storing in the format is not defined, and MUX processing (multiplexing) in the encoding unit, transmission, or storage cannot be performed as it is.
  • PCC stands for Point Cloud Compression.
  • FIG. 1 is a diagram showing a configuration example of a three-dimensional data encoding/decoding system according to this embodiment.
  • the 3D data encoding/decoding system includes a 3D data encoding system 4601, a 3D data decoding system 4602, a sensor terminal 4603, and an external connection section 4604.
  • the three-dimensional data encoding system 4601 generates encoded data or multiplexed data by encoding point cloud data, which is three-dimensional data.
  • the 3D data encoding system 4601 may be a 3D data encoding device implemented by a single device, or may be a system implemented by a plurality of devices. Also, the 3D data encoding device may include some of the plurality of processing units included in the 3D data encoding system 4601.
  • the 3D data encoding system 4601 includes a point cloud data generation system 4611, a presentation unit 4612, an encoding unit 4613, a multiplexing unit 4614, an input/output unit 4615, and a control unit 4616.
  • the point cloud data generation system 4611 includes a sensor information acquisition unit 4617 and a point cloud data generation unit 4618.
  • the sensor information acquisition unit 4617 acquires sensor information from the sensor terminal 4603 and outputs the sensor information to the point cloud data generation unit 4618.
  • the point cloud data generation unit 4618 generates point cloud data from the sensor information and outputs the point cloud data to the encoding unit 4613.
  • the presentation unit 4612 presents sensor information or point cloud data to the user. For example, the presentation unit 4612 displays information or an image based on sensor information or point cloud data.
  • the encoding unit 4613 encodes (compresses) the point cloud data, and outputs the obtained encoded data, the control information obtained in the encoding process, and other additional information to the multiplexing unit 4614.
  • Additional information includes, for example, sensor information.
  • the multiplexing unit 4614 generates multiplexed data by multiplexing the encoded data input from the encoding unit 4613, control information, and additional information.
  • the format of multiplexed data is, for example, a file format for storage or a packet format for transmission.
  • the input/output unit 4615 (e.g., communication unit or interface) outputs the multiplexed data to the outside.
  • the multiplexed data is accumulated in an accumulation unit such as an internal memory.
  • a control unit 4616 (or an application execution unit) controls each processing unit. That is, the control unit 4616 controls encoding, multiplexing, and the like.
  • the sensor information may be input to the encoding unit 4613 or the multiplexing unit 4614. Also, the input/output unit 4615 may output the point cloud data or the encoded data as they are to the outside.
  • a transmission signal (multiplexed data) output from the 3D data encoding system 4601 is input to the 3D data decoding system 4602 via the external connection unit 4604.
  • a three-dimensional data decoding system 4602 generates point cloud data, which is three-dimensional data, by decoding encoded data or multiplexed data.
  • the 3D data decoding system 4602 may be a 3D data decoding device implemented by a single device, or may be a system implemented by a plurality of devices. Also, the 3D data decoding device may include some of the plurality of processing units included in the 3D data decoding system 4602.
  • the 3D data decoding system 4602 includes a sensor information acquisition unit 4621, an input/output unit 4622, a demultiplexing unit 4623, a decoding unit 4624, a presentation unit 4625, a user interface 4626, and a control unit 4627.
  • the sensor information acquisition unit 4621 acquires sensor information from the sensor terminal 4603.
  • the input/output unit 4622 acquires the transmission signal, decodes the multiplexed data (file format or packet) from the transmission signal, and outputs the multiplexed data to the demultiplexing unit 4623.
  • the demultiplexing unit 4623 acquires encoded data, control information and additional information from the multiplexed data, and outputs the encoded data, control information and additional information to the decoding unit 4624.
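  • The multiplexing and demultiplexing steps can be sketched as a round trip. The container below is a hypothetical length-prefixed layout chosen only for illustration; the source does not specify this format, and the names `FIELDS`, `multiplex`, and `demultiplex` are assumptions.

```python
import struct
from typing import Dict

# Fixed field order for this illustrative container (an assumption).
FIELDS = ("encoded_data", "control_info", "additional_info")


def multiplex(parts: Dict[str, bytes]) -> bytes:
    """Concatenate the three inputs, each preceded by a 4-byte big-endian length."""
    out = bytearray()
    for name in FIELDS:
        payload = parts[name]
        out += struct.pack(">I", len(payload))
        out += payload
    return bytes(out)


def demultiplex(blob: bytes) -> Dict[str, bytes]:
    """Recover the three inputs from a blob produced by multiplex()."""
    parts: Dict[str, bytes] = {}
    pos = 0
    for name in FIELDS:
        (length,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        parts[name] = blob[pos:pos + length]
        pos += length
    return parts


original = {
    "encoded_data": b"\x01\x02\x03",
    "control_info": b"sps",
    "additional_info": b"",
}
muxed = multiplex(original)
recovered = demultiplex(muxed)
```

  • A real file or packet format would also carry type identifiers and timing information; the length prefix here only shows how the three inputs can be separated again on the decoding side.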
  • the decoding unit 4624 reconstructs the point cloud data by decoding the encoded data.
  • the presentation unit 4625 presents the point cloud data to the user. For example, the presentation unit 4625 displays information or images based on point cloud data.
  • User interface 4626 acquires instructions based on user operations.
  • the control unit 4627 (or application execution unit) controls each processing unit. That is, the control unit 4627 controls demultiplexing, decoding, presentation, and the like.
  • the input/output unit 4622 may acquire the point cloud data or the encoded data as they are from the outside. Also, the presentation unit 4625 may acquire additional information such as sensor information and present information based on the additional information. Also, the presentation unit 4625 may present based on a user's instruction acquired by the user interface 4626.
  • the sensor terminal 4603 generates sensor information, which is information obtained by the sensor.
  • the sensor terminal 4603 is a terminal equipped with a sensor or a camera, and includes, for example, a moving object such as an automobile, a flying object such as an airplane, a mobile terminal, or a camera.
  • the sensor information that can be acquired by the sensor terminal 4603 is, for example, (1) the distance between the sensor terminal 4603 and the object, or the reflectance of the object, obtained from a LIDAR, millimeter wave radar, or infrared sensor, or (2) the distance between the camera and the object, or the reflectance of the object, obtained from a monocular camera image or a stereo camera image.
  • Sensor information may also include sensor attitude, orientation, gyro (angular velocity), position (GPS information or altitude), velocity, acceleration, or the like.
  • Sensor information may also include temperature, atmospheric pressure, humidity, magnetism, or the like.
  • the external connection unit 4604 is implemented by an integrated circuit (LSI or IC), an external storage unit, communication with a cloud server via the Internet, broadcasting, or the like.
  • FIG. 2 is a diagram showing the configuration of point cloud data.
  • FIG. 3 is a diagram showing a configuration example of a data file in which information about point cloud data is described.
  • Point cloud data includes data of multiple points. Data of each point includes position information (three-dimensional coordinates) and attribute information for the position information. A collection of multiple points is called a point cloud. For example, a point cloud indicates the three-dimensional shape of an object.
  • Position information such as three-dimensional coordinates is sometimes called geometry.
  • the data of each point may include attribute information (attribute) of a plurality of attribute types.
  • the attribute type is, for example, color or reflectance.
  • One attribute information may be associated with one location information, or attribute information having a plurality of different attribute types may be associated with one location information. Also, a plurality of pieces of attribute information of the same attribute type may be associated with one piece of position information.
  • The configuration example of the data file shown in FIG. 3 is an example in which position information and attribute information correspond one-to-one.
  • the positional information is, for example, three-axis information of x, y, and z.
  • the attribute information is, for example, RGB color information.
  • a typical data file is a ply file.
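  • As an illustration of the one-to-one layout described above, the following Python sketch models each point as a position (x, y, z) plus RGB color information and serializes the points as an ASCII PLY file. The names `Point` and `to_ascii_ply` are hypothetical and do not appear in this disclosure; the header follows the commonly used ASCII PLY convention.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Point:
    position: Tuple[float, float, float]  # position information (geometry): x, y, z
    color: Tuple[int, int, int]           # attribute information: R, G, B in 0..255


def to_ascii_ply(points: List[Point]) -> str:
    """Serialize points as ASCII PLY with one position and one color per line."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(points)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    body = [
        f"{p.position[0]} {p.position[1]} {p.position[2]} "
        f"{p.color[0]} {p.color[1]} {p.color[2]}"
        for p in points
    ]
    return "\n".join(header + body) + "\n"
```

  • A point carrying several attribute types (for example, color and reflectance) could be modeled by adding further fields to `Point` and further `property` lines to the header.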
  • FIG. 4 is a diagram showing types of point cloud data. As shown in FIG. 4, point cloud data includes static objects and dynamic objects.
  • a static object is 3D point cloud data at an arbitrary time (a certain time).
  • a dynamic object is 3D point cloud data that changes over time.
  • three-dimensional point cloud data at a certain time will be called a PCC frame or a frame.
  • the object may be a point cloud with a limited area, such as normal video data, or a large-scale point cloud with an unrestricted area, such as map information.
  • There may be point cloud data of various densities, including sparse point cloud data and dense point cloud data.
  • a point cloud data generation unit 4618 generates point cloud data based on the sensor information obtained by the sensor information acquisition unit 4617 .
  • the point cloud data generation unit 4618 generates position information as point cloud data, and adds attribute information for the position information to the position information.
  • the point cloud data generation unit 4618 may process the point cloud data when generating position information or adding attribute information. For example, the point cloud data generation unit 4618 may reduce the amount of data by deleting point clouds with overlapping positions. Also, the point cloud data generation unit 4618 may transform the position information (position shift, rotation, normalization, etc.), and may render the attribute information.
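  • The processing mentioned above can be sketched as follows. This is a minimal Python example, assuming integer positions, that removes points with overlapping positions and applies a position shift. The function names, and the policy of keeping the first point seen at each position, are assumptions made for illustration and are not specified in this disclosure.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int, int]
Color = Tuple[int, int, int]


def remove_duplicate_positions(
    points: List[Tuple[Position, Color]]
) -> List[Tuple[Position, Color]]:
    """Keep only the first point seen at each position (assumed policy)."""
    seen: Dict[Position, Color] = {}
    for pos, color in points:
        if pos not in seen:
            seen[pos] = color
    return list(seen.items())


def shift_to_origin(positions: List[Position]) -> List[Position]:
    """Translate positions so that the minimum coordinate on each axis becomes 0."""
    mins = tuple(min(p[a] for p in positions) for a in range(3))
    return [tuple(p[a] - mins[a] for a in range(3)) for p in positions]
```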
  • Although the point cloud data generation system 4611 is included in the three-dimensional data encoding system 4601 in FIG. 1, it may instead be provided independently outside the three-dimensional data encoding system 4601.
  • the encoding unit 4613 generates encoded data by encoding the point cloud data based on a predefined encoding method.
  • There are roughly two types of encoding methods, as follows. The first is an encoding method using position information, and this encoding method is hereinafter referred to as a first encoding method. The second is an encoding method using a video codec, and this encoding method is hereinafter referred to as a second encoding method.
  • the decoding unit 4624 decodes the point cloud data by decoding the encoded data based on a predefined encoding method.
  • the multiplexing unit 4614 generates multiplexed data by multiplexing the encoded data using an existing multiplexing method.
  • the generated multiplexed data is transmitted or stored.
  • the multiplexing unit 4614 multiplexes other media such as video, audio, subtitles, applications, files, or reference time information in addition to PCC-encoded data.
  • the multiplexing unit 4614 may further multiplex attribute information related to sensor information or point cloud data.
  • Multiplexing methods or file formats include ISOBMFF, MPEG-DASH, which is a transmission method based on ISOBMFF, MMT, MPEG-2 TS Systems, and RMP.
  • the demultiplexing unit 4623 extracts PCC-encoded data, other media, time information, etc. from the multiplexed data.
  • the input/output unit 4615 transmits the multiplexed data using a method suitable for the transmission medium or storage medium, such as broadcasting or communication.
  • the input/output unit 4615 may communicate with other devices via the Internet, or may communicate with a storage unit such as a cloud server.
  • As a communication protocol, http, ftp, TCP, UDP, or the like is used.
  • a PULL-type communication method may be used, or a PUSH-type communication method may be used.
  • Either wired transmission or wireless transmission may be used.
  • For wired transmission, Ethernet (registered trademark), USB (registered trademark), RS-232C, HDMI, a coaxial cable, or the like is used.
  • For wireless transmission, wireless LAN, Wi-Fi (registered trademark), Bluetooth (registered trademark), millimeter waves, or the like is used.
  • For broadcasting, for example, DVB-T2, DVB-S2, DVB-C2, ATSC3.0, or ISDB-S3 is used.
  • FIG. 5 is a diagram showing the configuration of the first encoding unit 4630, which is an example of the encoding unit 4613 that performs encoding according to the first encoding method.
  • FIG. 6 is a block diagram of the first encoding unit 4630. The first encoding unit 4630 generates encoded data (encoded stream) by encoding the point cloud data using the first encoding method.
  • This first encoding unit 4630 includes a position information encoding unit 4631, an attribute information encoding unit 4632, an additional information encoding unit 4633, and a multiplexing unit 4634.
  • the first encoding unit 4630 has the feature of performing encoding with the three-dimensional structure in mind. Also, the first encoding unit 4630 is characterized in that the attribute information encoding unit 4632 performs encoding using information obtained from the position information encoding unit 4631 .
  • the first encoding method is also called GPCC (Geometry based PCC).
  • The point cloud data is PCC point cloud data such as a PLY file, or PCC point cloud data generated from sensor information, and includes position information (Position), attribute information (Attribute), and other additional information (MetaData).
  • The position information is input to the position information encoding unit 4631, the attribute information is input to the attribute information encoding unit 4632, and the additional information is input to the additional information encoding unit 4633.
  • the position information encoding unit 4631 generates encoded position information (compressed geometry), which is encoded data, by encoding the position information.
  • The position information encoding unit 4631 encodes the position information using an N-ary tree structure such as an octree. Specifically, in the octree, the target space is divided into 8 nodes (subspaces), and 8-bit information (occupancy code) indicating whether or not each node contains a point cloud is generated. Each node containing a point cloud is further divided into 8 nodes, and 8-bit information indicating whether or not each of those 8 nodes contains a point cloud is generated. This process is repeated until a predetermined layer is reached or the number of points included in a node is equal to or less than a threshold.
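  • The octree subdivision described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the encoder of this disclosure: the bit order within the occupancy code (x, y, z from most to least significant child-index bit), the breadth-first output order, and the stopping rule (subdivide down to unit-size nodes) are all choices made for the example, and `child_index` and `encode_octree` are hypothetical names.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]


def child_index(p: Point, origin: Point, half: int) -> int:
    """Index 0..7 of the child subspace containing p (assumed bit order: x, y, z)."""
    x = 1 if p[0] >= origin[0] + half else 0
    y = 1 if p[1] >= origin[1] + half else 0
    z = 1 if p[2] >= origin[2] + half else 0
    return (x << 2) | (y << 1) | z


def encode_octree(points: List[Point], origin: Point, size: int) -> List[int]:
    """Return the 8-bit occupancy codes in breadth-first order.

    size is the edge length of the cubic root space and must be a power of two.
    Subdivision stops at unit-size nodes (assumed stopping rule).
    """
    codes: List[int] = []
    queue = [(points, origin, size)]
    while queue:
        pts, org, sz = queue.pop(0)
        if sz == 1:
            continue  # unit-size node: treated as a leaf, no occupancy code
        half = sz // 2
        children: List[List[Point]] = [[] for _ in range(8)]
        for p in pts:
            children[child_index(p, org, half)].append(p)
        code = 0
        for i, child in enumerate(children):
            if child:
                code |= 1 << i  # bit i is set when child i contains points
                child_org = (
                    org[0] + half * ((i >> 2) & 1),
                    org[1] + half * ((i >> 1) & 1),
                    org[2] + half * (i & 1),
                )
                queue.append((child, child_org, half))
        codes.append(code)
    return codes
```

  • Only occupied children are subdivided further, so empty regions of the space cost a single zero bit in the occupancy code of their parent.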
  • The attribute information encoding unit 4632 generates encoded attribute information (Compressed Attribute), which is encoded data, by encoding the attribute information using the configuration information generated by the position information encoding unit 4631. For example, the attribute information encoding unit 4632 determines a reference point (reference node) to be referenced in encoding the target point (target node) to be processed, based on the octree structure generated by the position information encoding unit 4631. For example, the attribute information encoding unit 4632 refers to a node whose parent node in the octree is the same as that of the target node among peripheral nodes or adjacent nodes. Note that the reference relationship determination method is not limited to this.
  • the attribute information encoding process may include at least one of the quantization process, the prediction process, and the arithmetic encoding process.
  • Here, referencing means using the reference node to calculate the predicted value of the attribute information, or using the state of the reference node (for example, occupancy information indicating whether or not the reference node includes a point cloud) to determine the encoding parameters.
  • the coding parameter is a quantization parameter in quantization processing, a context in arithmetic coding, or the like.
  • the additional information encoding unit 4633 generates encoded additional information (Compressed MetaData), which is encoded data, by encoding compressible data among the additional information.
  • the multiplexing unit 4634 multiplexes the encoded position information, encoded attribute information, encoded additional information, and other additional information to generate an encoded stream (compressed stream), which is encoded data.
  • the generated encoded stream is output to a system layer processing unit (not shown).
  • FIG. 7 is a diagram showing the configuration of the first decoding unit 4640.
  • FIG. 8 is a block diagram of the first decoding unit 4640.
  • the first decoding unit 4640 generates point cloud data by decoding encoded data (encoded stream) encoded by the first encoding method using the first encoding method.
  • This first decoding unit 4640 includes a demultiplexing unit 4641 , a position information decoding unit 4642 , an attribute information decoding unit 4643 and an additional information decoding unit 4644 .
  • An encoded stream (compressed stream), which is encoded data, is input to the first decoding unit 4640 from a system layer processing unit (not shown).
  • the demultiplexing unit 4641 separates encoded position information (Compressed Geometry), encoded attribute information (Compressed Attribute), encoded additional information (Compressed MetaData), and other additional information from the encoded data.
  • The position information decoding unit 4642 generates position information by decoding the encoded position information. For example, the position information decoding unit 4642 restores position information of a point group represented by three-dimensional coordinates from encoded position information represented by an N-ary tree structure such as an octree.
  • The attribute information decoding unit 4643 decodes the encoded attribute information based on the configuration information generated by the position information decoding unit 4642. For example, the attribute information decoding unit 4643 determines a reference point (reference node) to refer to in decoding the target point (target node) to be processed, based on the octree structure obtained by the position information decoding unit 4642. For example, the attribute information decoding unit 4643 refers to a node whose parent node in the octree is the same as that of the target node among peripheral nodes or adjacent nodes. Note that the reference relationship determination method is not limited to this.
  • the attribute information decoding process may include at least one of the inverse quantization process, the prediction process, and the arithmetic decoding process.
  • referencing means using the reference node to calculate the predicted value of the attribute information, or the state of the reference node (for example, occupancy information indicating whether or not the reference node includes a point group) to determine the parameters for decoding.
  • the decoding parameter is a quantization parameter in inverse quantization processing, a context in arithmetic decoding, or the like.
  • the additional information decoding unit 4644 generates additional information by decoding the encoded additional information. Also, the first decoding unit 4640 uses the additional information required for the decoding processing of the position information and the attribute information at the time of decoding, and outputs the additional information required for the application to the outside.
  • FIG. 9 is a block diagram of the position information encoding unit 2700 according to this embodiment.
  • the positional information encoding unit 2700 includes an octree generation unit 2701 , a geometric information calculation unit 2702 , an encoding table selection unit 2703 and an entropy encoding unit 2704 .
  • the octree generation unit 2701 generates, for example, an octree from the input position information, and generates an occupancy code for each node of the octree.
  • the geometric information calculation unit 2702 acquires information indicating whether or not the node adjacent to the target node is an occupied node. For example, the geometric information calculation unit 2702 calculates occupancy information (information indicating whether or not the adjacent node is an occupied node) of the adjacent node from the occupancy code of the parent node to which the target node belongs.
  • the geometric information calculation unit 2702 may store encoded nodes in a list and search for adjacent nodes from the list. Note that the geometric information calculation unit 2702 may switch the adjacent node according to the position within the parent node of the target node.
  • the encoding table selection unit 2703 selects an encoding table to be used for entropy encoding of the target node using the occupation information of adjacent nodes calculated by the geometric information calculation unit 2702 .
  • the coding table selection unit 2703 may generate a bit string using occupation information of adjacent nodes and select a coding table for index numbers generated from the bit string.
  • the entropy encoding unit 2704 generates encoded position information and metadata by performing entropy encoding on the occupancy code of the target node using the encoding table of the selected index number.
  • the entropy coding unit 2704 may add information indicating the selected coding table to the coding position information.
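  • The selection of an encoding table from the occupancy of adjacent nodes can be sketched as follows. This Python example packs the occupancy of the neighbours into a bit string and derives an index number from it. The set of neighbours (six face-adjacent nodes), their bit order, and the mapping from bit string to index (the number of occupied neighbours) are assumptions for illustration only; this disclosure does not fix them in this passage.

```python
from typing import Set, Tuple

Node = Tuple[int, int, int]

# Six face-adjacent neighbours; the choice and order are assumptions.
NEIGHBOUR_OFFSETS = [
    (-1, 0, 0), (1, 0, 0),
    (0, -1, 0), (0, 1, 0),
    (0, 0, -1), (0, 0, 1),
]


def neighbour_pattern(target: Node, occupied: Set[Node]) -> int:
    """Pack the occupancy of the adjacent nodes into a bit string (stored as an int)."""
    pattern = 0
    for bit, (dx, dy, dz) in enumerate(NEIGHBOUR_OFFSETS):
        if (target[0] + dx, target[1] + dy, target[2] + dz) in occupied:
            pattern |= 1 << bit
    return pattern


def select_table_index(pattern: int) -> int:
    """Map the bit string to an encoding-table index.

    Here the index is simply the number of occupied neighbours (0..6).
    """
    return bin(pattern).count("1")
```

  • Because the decoder can compute the same pattern from already decoded nodes, the encoder and decoder can arrive at the same table without extra signalling; alternatively, as noted above, the selected table may be indicated in the bitstream.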
  • Position information (position data) is encoded after being converted into an octree structure.
  • An octree structure consists of nodes and leaves. Each node has 8 nodes or leaves and each leaf has voxel (VXL) information.
  • FIG. 10 is a diagram showing an example structure of position information including a plurality of voxels.
  • FIG. 11 is a diagram showing an example of converting the position information shown in FIG. 10 into an octree structure. Among the leaves shown in FIG. 11, leaves 1, 2, and 3 represent voxels VXL1, VXL2, and VXL3 shown in FIG.
  • node 1 corresponds to the entire space containing the position information in FIG.
  • the entire space corresponding to node 1 is divided into 8 nodes, and among the 8 nodes, the node containing valid VXL is further divided into 8 nodes or leaves, and this process is repeated for the hierarchy of the tree structure.
  • each node corresponds to a subspace and has information (occupancy code) indicating at which position after division the next node or leaf is located as node information.
  • the lowest layer block is set as a leaf, and the number of point groups included in the leaf is stored as leaf information.
  • FIG. 12 is a block diagram of the position information decoding unit 2710 according to this embodiment.
  • the position information decoding unit 2710 includes an octree generation unit 2711 , a geometric information calculation unit 2712 , an encoding table selection unit 2713 and an entropy decoding unit 2714 .
  • the octree generation unit 2711 generates an octree of a certain space (node) using bitstream header information or metadata. For example, the octree generation unit 2711 generates a large space (root node) using the sizes of a certain space in the x-axis, y-axis, and z-axis directions added to the header information.
  • The octree generation unit 2711 then generates the octree by dividing that space into two along each of the x-axis, y-axis, and z-axis directions, which produces eight small spaces A (nodes A0 to A7). The nodes A0 to A7 are set as the target node in order.
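  • The division of the root space into eight small spaces can be sketched as follows. This Python example takes the origin and the sizes of the space in the x-axis, y-axis, and z-axis directions and returns the eight child spaces. The function name `split_into_eight` and the ordering of the children are assumptions for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (origin, size)


def split_into_eight(origin: Vec3, size: Vec3) -> List[Box]:
    """Halve a box along x, y, and z and return the eight child boxes.

    Child i uses bit 2 of i for x, bit 1 for y, and bit 0 for z (assumed order).
    """
    half = (size[0] / 2, size[1] / 2, size[2] / 2)
    children: List[Box] = []
    for i in range(8):
        bx, by, bz = (i >> 2) & 1, (i >> 1) & 1, i & 1
        child_origin = (
            origin[0] + bx * half[0],
            origin[1] + by * half[1],
            origin[2] + bz * half[2],
        )
        children.append((child_origin, half))
    return children
```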
  • the geometric information calculation unit 2712 acquires occupancy information indicating whether or not the node adjacent to the target node is an occupied node. For example, the geometric information calculation unit 2712 calculates occupation information of adjacent nodes from the occupancy code of the parent node to which the target node belongs. Alternatively, the geometric information calculation unit 2712 may store the decoded nodes in a list and search for adjacent nodes in the list. Note that the geometric information calculation unit 2712 may switch the adjacent node according to the position within the parent node of the target node.
  • the encoding table selection unit 2713 selects an encoding table (decoding table) to be used for entropy decoding of the target node using the occupation information of adjacent nodes calculated by the geometric information calculation unit 2712 .
  • the encoding table selection unit 2713 may generate a bit string using occupation information of adjacent nodes, and select an encoding table for index numbers generated from the bit string.
  • The entropy decoding unit 2714 generates position information by entropy decoding the occupancy code of the target node using the selected encoding table. Note that the entropy decoding unit 2714 may decode and acquire the information of the selected encoding table from the bitstream, and entropy-decode the occupancy code of the target node using the encoding table indicated by the information.
  • FIG. 13 is a block diagram showing a configuration example of the attribute information encoding unit A100.
  • the attribute information encoder may include multiple encoders that perform different encoding methods. For example, the attribute information encoding unit may switch between the following two methods depending on the use case.
  • The attribute information encoding unit A100 includes an LoD attribute information encoding unit A101 and a transform attribute information encoding unit A102.
  • The LoD attribute information encoding unit A101 classifies each three-dimensional point into one of multiple layers using the position information of the three-dimensional points, predicts the attribute information of the three-dimensional points belonging to each layer, and encodes the prediction residuals. Here, each classified layer is called an LoD (Level of Detail).
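  • One possible way to classify points into layers from their positions is sketched below in Python. It is an assumption-laden illustration, not the classification rule of this disclosure: layer k accepts a point only if it is at least `base_dist / 2**k` away from every point already selected, and the last layer takes all remaining points. The name `build_lods` and both parameters are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def build_lods(points: Sequence[Point], base_dist: float, num_lods: int) -> List[List[int]]:
    """Assign point indices to layers using a distance threshold that halves per layer."""
    remaining = list(range(len(points)))
    selected: List[int] = []
    lods: List[List[int]] = []
    for k in range(num_lods):
        last = k == num_lods - 1
        threshold = base_dist / (2 ** k)
        layer: List[int] = []
        rest: List[int] = []
        for i in remaining:
            # Accept the point if it is far enough from all points chosen so far.
            far_enough = all(
                math.dist(points[i], points[j]) >= threshold for j in selected + layer
            )
            if last or far_enough:
                layer.append(i)
            else:
                rest.append(i)
        lods.append(layer)
        selected += layer
        remaining = rest
    return lods
```

  • With such a layering, the attribute of a point in a finer layer can be predicted from nearby points in coarser layers, and only the prediction residual needs to be encoded.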
  • The transform attribute information encoding unit A102 encodes attribute information using RAHT (Region Adaptive Hierarchical Transform). Specifically, the transform attribute information encoding unit A102 applies RAHT or Haar transform to each piece of attribute information based on the position information of the three-dimensional points, thereby generating high frequency components and low frequency components of each layer, and encodes those values using quantization, entropy coding, or the like.
  • FIG. 14 is a block diagram showing a configuration example of the attribute information decoding unit A110.
  • the attribute information decoding unit may include multiple decoding units that perform different decoding methods. For example, the attribute information decoding unit may decode by switching between the following two methods based on information included in the header or metadata.
  • The attribute information decoding unit A110 includes an LoD attribute information decoding unit A111 and a transform attribute information decoding unit A112.
  • the LoD attribute information decoding unit A111 classifies each 3D point into multiple layers using the position information of the 3D point, and decodes the attribute value while predicting the attribute information of the 3D point belonging to each layer.
  • The transform attribute information decoding unit A112 decodes attribute information using RAHT (Region Adaptive Hierarchical Transform). Specifically, the transform attribute information decoding unit A112 decodes the attribute values by applying inverse RAHT or inverse Haar transform to the high frequency components and low frequency components of each attribute value based on the position information of the three-dimensional points.
  • FIG. 15 is a block diagram showing the configuration of an attribute information encoding unit 3140, which is an example of the LoD attribute information encoding unit A101.
  • The attribute information encoding unit 3140 includes an LoD generation unit 3141, a surrounding search unit 3142, a prediction unit 3143, a prediction residual calculation unit 3144, a quantization unit 3145, an arithmetic coding unit 3146, an inverse quantization unit 3147, a decoded value generation unit 3148, and a memory 3149.
  • the LoD generation unit 3141 generates LoD using the position information of the three-dimensional points.
  • the surrounding search unit 3142 uses the LoD generation result of the LoD generation unit 3141 and the distance information indicating the distance between each 3D point to search for neighboring 3D points adjacent to each 3D point.
  • the prediction unit 3143 generates a predicted value of the attribute information of the target three-dimensional point to be encoded.
  • the prediction residual calculation unit 3144 calculates (generates) the prediction residual of the predicted value of the attribute information generated by the prediction unit 3143 .
  • the quantization unit 3145 quantizes the prediction residual of the attribute information calculated by the prediction residual calculation unit 3144.
  • the arithmetic coding unit 3146 arithmetically codes the prediction residual after being quantized by the quantization unit 3145 .
  • the arithmetic coding unit 3146 outputs the bitstream including the arithmetically coded prediction residuals to, for example, a 3D data decoding device.
  • Note that the prediction residual may be binarized, for example by the quantization unit 3145, before being arithmetically coded by the arithmetic coding unit 3146.
  • the arithmetic coding unit 3146 may initialize the coding table used for arithmetic coding before arithmetic coding.
  • the arithmetic coding unit 3146 may initialize the coding table used for arithmetic coding for each layer.
  • the arithmetic coding unit 3146 may include information indicating the position of the layer in which the coding table is initialized in the bitstream and output it.
  • the inverse quantization unit 3147 inversely quantizes the prediction residual after quantization by the quantization unit 3145 .
  • the decoded value generation unit 3148 generates a decoded value by adding the predicted value of the attribute information generated by the prediction unit 3143 and the prediction residual after inverse quantization by the inverse quantization unit 3147 .
  • the memory 3149 is a memory that stores the decoded value of the attribute information of each three-dimensional point decoded by the decoded value generation unit 3148 .
  • For example, when generating a predicted value for a three-dimensional point that has not yet been encoded, the prediction unit 3143 uses the decoded values of the attribute information of the three-dimensional points stored in the memory 3149.
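  • The loop formed by the prediction, quantization, inverse quantization, decoded value generation, and memory can be sketched as follows. This Python example is a minimal illustration under assumptions: the predicted value is simply the previously decoded value (the disclosure predicts from neighboring three-dimensional points), quantization is uniform with a hypothetical step `qstep`, and arithmetic coding is omitted.

```python
from typing import List, Tuple


def encode_attributes(values: List[int], qstep: int) -> Tuple[List[int], List[int]]:
    """Return (quantized residuals, decoded values kept in the encoder's memory)."""
    quantized: List[int] = []
    decoded: List[int] = []  # plays the role of the memory of decoded values
    for v in values:
        pred = decoded[-1] if decoded else 0   # prediction from a decoded value
        residual = v - pred                    # prediction residual
        q = int(round(residual / qstep))       # quantization
        recon = pred + q * qstep               # inverse quantization + prediction
        quantized.append(q)
        decoded.append(recon)
    return quantized, decoded


def decode_attributes(quantized: List[int], qstep: int) -> List[int]:
    """Mirror of the encoder loop: predicted value plus inverse-quantized residual."""
    decoded: List[int] = []
    for q in quantized:
        pred = decoded[-1] if decoded else 0
        decoded.append(pred + q * qstep)
    return decoded
```

  • Predicting from decoded values rather than original values keeps the encoder and decoder in step, so quantization error does not accumulate from point to point.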
  • FIG. 16 is a block diagram of an attribute information encoding unit 6600, which is an example of the transform attribute information encoding unit A102.
  • The attribute information encoding unit 6600 includes a sorting unit 6601, a Haar transform unit 6602, a quantization unit 6603, an inverse quantization unit 6604, an inverse Haar transform unit 6605, a memory 6606, and an arithmetic coding unit 6607.
  • the sorting unit 6601 generates Morton codes using the position information of the three-dimensional points, and sorts the multiple three-dimensional points in the order of the Morton codes.
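  • A Morton code interleaves the bits of the x, y, and z coordinates so that points close together in space tend to be close together in the sorted order. The following Python sketch illustrates this for non-negative integer coordinates; the bit order (x above y above z within each triple) and the default of 10 bits per axis are assumptions for the example.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]


def morton_code(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, and z into a single integer."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def sort_by_morton(points: List[Point]) -> List[Point]:
    """Sort three-dimensional points in ascending Morton-code order."""
    return sorted(points, key=lambda p: morton_code(*p))
```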
  • the Haar transform unit 6602 applies Haar transform to attribute information to generate encoded coefficients.
  • a quantization unit 6603 quantizes the encoded coefficients of the attribute information.
  • the inverse quantization unit 6604 inversely quantizes the encoded coefficients after quantization.
  • the inverse Haar transform unit 6605 applies inverse Haar transform to the encoded coefficients.
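  • As a simplified illustration of the Haar transform and its inverse, the following Python sketch applies one level of an orthonormal Haar transform to consecutive pairs of attribute values, producing low frequency and high frequency components. It assumes an even number of values and ignores the position-dependent weighting used by RAHT; `haar_forward` and `haar_inverse` are hypothetical names.

```python
import math
from typing import List, Tuple


def haar_forward(values: List[float]) -> Tuple[List[float], List[float]]:
    """One Haar level on consecutive pairs; len(values) must be even."""
    s = math.sqrt(2.0)
    low = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return low, high


def haar_inverse(low: List[float], high: List[float]) -> List[float]:
    """Reconstruct the original pairs from low and high frequency components."""
    s = math.sqrt(2.0)
    out: List[float] = []
    for l, h in zip(low, high):
        out += [(l + h) / s, (l - h) / s]
    return out
```

  • Applying `haar_forward` again to the low frequency components yields the next layer, and neighboring points with similar attributes give high frequency components close to zero, which quantize to zero.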
  • the memory 6606 stores attribute information values of a plurality of decoded three-dimensional points. For example, attribute information of decoded 3D points stored in memory 6606 may be used for prediction of unencoded 3D points.
  • the arithmetic coding unit 6607 calculates ZeroCnt from the quantized coding coefficients and arithmetically codes ZeroCnt. Also, the arithmetic coding unit 6607 arithmetically codes the quantized non-zero coding coefficients. The arithmetic coding unit 6607 may binarize the coding coefficients before arithmetic coding. Also, the arithmetic coding unit 6607 may generate and code various types of header information.
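  • One way to derive ZeroCnt from the quantized coefficients is sketched below in Python. It assumes that ZeroCnt is the number of consecutive zero coefficients preceding each non-zero coefficient (with a final ZeroCnt for trailing zeros); this interpretation, the symbol tuples, and the function names are assumptions for illustration, and the arithmetic coding of the symbols is omitted.

```python
from typing import List, Tuple

Symbol = Tuple[str, int]


def to_zero_runs(coeffs: List[int]) -> List[Symbol]:
    """Emit ('ZeroCnt', n) before each non-zero ('Value', v); trailing zeros get a final ZeroCnt."""
    symbols: List[Symbol] = []
    zero_cnt = 0
    for c in coeffs:
        if c == 0:
            zero_cnt += 1
        else:
            symbols.append(("ZeroCnt", zero_cnt))
            symbols.append(("Value", c))
            zero_cnt = 0
    if zero_cnt:
        symbols.append(("ZeroCnt", zero_cnt))
    return symbols


def from_zero_runs(symbols: List[Symbol]) -> List[int]:
    """Rebuild the coefficient sequence from ZeroCnt and Value symbols."""
    coeffs: List[int] = []
    for kind, n in symbols:
        coeffs += [0] * n if kind == "ZeroCnt" else [n]
    return coeffs
```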
  • FIG. 17 is a block diagram showing the configuration of an attribute information decoding unit 3150, which is an example of the LoD attribute information decoding unit A111.
  • The attribute information decoding unit 3150 includes an LoD generation unit 3151, a surrounding search unit 3152, a prediction unit 3153, an arithmetic decoding unit 3154, an inverse quantization unit 3155, a decoded value generation unit 3156, and a memory 3157.
  • the LoD generation unit 3151 generates LoD using the position information of the three-dimensional points decoded by the position information decoding unit (not shown in FIG. 17).
  • the surrounding search unit 3152 uses the LoD generation result of the LoD generation unit 3151 and the distance information indicating the distance between each 3D point to search for neighboring 3D points adjacent to each 3D point.
  • the prediction unit 3153 generates a predicted value of the attribute information of the target three-dimensional point to be decoded.
  • the arithmetic decoding unit 3154 arithmetically decodes the prediction residual in the bitstream acquired from the attribute information encoding unit 3140 shown in FIG.
  • the arithmetic decoding unit 3154 may initialize a decoding table used for arithmetic decoding.
  • the arithmetic decoding unit 3154 initializes the decoding table used for arithmetic decoding for the layer on which the arithmetic coding unit 3146 shown in FIG. 15 has performed the coding process.
  • the arithmetic decoding unit 3154 may initialize a decoding table used for arithmetic decoding for each layer.
  • the arithmetic decoding unit 3154 may initialize the decoding table based on information, included in the bitstream, indicating the position of the layer where the coding table is initialized.
  • the inverse quantization unit 3155 inversely quantizes the prediction residual arithmetically decoded by the arithmetic decoding unit 3154 .
  • a decoded value generation unit 3156 adds the prediction value generated by the prediction unit 3153 and the prediction residual after inverse quantization by the inverse quantization unit 3155 to generate a decoded value.
  • The decoded value generation unit 3156 outputs the decoded attribute information data to another device.
  • The memory 3157 is a memory that stores the decoded value of the attribute information of each three-dimensional point decoded by the decoded value generation unit 3156. For example, when generating a predicted value for a three-dimensional point that has not yet been decoded, the prediction unit 3153 uses the decoded values of the attribute information of the three-dimensional points stored in the memory 3157.
  • FIG. 18 is a block diagram of an attribute information decoding unit 6610, which is an example of the transform attribute information decoding unit A112.
  • the attribute information decoding unit 6610 includes an arithmetic decoding unit 6611 , an inverse quantization unit 6612 , an inverse Haar transform unit 6613 and a memory 6614 .
  • the arithmetic decoding unit 6611 arithmetically decodes ZeroCnt and the encoded coefficients included in the bitstream. Note that the arithmetic decoding unit 6611 may decode various types of header information.
  • the inverse quantization unit 6612 inversely quantizes the arithmetically decoded coding coefficients.
  • the inverse Haar transform unit 6613 applies inverse Haar transform to the encoded coefficients after inverse quantization.
  • the memory 6614 stores values of attribute information of a plurality of decoded three-dimensional points. For example, attribute information of decoded 3D points stored in memory 6614 may be used for prediction of undecoded 3D points.
  • FIG. 19 is a diagram showing the configuration of the second encoding unit 4650.
  • FIG. 20 is a block diagram of the second encoding unit 4650.
  • the second encoding unit 4650 generates encoded data (encoded stream) by encoding the point cloud data using the second encoding method.
  • The second encoding unit 4650 includes an additional information generation unit 4651, a position image generation unit 4652, an attribute image generation unit 4653, a video encoding unit 4654, an additional information encoding unit 4655, and a multiplexing unit 4656.
  • The second encoding unit 4650 is characterized by generating a position image and an attribute image by projecting the three-dimensional structure onto a two-dimensional image, and encoding the generated position image and attribute image using an existing video encoding method. The second encoding method is also called VPCC (Video based PCC).
  • The point cloud data is PCC point cloud data such as a PLY file, or PCC point cloud data generated from sensor information, and includes position information (Position), attribute information (Attribute), and other additional information (MetaData).
  • the additional information generation unit 4651 generates map information for a plurality of two-dimensional images by projecting the three-dimensional structure onto the two-dimensional images.
  • the position image generation unit 4652 generates a position image (Geometry Image) based on the position information and the map information generated by the additional information generation unit 4651.
  • This position image is, for example, a distance image in which a distance (Depth) is indicated as a pixel value.
  • Note that this distance image may be an image of a plurality of point groups viewed from one viewpoint (an image obtained by projecting a plurality of point groups onto a single two-dimensional plane), a plurality of images of a plurality of point groups viewed from a plurality of viewpoints, or a single image obtained by integrating these plurality of images.
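  • The simplest case, a distance image viewed from one viewpoint, can be sketched as follows. This Python example orthographically projects points with integer coordinates onto the XY plane and stores, for each pixel, the smallest z value as the distance. The projection plane, the use of `None` for empty pixels, and the function name `project_depth` are assumptions for illustration; the map information generated by the additional information generation unit 4651 is not modeled.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[int, int, int]


def project_depth(points: Sequence[Point], width: int, height: int) -> List[List[Optional[int]]]:
    """Project points onto the XY plane; each pixel keeps the nearest (smallest) z."""
    image: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    for x, y, z in points:
        if 0 <= x < width and 0 <= y < height:  # skip points outside the image
            if image[y][x] is None or z < image[y][x]:
                image[y][x] = z
    return image
```

  • An attribute image could be built in the same way by storing, for example, the RGB color of the nearest point at each pixel instead of its distance.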
  • the attribute image generation unit 4653 generates an attribute image based on the attribute information and the map information generated by the additional information generation unit 4651.
  • This attribute image is, for example, an image in which attribute information (for example, color (RGB)) is indicated as pixel values.
  • Note that this image may be an image of a plurality of point groups viewed from one viewpoint (an image obtained by projecting a plurality of point groups onto a single two-dimensional plane), a plurality of images of a plurality of point groups viewed from a plurality of viewpoints, or a single image obtained by integrating these plurality of images.
  • The video encoding unit 4654 encodes the position image and the attribute image using a video encoding method to generate an encoded position image (Compressed Geometry Image) and an encoded attribute image (Compressed Attribute Image), which are encoded data.
  • Any known encoding method may be used as the video encoding method.
  • For example, the video encoding method is AVC, HEVC, or the like.
  • the additional information encoding unit 4655 generates encoded additional information (Compressed MetaData) by encoding additional information and map information included in the point cloud data.
  • the multiplexing unit 4656 multiplexes the encoded position image, encoded attribute image, encoded additional information, and other additional information to generate an encoded stream (compressed stream) as encoded data.
  • the generated encoded stream is output to a system layer processing unit (not shown).
  • FIG. 21 is a diagram showing the configuration of the second decoding unit 4660.
  • FIG. 22 is a block diagram of the second decoding unit 4660.
  • The second decoding unit 4660 generates point cloud data by decoding, using the second encoding method, the encoded data (encoded stream) that was encoded by the second encoding method.
  • This second decoding unit 4660 includes a demultiplexing unit 4661, a video decoding unit 4662, an additional information decoding unit 4663, a position information generation unit 4664, and an attribute information generation unit 4665.
  • An encoded stream (compressed stream), which is encoded data, is input to the second decoding unit 4660 from a system layer processing unit (not shown).
  • The demultiplexing unit 4661 separates the encoded position image (Compressed Geometry Image), the encoded attribute image (Compressed Attribute Image), the encoded additional information (Compressed MetaData), and other additional information from the encoded data.
  • The video decoding unit 4662 generates a position image and an attribute image by decoding the encoded position image and the encoded attribute image using the video encoding method.
  • Any known encoding scheme may be used as the video encoding scheme.
  • For example, the video encoding scheme is AVC, HEVC, or the like.
  • The additional information decoding unit 4663 generates additional information including map information and the like by decoding the encoded additional information.
  • The position information generation unit 4664 generates position information using the position image and the map information.
  • The attribute information generation unit 4665 generates attribute information using the attribute image and the map information.
  • The second decoding unit 4660 uses the additional information required for decoding during decoding, and outputs the additional information required for the application to the outside.
  • FIG. 23 is a diagram showing a protocol stack related to PCC encoded data.
  • FIG. 23 shows an example of multiplexing data of other media, such as video (for example, HEVC) or audio, with PCC-encoded data and transmitting or storing the multiplexed data.
  • The multiplexing method and file format have functions for multiplexing, transmitting, or storing various encoded data.
  • In order to transmit or store the encoded data, the encoded data must be converted to a multiplexed format.
  • For example, HEVC defines a technique of storing encoded data in a data structure called a NAL unit and storing the NAL unit in ISOBMFF.
  • The encoded data comprises position information (Geometry), attribute information (Attribute), and additional information (Metadata).
  • Multiplexing processing in the multiplexing unit will be described.
  • Additional information may also be referred to as a parameter set or control information.
  • In the following, the dynamic object (three-dimensional point cloud data that changes over time) explained in FIG. 4 will be described as an example, but a similar method may be used in other cases.
  • FIG. 24 is a diagram showing configurations of an encoding section 4801 and a multiplexing section 4802 included in the three-dimensional data encoding device according to this embodiment.
  • The encoding unit 4801 corresponds to, for example, the first encoding unit 4630 or the second encoding unit 4650 described above.
  • The multiplexing unit 4802 corresponds to the multiplexing unit 4634 or 4656 described above.
  • The encoding unit 4801 encodes point cloud data of a plurality of PCC (Point Cloud Compression) frames to generate encoded data (Multiple Compressed Data) of a plurality of pieces of position information, attribute information, and additional information.
  • The multiplexing unit 4802 converts data of multiple data types (position information, attribute information, and additional information) into NAL units, thereby converting the data into a data structure that takes into account data access in the decoding device.
  • FIG. 25 is a diagram showing a configuration example of encoded data generated by the encoding unit 4801.
  • The arrows in the figure indicate dependencies related to decoding of the encoded data: the data at the origin of an arrow depends on the data at the tip of the arrow. That is, the decoding device decodes the data at the tip of the arrow and uses the decoded data to decode the data at the origin of the arrow.
  • In other words, to depend means that the dependence-destination data is referred to (used) in the processing (encoding, decoding, etc.) of the dependence-source data.
  • The encoding unit 4801 generates encoded position data (Compressed Geometry Data) for each frame by encoding the position information of each frame. The encoded position data is represented by G(i), where i indicates a frame number, frame time, or the like.
  • The encoding unit 4801 generates a position parameter set (GPS(i)) corresponding to each frame.
  • A position parameter set contains parameters that can be used to decode the encoded position data. The encoded position data for each frame depends on the corresponding position parameter set.
  • The encoded position data consisting of multiple frames is defined as a position sequence (Geometry Sequence).
  • The encoding unit 4801 generates a position sequence parameter set (Geometry Sequence PS: also referred to as position SPS) that stores parameters commonly used in decoding processing for a plurality of frames in the position sequence.
  • The position sequence depends on the position SPS.
  • The encoding unit 4801 encodes the attribute information of each frame to generate encoded attribute data (Compressed Attribute Data) for each frame. The encoded attribute data is represented by A(i). FIG. 25 shows an example in which attribute X and attribute Y exist, where the encoded attribute data of attribute X is represented by AX(i) and the encoded attribute data of attribute Y is represented by AY(i).
  • The encoding unit 4801 generates an attribute parameter set (APS(i)) corresponding to each frame.
  • The attribute parameter set of attribute X is represented by AXPS(i), and the attribute parameter set of attribute Y is represented by AYPS(i).
  • The attribute parameter set contains parameters that can be used to decode the encoded attribute information.
  • The encoded attribute data depends on the corresponding attribute parameter set.
  • The encoded attribute data consisting of multiple frames is defined as an attribute sequence.
  • The encoding unit 4801 generates an attribute sequence parameter set (Attribute Sequence PS: also referred to as attribute SPS) that stores parameters commonly used in decoding processing for multiple frames in the attribute sequence.
  • The attribute sequence depends on the attribute SPS.
  • The encoded attribute data also depends on the encoded position data.
  • FIG. 25 shows an example in which there are two types of attribute information (attribute X and attribute Y).
  • When there are two types of attribute information, for example, two encoding units generate the respective data and metadata.
  • An attribute sequence is defined for each type of attribute information, and an attribute SPS is generated for each type of attribute information.
  • Although FIG. 25 shows an example in which there is one type of position information and two types of attribute information, the present invention is not limited to this.
  • In other cases as well, encoded data can be generated in a similar manner.
  • For point cloud data that does not have attribute information, the encoding unit 4801 does not need to generate a parameter set related to attribute information.
  • The encoding unit 4801 generates a PCC stream PS (PCC Stream PS: also referred to as stream PS), which is a parameter set for the entire PCC stream.
  • The encoding unit 4801 stores parameters that can be commonly used in the decoding process for one or more position sequences and one or more attribute sequences in the stream PS.
  • For example, the stream PS includes identification information indicating the codec of the point cloud data, information indicating the algorithm used for encoding, and the like.
  • The position sequence and the attribute sequence depend on the stream PS.
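  • The dependencies described above determine an order in which a decoding device can process the data. The following sketch derives one such order for a single frame with one attribute, using the standard `graphlib` module (Python 3.9 or later). The node labels and the flattening of sequence-level dependencies onto the per-frame data are simplifying assumptions for illustration, not part of the described format.

```python
from graphlib import TopologicalSorter

# Each key depends on every item in its set (arrow origin -> arrow tip).
# Sequence-level dependencies (SPS, stream PS) are attached directly to
# the per-frame data here, which is a simplification.
dependencies = {
    "G(0)": {"GPS(0)", "Position SPS", "Stream PS"},
    "AX(0)": {"AXPS(0)", "AttributeX SPS", "Stream PS", "G(0)"},
}


def decoding_order(deps):
    """Return one order in which every item follows all items it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = decoding_order(dependencies)
```

  • Any order returned this way places the parameter sets and G(0) before AX(0), because the encoded attribute data depends on the encoded position data.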
  • An access unit is a basic unit for accessing data during decoding, and consists of one or more pieces of data and one or more pieces of metadata.
  • For example, an access unit is composed of position information and one or more pieces of attribute information for the same time instant.
  • A GOF is a random access unit and is composed of one or more access units.
  • The encoding unit 4801 generates an access unit header (AU Header) as identification information indicating the beginning of the access unit.
  • The encoding unit 4801 stores parameters related to access units in access unit headers.
  • For example, the access unit header contains the structure or information of the encoded data included in the access unit.
  • The access unit header also includes parameters commonly used for data included in the access unit, such as parameters related to decoding of encoded data.
  • The encoding unit 4801 may generate an access unit delimiter that does not include parameters related to access units instead of the access unit header.
  • This access unit delimiter is used as identification information indicating the head of the access unit.
  • The decoding device identifies the beginning of the access unit by detecting the access unit header or access unit delimiter.
  • The encoding unit 4801 generates a GOF header as identification information indicating the beginning of the GOF.
  • The encoding unit 4801 stores GOF-related parameters in the GOF header.
  • For example, the GOF header contains the structure or information of the encoded data contained in the GOF.
  • The GOF header also includes parameters commonly used for data included in the GOF, such as parameters related to decoding of encoded data.
  • The encoding unit 4801 may generate a GOF delimiter that does not include parameters related to the GOF instead of the GOF header.
  • This GOF delimiter is used as identification information indicating the beginning of the GOF.
  • The decoding device identifies the beginning of the GOF by detecting the GOF header or GOF delimiter.
  • For example, an access unit is defined as a PCC frame unit.
  • The decoding device accesses the PCC frame based on the identification information at the beginning of the access unit.
  • For example, a GOF is defined as one random access unit.
  • The decoding device accesses the random access unit based on the identification information at the beginning of the GOF.
  • A PCC frame may be defined as a random access unit if the PCC frames are independent of each other and can be decoded independently.
  • Two or more PCC frames may be assigned to one access unit, and a plurality of random access units may be assigned to one GOF.
  • The encoding unit 4801 may define and generate parameter sets or metadata other than the above.
  • For example, the encoding unit 4801 may generate SEI (Supplemental Enhancement Information) that stores parameters (optional parameters) that may not necessarily be used during decoding.
  • FIG. 26 is a diagram showing an example of encoded data and NAL units.
  • For example, the encoded data includes a header and a payload, as shown in the figure.
  • The encoded data may include length information indicating the length (data amount) of the encoded data, header, or payload.
  • The encoded data may not include a header.
  • The header includes, for example, identification information for specifying data.
  • This identification information indicates, for example, data type or frame number.
  • The header also includes, for example, identification information that indicates the reference relationship.
  • This identification information is, for example, information stored in a header when there is a dependency relationship between data, and is information for referencing a reference destination from a reference source.
  • The header of the reference destination includes identification information for specifying the data.
  • The header of the reference source includes identification information indicating the reference destination.
  • The identification information for specifying the data or the identification information indicating the reference relationship may be omitted.
  • The multiplexing unit 4802 stores the encoded data in the payload of the NAL unit.
  • The NAL unit header includes pcc_nal_unit_type, which is identification information of the encoded data.
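  • A minimal sketch of storing encoded data in the payload of a NAL unit is shown below. The byte layout (a 1-byte pcc_nal_unit_type followed by a 4-byte big-endian payload length) and the function names are hypothetical; the description does not define the bit-level syntax of the NAL unit header.

```python
import struct

# Hypothetical header layout: 1-byte pcc_nal_unit_type, 4-byte payload length.
_HEADER = struct.Struct(">BI")


def pack_nal_unit(pcc_nal_unit_type: int, payload: bytes) -> bytes:
    """Prefix the payload with the hypothetical NAL unit header."""
    return _HEADER.pack(pcc_nal_unit_type, len(payload)) + payload


def unpack_nal_unit(buffer: bytes, offset: int = 0):
    """Return (type, payload, offset of the next NAL unit)."""
    nal_type, length = _HEADER.unpack_from(buffer, offset)
    start = offset + _HEADER.size
    return nal_type, buffer[start:start + length], start + length


stream = pack_nal_unit(3, b"ps") + pack_nal_unit(0, b"geom")
first = unpack_nal_unit(stream)
second = unpack_nal_unit(stream, first[2])
```

  • Carrying the length in the header lets a reader skip a NAL unit whose type it does not need without parsing the payload.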
  • FIG. 27 is a diagram showing an example of the semantics of pcc_nal_unit_type.
  • In codec 1, values 0 to 10 of pcc_nal_unit_type are assigned to encoded position data (Geometry), encoded attribute X data (AttributeX), encoded attribute Y data (AttributeY), position PS (Geom.PS), attribute XPS (AttrX.PS), attribute YPS (AttrY.PS), position SPS (Geometry Sequence PS), attribute XSPS (AttributeX Sequence PS), attribute YSPS (AttributeY Sequence PS), AU header (AU Header), and GOF header (GOF Header). Values 11 and after are reserved for codec 1.
  • When pcc_codec_type is codec 2 (Codec2: second encoding method), pcc_nal_unit_type values 0 to 2 are assigned to codec data A (DataA), metadata A (MetaDataA), and metadata B (MetaDataB). Values 3 and after are reserved for codec 2.
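  • The semantics above can be written as enumerations. The sketch below assumes that the values are assigned in the order in which the items are listed; the class and member names are illustrative only.

```python
from enum import IntEnum


class Codec1NalUnitType(IntEnum):
    GEOMETRY = 0
    ATTRIBUTE_X = 1
    ATTRIBUTE_Y = 2
    GEOM_PS = 3
    ATTR_X_PS = 4
    ATTR_Y_PS = 5
    GEOMETRY_SEQUENCE_PS = 6
    ATTRIBUTE_X_SEQUENCE_PS = 7
    ATTRIBUTE_Y_SEQUENCE_PS = 8
    AU_HEADER = 9
    GOF_HEADER = 10


class Codec2NalUnitType(IntEnum):
    DATA_A = 0
    METADATA_A = 1
    METADATA_B = 2


def describe(enum_cls, value: int) -> str:
    """Return the member name, or "RESERVED" for unassigned values."""
    try:
        return enum_cls(value).name
    except ValueError:
        return "RESERVED"
```

  • For example, `describe(Codec1NalUnitType, 11)` yields "RESERVED", reflecting that values 11 and after are not assigned in codec 1.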
  • The multiplexing unit 4802 collectively transmits NAL units in GOF or AU units. The multiplexing unit 4802 places a GOF header at the beginning of a GOF and an AU header at the beginning of an AU.
  • The multiplexing unit 4802 may arrange a sequence parameter set (SPS) for each AU so that the decoding device can decode from the next AU even if data is lost due to packet loss or the like.
  • When the encoded data has a decoding dependency, the decoding device decodes the reference source data after decoding the reference destination data.
  • The multiplexing unit 4802 sends the reference destination data first so that the decoding device can decode the data in the order in which it is received, without rearranging the data.
  • FIG. 28 is a diagram showing an example of the transmission order of NAL units.
  • FIG. 28 shows three examples: position information priority, parameter priority, and data integration.
  • The position-information-priority transmission order is an example in which the information related to position information and the information related to attribute information are each transmitted as a group.
  • In this transmission order, the transmission of the information regarding the position information is completed earlier than the transmission of the information regarding the attribute information.
  • For example, a decoding device that does not decode attribute information may be able to secure idle time by skipping the decoding of attribute information. Also, a decoding device that needs to decode position information quickly may be able to do so by obtaining the encoded data of the position information early.
  • Note that the attribute XSPS and the attribute YSPS are integrated and shown as the attribute SPS, but the attribute XSPS and the attribute YSPS may be arranged separately.
  • In the parameter-set-priority transmission order, the parameter sets are transmitted first and the data is transmitted afterward.
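  • The two transmission orders can be sketched as stable sorts over a list of NAL units. The tuple layout (name, related information, parameter-set flag) and the unit names are assumptions for illustration; the exact orders in FIG. 28 may differ in detail.

```python
# (name, related information, is_parameter_set) -- illustrative labels only.
UNITS = [
    ("Position SPS", "position", True),
    ("Attribute SPS", "attribute", True),
    ("GPS", "position", True),
    ("APS", "attribute", True),
    ("G", "position", False),
    ("A", "attribute", False),
]


def position_priority(units):
    """Send everything related to position information first."""
    return sorted(units, key=lambda unit: unit[1] != "position")


def parameter_set_priority(units):
    """Send all parameter sets first, then the data."""
    return sorted(units, key=lambda unit: not unit[2])


by_position = [unit[0] for unit in position_priority(UNITS)]
by_parameter = [unit[0] for unit in parameter_set_priority(UNITS)]
```

  • Because `sorted` is stable, units within each group keep their relative order, so a parameter set still precedes the data that refers to it.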
  • The multiplexing unit 4802 may transmit the NAL units in any order.
  • Order identification information may be defined, and the multiplexing unit 4802 may have a function of transmitting NAL units in any of a plurality of order patterns.
  • For example, the order identification information of the NAL units is stored in the stream PS.
  • The three-dimensional data decoding device may perform decoding based on the order identification information.
  • A desired transmission order may be instructed from the 3D data decoding device to the 3D data encoding device, and the 3D data encoding device (multiplexing unit 4802) may control the transmission order according to the instructed transmission order.
  • The multiplexing unit 4802 may generate encoded data that merges a plurality of functions, as in the data-integration transmission order, as long as the restrictions on the transmission order are complied with.
  • For example, the GOF header and AU header may be integrated, or the AXPS and AYPS may be integrated.
  • In this case, pcc_nal_unit_type defines an identifier indicating data having multiple functions.
  • PS has levels such as frame-level PS, sequence-level PS, and PCC-sequence-level PS. The following methods may be used.
  • The default PS value is indicated by the upper PS, and when the value in the lower PS differs from the value in the upper PS, the PS value is indicated by the lower PS. Alternatively, the PS value is not described in the upper PS but is described in the lower PS. Alternatively, information indicating whether the PS value is indicated by the lower PS, the upper PS, or both is indicated in either or both of the lower PS and the upper PS. Alternatively, a lower PS may be merged into an upper PS. Alternatively, when a lower PS and an upper PS overlap, the multiplexing unit 4802 may omit transmission of one of them.
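  • The first method above (default value in the upper PS, overridden by the lower PS when it differs) can be sketched with `collections.ChainMap`. The parameter names `codec_id` and `quantization_step` are hypothetical and serve only as placeholders.

```python
from collections import ChainMap


def effective_parameters(frame_ps, sequence_ps, stream_ps):
    """Look up each parameter in the lowest-level PS that defines it."""
    return ChainMap(frame_ps, sequence_ps, stream_ps)


stream_ps = {"codec_id": 1, "quantization_step": 4}   # upper PS: defaults
sequence_ps = {"quantization_step": 2}                # overrides the default
frame_ps = {}                                         # nothing to override

params = effective_parameters(frame_ps, sequence_ps, stream_ps)
```

  • With this arrangement a lower PS only needs to carry the values that differ from the upper PS, which is one way to avoid sending overlapping values twice.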
  • The encoding unit 4801 or the multiplexing unit 4802 may divide data into slices or tiles and transmit the divided data.
  • The divided data includes information for identifying the divided data, and the parameter set includes parameters used for decoding the divided data.
  • In this case, pcc_nal_unit_type defines an identifier indicating data that stores data or parameters related to tiles or slices.
  • FIG. 29 is a flowchart of processing by the three-dimensional data encoding device (encoding section 4801 and multiplexing section 4802) relating to the NAL unit transmission order.
  • The 3D data encoding device determines the order of NAL unit transmission (position information priority or parameter set priority) (S4801). For example, the 3D data encoding device determines the transmission order based on a designation from a user or an external device (for example, a 3D data decoding device).
  • If the determined transmission order is position information priority, the 3D data encoding device sets the order identification information included in the stream PS to position information priority (S4803). That is, in this case, the order identification information indicates that the NAL units are sent in the position-information-priority order. Then, the 3D data encoding device sends out the NAL units in the position-information-priority order (S4804).
  • If the determined transmission order is parameter set priority, the 3D data encoding device sets the order identification information included in the stream PS to parameter set priority (S4805). That is, in this case, the order identification information indicates that the NAL units are sent in the parameter-set-priority order. Then, the 3D data encoding device sends out the NAL units in the parameter-set-priority order (S4806).
  • FIG. 30 is a flow chart of processing by the three-dimensional data decoding device relating to the NAL unit transmission order.
  • The 3D data decoding device analyzes the order identification information included in the stream PS (S4811).
  • If the order identification information indicates position information priority, the three-dimensional data decoding device decodes the NAL units assuming that the NAL unit transmission order is position information priority (S4813).
  • If the order identification information indicates parameter set priority, the three-dimensional data decoding device decodes the NAL units assuming that the NAL unit transmission order is parameter set priority (S4814).
  • In step S4813, the three-dimensional data decoding device may acquire only the NAL units related to position information, without acquiring all NAL units, and decode the position information from the acquired NAL units.
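  • The benefit of knowing the transmission order can be sketched as follows for a decoding device that only needs position information. The constants, the (name, related information) tuple layout, and the function name are assumptions for illustration.

```python
POSITION_PRIORITY = 0        # hypothetical values of the order
PARAMETER_SET_PRIORITY = 1   # identification information


def collect_position_units(order_id, units):
    """Return (position-related unit names, number of units examined)."""
    selected, examined = [], 0
    for name, related_to in units:
        examined += 1
        if related_to == "position":
            selected.append(name)
        elif order_id == POSITION_PRIORITY:
            break  # every position-related unit has already arrived
    return selected, examined


position_first = [("Position SPS", "position"), ("GPS", "position"),
                  ("G", "position"), ("Attribute SPS", "attribute"),
                  ("APS", "attribute"), ("A", "attribute")]
parameters_first = [("Position SPS", "position"), ("Attribute SPS", "attribute"),
                    ("GPS", "position"), ("APS", "attribute"),
                    ("G", "position"), ("A", "attribute")]

early = collect_position_units(POSITION_PRIORITY, position_first)
full = collect_position_units(PARAMETER_SET_PRIORITY, parameters_first)
```

  • In the position-information-priority case the loop stops at the first attribute-related unit, whereas in the parameter-set-priority case every unit has to be examined.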
  • FIG. 31 is a flow chart of processing by the three-dimensional data encoding device (multiplexer 4802) for generating AUs and GOFs in multiplexing NAL units.
  • The 3D data encoding device determines the type of encoded data (S4821). Specifically, the three-dimensional data encoding device determines whether the encoded data to be processed is data at the beginning of an AU, data at the beginning of a GOF, or other data.
  • If the encoded data is data at the beginning of a GOF, the 3D data encoding device places the GOF header and the AU header at the beginning of the encoded data belonging to the GOF to generate a NAL unit (S4823).
  • If the encoded data is data at the beginning of an AU, the 3D data encoding device places the AU header at the head of the encoded data belonging to the AU to generate a NAL unit (S4824).
  • If the encoded data is neither data at the beginning of a GOF nor data at the beginning of an AU, the 3D data encoding device places the encoded data after the AU header of the AU to which the encoded data belongs to generate a NAL unit (S4825).
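  • The header placement in FIG. 31 can be sketched as below. Representing NAL units as strings and using a fixed number of AUs per GOF (`gof_size`) are simplifying assumptions; the description does not fix the GOF length.

```python
def multiplex(access_units, gof_size):
    """Place a GOF header at each GOF start and an AU header at each AU start.

    `access_units` is a list of lists of encoded-data labels, one inner
    list per AU.
    """
    stream = []
    for index, encoded_data in enumerate(access_units):
        if index % gof_size == 0:
            stream.append("GOF Header")   # data at the beginning of a GOF
        stream.append("AU Header")        # data at the beginning of an AU
        stream.extend(encoded_data)       # other data follows the AU header
    return stream


stream = multiplex([["G0", "A0"], ["G1", "A1"], ["G2", "A2"]], gof_size=2)
```

  • In this example the first and third AUs start a GOF, so each of them is preceded by both a GOF header and an AU header.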
  • FIG. 32 is a flow chart of processing of the three-dimensional data decoding device relating to AU and GOF access in NAL unit demultiplexing.
  • The 3D data decoding device determines the type of encoded data included in the NAL unit by analyzing nal_unit_type included in the NAL unit (S4831). Specifically, the three-dimensional data decoding device determines whether the encoded data included in the NAL unit is data at the beginning of an AU, data at the beginning of a GOF, or other data.
  • If the encoded data included in the NAL unit is data at the beginning of a GOF, the 3D data decoding device determines that the NAL unit is a random access start position, accesses the NAL unit, and starts the decoding process (S4833).
  • If the encoded data included in the NAL unit is data at the beginning of an AU, the 3D data decoding device determines that the NAL unit is the head of the AU, accesses the data included in the NAL unit, and decodes the AU (S4834).
  • If the encoded data included in the NAL unit is neither data at the beginning of a GOF nor data at the beginning of an AU, the three-dimensional data decoding device does not process the NAL unit.
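  • A sketch of random access on the decoding side is shown below, for a decoding device that joins a stream partway through. NAL units are represented as strings, and skipping everything before the first GOF header is an assumed interpretation of using the GOF head as the random access start position.

```python
def decode_from_random_access_point(stream):
    """Group data into AUs, starting from the first GOF header seen."""
    decoded_aus = []
    started = False
    for unit in stream:
        if unit == "GOF Header":
            started = True                 # random access start position
        elif not started:
            continue                       # cannot be decoded yet; skip
        elif unit == "AU Header":
            decoded_aus.append([])         # head of a new AU
        elif decoded_aus:
            decoded_aus[-1].append(unit)   # data belonging to the current AU
    return decoded_aus


received = ["A0", "AU Header", "G1", "A1",
            "GOF Header", "AU Header", "G2", "A2", "AU Header", "G3"]
decoded = decode_from_random_access_point(received)
```

  • The units received before the GOF header are ignored, and decoding resumes with the AU that follows it.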
  • FIG. 33 is a block diagram showing the configuration of the three-dimensional data encoding device according to this embodiment. Note that FIG. 33 omits illustration of an encoding unit that encodes position information, which is provided in the three-dimensional data encoding device.
  • The 3D data encoding device 10600 includes a conversion unit 10610 and an encoding unit 10620.
  • The conversion unit 10610 performs conversion processing on the input attribute information before inputting it to the encoding unit 10620.
  • Conversion processing is, for example, at least one of offset (offset processing) and scaling (scaling processing), which will be described later.
  • The conversion unit 10610 has a scaling unit 10611 and an offset unit 10612.
  • The conversion unit 10610 may have at least one of the offset unit 10612 and the scaling unit 10611.
  • For example, the conversion unit 10610 need not have the scaling unit 10611 when it only offsets the attribute information.
  • The scaling unit 10611 performs scaling (multiplication or division), which is an example of conversion processing, on the input attribute information, and outputs the scaled attribute information and the scale value (more specifically, scale information, which is information indicating the scale value used for the scaling).
  • The offset unit 10612 performs offset (addition or subtraction), which is another example of conversion processing, on the scaled attribute information, and outputs the offset attribute information and the offset value (more specifically, offset information, which is information indicating the offset value used for the offset).
  • The encoding unit 10620 encodes the attribute information (post-conversion attribute information) converted by the conversion unit 10610, and also encodes conversion information such as offset values or scale values as additional information (metadata).
  • The encoding unit 10620 includes an attribute information encoding unit 10621 and an additional information encoding unit 10622.
  • The attribute information encoding unit 10621 encodes post-conversion attribute information, which is attribute information converted by the conversion unit 10610.
  • The additional information encoding unit 10622 encodes additional information including conversion information such as scale values and offset values output by the conversion unit 10610.
  • When the format of the input attribute information has negative values, the conversion unit 10610 adds an offset value to the attribute information to convert the attribute information into a positive value.
  • When the format of the input attribute information is not an integer, the scaling unit 10611 multiplies the input attribute information (more specifically, the numerical value indicated by the input attribute information) by the scale value to convert the attribute information into an integer value.
  • For example, when the encoding unit 10620 supports encoding of 8-bit unsigned integer type (positive integer) attribute information and the input attribute information is a 32-bit signed floating point number within the range [-1, 1], the attribute information is first converted into scaled_attribute, which is an 8-bit signed integer value in the range [-127, 128], by scaling and then rounding to the nearest integer, rounding down, or rounding up.
  • Here, scale is an example of a scale value, and is the value by which the value indicated by the attribute information is multiplied.
  • Next, the scaled attribute information is converted to an 8-bit unsigned integer type in the range [0, 255] by the offset.
  • offset_attribute = scaled_attribute + offset.
  • The offset value and/or scale value, which are the conversion information used for the conversion, are input to the encoding unit 10620 and encoded as additional information.
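  • The scale-then-offset conversion described above can be sketched as follows. The concrete values scale = 127 and offset = 127, the function name, and the dictionary used to carry the conversion information are assumptions; the description does not state which values are used. Note that Python's `round` rounds halves to the nearest even integer, which is only one of the rounding options mentioned.

```python
def convert_attribute(values, scale=127, offset=127):
    """Convert floats in [-1, 1] to unsigned 8-bit integers.

    Returns the converted values together with the conversion information
    that would be encoded as additional information.
    """
    converted = []
    for value in values:
        scaled_attribute = round(value * scale)       # signed integer
        offset_attribute = scaled_attribute + offset  # non-negative integer
        converted.append(offset_attribute)
    conversion_info = {"scale": scale, "offset": offset}
    return converted, conversion_info


converted, info = convert_attribute([-1.0, -0.25, 0.0, 1.0])
```

  • With these assumed values, every input in [-1, 1] maps into [0, 254], which fits the unsigned 8-bit range [0, 255].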
  • The additional information encoding unit 10622 may encode the conversion information as-is as additional information, or may encode, as additional information, information from which the conversion information can be derived.
  • The information from which the offset value can be derived and the information from which the scale value can be derived may be indicated independently, or may be indicated by common information.
  • For example, the encoding unit 10620 stores the value of N (an integer equal to or greater than 1) in the additional information and encodes it; that is, it encodes additional information indicating the value of N as conversion information.
  • N may be predetermined as indicating the number of bits of the unsigned integer type.
  • In that case, information indicating that N denotes the number of bits of the unsigned integer type need not be included in the additional information.
  • The conversion unit 10610 may determine each of the offset and scale (offset value and scale value) based on the values and features indicated by the attribute information forming the three-dimensional point group.
  • The scaling unit 10611 may convert the value of the attribute information after scaling into an integer by rounding to the nearest integer, rounding down, or rounding up.
  • The conversion unit 10610 does not need to convert the attribute information when the value indicated by the attribute information is a positive integer and does not need to be converted.
  • In this case, the conversion unit 10610 need not output the scale value and the offset value, or may output, as conversion information, information indicating that conversion was not performed.
  • In this case, the encoding unit 10620 encodes attribute information that has not been converted by the conversion unit 10610.
  • FIG. 34 is a block diagram showing the configuration of the three-dimensional data decoding device according to this embodiment. Note that FIG. 34 omits illustration of a decoding unit that decodes encoded position information, which is provided in the three-dimensional data decoding device.
  • The three-dimensional data decoding device 10630 includes a decoding unit 10640 and an inverse transform unit 10650.
  • The decoding unit 10640 receives the encoded attribute information and the encoded additional information, and decodes them.
  • The decoding unit 10640 has an attribute information decoding unit 10641 and an additional information decoding unit 10642.
  • The attribute information decoding unit 10641 generates decoded attribute information by decoding the encoded attribute information.
  • The additional information decoding unit 10642 extracts conversion information indicating offset values, scale values, and the like by decoding the encoded additional information.
  • The inverse transform unit 10650 performs inverse transform processing on the decoded attribute information based on the conversion information.
  • The inverse transform processing is at least one of inverse offset (inverse offset processing) and inverse scaling (inverse scaling processing), which will be described later.
  • The inverse transform unit 10650 has an inverse offset unit 10651 and an inverse scaling unit 10652.
  • The inverse offset unit 10651 performs an inverse offset, which is an example of inverse transform processing, on the decoded attribute information using the offset value extracted from the conversion information.
  • That is, the inverse offset unit 10651 performs, on the decoded attribute information, a conversion that is the reverse of the conversion performed on the attribute information by the conversion unit 10610 (more specifically, the offset unit 10612).
  • For example, the inverse offset unit 10651 subtracts the offset value from the value indicated by the decoded attribute information when the conversion unit 10610 has added the offset value to the value indicated by the attribute information.
  • The inverse scaling unit 10652 performs inverse scaling, which is another example of inverse transform processing, on the inversely offset decoded attribute information using the scale value extracted from the additional information. That is, the inverse scaling unit 10652 performs, on the decoded attribute information, a conversion that is the reverse of the conversion performed on the attribute information by the conversion unit 10610 (more specifically, the scaling unit 10611). For example, when the conversion unit 10610 has multiplied the value indicated by the attribute information by the scale value, the inverse scaling unit 10652 divides the value indicated by the decoded attribute information by the scale value.
  • In scaling and inverse scaling, the processing amount may be reduced by expressing the scale value as a power of 2 or the like and using a shift operation (bit shift) instead of multiplication and/or division. That is, scaling and inverse scaling are processes of performing at least one of multiplication, division, and shift operations on the value indicated by the attribute information.
  • Because the inverse transform unit 10650 included in the 3D data decoding device 10630 performs the inverse transform processing based on the conversion information included in the encoded data, it becomes possible to reproduce the attribute information as it was before being converted by the conversion unit 10610 included in the 3D data encoding device 10600.
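  • When the scale value is a power of 2, scaling and inverse scaling of integer values reduce to shifts, as in the sketch below. The function names and the convention of signaling the exponent `shift` rather than the scale value itself are assumptions for illustration.

```python
def scale_by_shift(value: int, shift: int) -> int:
    """Multiply an integer by 2**shift using a left shift."""
    return value << shift


def inverse_scale_by_shift(value: int, shift: int) -> int:
    """Divide an integer by 2**shift using a right shift (rounds toward -inf)."""
    return value >> shift


scaled = scale_by_shift(5, 3)
restored = inverse_scale_by_shift(scaled, 3)
```

  • The round trip is exact for values produced by `scale_by_shift`; for other inputs the right shift discards the low-order bits.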
  • the three-dimensional data decoding device 10630 does not necessarily have to perform the inverse transform process, and may select whether or not to perform the inverse transform process based on the application or use case.
  • the order is such that the offset unit 10612 is positioned after the scale unit 10611 (at the rear stage), and in the inverse transform unit 10650, the reverse scale unit 10652 is positioned after the reverse offset unit 10651.
  • the transform unit 10610 may be configured such that the scale unit 10611 is positioned after the offset unit 10612
  • the inverse transform unit 10650 may be configured such that the inverse offset unit 10651 is positioned after the inverse scale unit 10652.
  • the three-dimensional data encoding device 10600 selects which configuration to use based on the type of attribute information (attribute_type).
  • information (order information) indicating the order in which the scale and offset are executed may be stored as a flag or the like in the additional information and transmitted to the three-dimensional data decoding device 10630 .
  • the 3D data decoding device 10630 may select the order in which the inverse scaling and the inverse offset are performed based on the order information, and perform the inverse transform processing on the decoded attribute information in the selected order.
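  • The dependence of the inverse transform on the signalled order can be sketched as follows. This is an illustration under stated assumptions, not the normative processing: the boolean scale_first stands in for the order information, and scaling is taken to be a multiplication.

```python
def transform(value, scale, offset, scale_first=True):
    """Encoder side: apply scale and offset in the signalled order."""
    if scale_first:
        return value * scale + offset    # scale unit, then offset unit
    return (value + offset) * scale      # offset unit, then scale unit


def inverse_transform(value, scale, offset, scale_first=True):
    """Decoder side: undo the two steps in the reverse order."""
    if scale_first:
        return (value - offset) / scale  # inverse offset, then inverse scale
    return value / scale - offset        # inverse scale, then inverse offset


a = transform(0.5, 4, 3, scale_first=True)    # 0.5 * 4 + 3
b = transform(0.5, 4, 3, scale_first=False)   # (0.5 + 3) * 4
```

  • Because the two orders give different transformed values for the same scale and offset, the decoding side needs to know which order was used in order to restore the original value.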
  • FIG. 35 is a diagram showing a first example of SPS (Sequence parameter set) syntax according to the present embodiment.
  • the SPS indicates the SPS identifier (sps_idx) and additional information (common_information()) related to the entire sequence.
  • the SPS includes an attribute information identifier (attribute_type), the number of dimensions of the attribute information (num_dimension), an identifier (instance_id) for identifying instances of the same attribute type, and the like.
  • attribute_info( ) indicates other additional information related to attribute information.
  • 3D point cloud data may have no attribute information corresponding to position information, or may have one or more attribute information corresponding to position information.
  • When the three-dimensional point cloud data has a plurality of pieces of attribute information corresponding to one piece of position information, the three-dimensional data encoding device 10600 generates transformation information corresponding to each piece of attribute information and stores the generated transformation information in the side information (in other words, generates side information including the transformation information).
  • the number of dimensions of the attribute information is 1 or more, and if the number of dimensions is 2 or more, the conversion information may be common to all dimensions. Of course, even when the number of dimensions of attribute information is two or more, conversion information may be individually set for all dimensions.
  • For example, when the attribute information is composed of three-dimensional color information and one-dimensional reflectance, common conversion information may be applied to R (Red), G (Green), and B (Blue) in the color information, and conversion information different from that of the color information may be applied to the reflectance.
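  • A minimal sketch of conversion information that is shared across the dimensions of one attribute but differs between attributes is given below. The TransformInfo class, the dictionary keys, and the numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformInfo:
    """One scale/offset pair, shared by every dimension of an attribute."""
    scale: float
    offset: float


def apply(values, info):
    """Apply the same conversion information to all dimensions."""
    return [v * info.scale + info.offset for v in values]


# Hypothetical values: one entry per attribute, not per dimension.
transform_info = {
    "color": TransformInfo(scale=0.5, offset=0.0),         # common to R, G, B
    "reflectance": TransformInfo(scale=2.0, offset=-1.0),  # separate entry
}

point = {"color": [255, 128, 0], "reflectance": [10]}
converted = {name: apply(vals, transform_info[name]) for name, vals in point.items()}
```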
  • the conversion information may be stored in additional information common to the plurality of specific attribute information.
  • conversion information may be indicated for each type of attribute information on the assumption that common conversion information is used in a plurality of instances for the same attribute type (attribute_type).
  • the conversion information is not used and not indicated in the additional information in the case of a specific attribute type.
  • FIG. 36 is a diagram showing a configuration example of a bitstream according to this embodiment.
  • the conversion information is stored, for example, in the SPS.
  • the conversion information does not have to be stored in the SPS, and may be stored in, for example, a parameter set (APS/Attribute Parameter Set) related to encoding of attribute information, or other additional information such as a slice header.
  • the conversion information may be stored in additional information such as SEI (Supplemental Enhancement Information).
  • For example, when the attribute information is color information, conversion information is not generated.
  • When the attribute information is information indicating a normal vector, conversion information is generated and included in the bitstream together with the SPS or APS.
  • Attribute0 included in the bitstream is color information (Color), which is an example of attribute information
  • Attribute1 included in the bitstream is normal vector information (Normal_Vector), which is another example of attribute information
  • transform_flag may be replaced with transform_information_type, and based on transform_information_type, the combination indicating the offset value and scale value may be switched, or the method and/or format indicating the offset and/or scale value may be switched.
  • the bitstream includes information (transform identification information) indicating whether or not transform processing has been performed, such as transform flag information such as transform_flag and transform type information such as transform_information_type.
  • FIG. 37 is a diagram showing a second example of SPS syntax according to the present embodiment.
  • FIG. 38 is a diagram showing a first example of syntax of conversion information according to this embodiment.
  • X, Y, and Z are arbitrary integers. X, Y, and Z may be predetermined values, or information indicating these values may be included in the bitstream and transmitted from the 3D data encoding device 10600 to the 3D data decoding device 10630.
  • a value obtained by subtracting 1 may be stored in the additional information instead of offset_log2 and scale_log2.
  • FIG. 39 is a diagram showing a second example of syntax of conversion information according to the present embodiment.
  • the syntax is switched such that either the offset value or the scale value is indicated, both are indicated, or neither is indicated, based on the type of attribute information (attribute_type).
  • the order information (transform_order) indicating the order of the offset unit 10612 and the scale unit 10611 in the transform unit 10610 included in the three-dimensional data encoding device 10600 may be included in the bitstream as additional information, for example.
  • the three-dimensional data decoding device 10630 can extract a predetermined syntax based on additional information such as transform_type, attribute_type, and transform_order, and apply it to the inverse transformation unit 10650 .
  • FIG. 40 is a flow chart showing the processing procedure of the three-dimensional data encoding device according to this embodiment.
  • the 3D data encoding device 10600 determines whether or not to convert the input attribute information (S10601).
  • the input attribute information is converted (S10602). For example, the 3D data encoding device 10600 offsets and scales the input attribute information.
  • the 3D data encoding device 10600 encodes the additional information including the conversion information and the converted attribute information (S10604). After step S10604, for example, the 3D data encoding device 10600 generates a bitstream containing the encoded information as encoded data and transmits the bitstream to the 3D data decoding device 10630.
  • transform_flag = 0 is set (S10605).
  • the 3D data encoding device 10600 encodes additional information that does not include conversion information and attribute information that has not undergone conversion processing, that is, the input attribute information (S10606). After step S10606, for example, the 3D data encoding device 10600 generates a bitstream containing the encoded information as encoded data, and transmits it to the 3D data decoding device 10630.
  • FIG. 41 is a flow chart showing the processing procedure of the three-dimensional data decoding device according to this embodiment.
  • the three-dimensional data decoding device 10630 receives the bitstream transmitted by the three-dimensional data encoding device 10600, decodes the encoded data contained in the received bitstream, and analyzes the additional information included in the decoded data (S10611).
  • the 3D data decoding device 10630 decodes the attribute information of the encoded data included in the bitstream (S10612).
  • When determining that the transform_flag included in the bitstream is set to 1 (Yes in S10613), the 3D data decoding device 10630 extracts transformation information from the additional information and executes inverse transformation processing on the decoded attribute information based on the extracted transformation information (S10614).
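  • The encoding flow of FIG. 40 and the decoding flow of FIG. 41 can be sketched together as follows. This is a simplified illustration: a Python dictionary stands in for the bitstream, entropy coding is omitted, the conversion is assumed to be an integer scale followed by an offset, and the function names are hypothetical.

```python
def encode(attributes, do_transform, scale=1, offset=0):
    """Return a dict standing in for the bitstream (no entropy coding)."""
    if do_transform:                                  # Yes in S10601
        converted = [a * scale + offset for a in attributes]    # S10602
        additional_info = {"transform_flag": 1, "scale": scale, "offset": offset}
    else:
        converted = list(attributes)
        additional_info = {"transform_flag": 0}       # S10605
    return {"additional_info": additional_info, "attributes": converted}


def decode(bitstream):
    """Undo the conversion only when transform_flag is set to 1."""
    info = bitstream["additional_info"]               # S10611
    decoded = list(bitstream["attributes"])           # S10612
    if info["transform_flag"] == 1:                   # Yes in S10613
        return [(a - info["offset"]) // info["scale"] for a in decoded]  # S10614
    return decoded


source = [3, 7, 12]
stream = encode(source, do_transform=True, scale=4, offset=10)
restored = decode(stream)
```

  • When transform_flag is 0, no conversion information is carried in the additional information and the decoded attribute information is output unchanged.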
  • FIG. 42 is a block diagram for explaining another example of the processing of the 3D data encoding device according to this embodiment.
  • FIG. 43 is a block diagram for explaining another example of processing of the three-dimensional data decoding device according to this embodiment.
  • the three-dimensional data encoding device may store, in the additional information, format information indicating the data format of the attribute information input to the conversion unit 10660 and the format of the attribute information encoded after conversion, and may encode the additional information including the format information using the encoding section 10670.
  • By inversely transforming the decoded attribute information with the inverse transforming unit 10690 based on the format information extracted by the decoding unit 10680, the 3D data decoding device 10630 can reproduce the attribute information as it was before the conversion processing in the transforming unit 10660 included in the 3D data encoding device.
  • the format information is, for example, information indicating data type, number of bits, signed or unsigned.
  • the format information is information such as int8, uint16, and float16. A number such as 8 in int8 indicates the number of bits.
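  • As an illustration of how such a format name carries the data type, the number of bits, and the signedness, the following sketch splits names of the form int8, uint16, or float16 into those three fields. The function name and the returned dictionary layout are assumptions made for this example only.

```python
import re


def parse_format(name: str) -> dict:
    """Split a format name such as 'uint16' into type, bit count, and signedness."""
    match = re.fullmatch(r"(u?)(int|float)(\d+)", name)
    if match is None:
        raise ValueError(f"unsupported format name: {name}")
    unsigned, kind, bits = match.groups()
    if kind == "float" and unsigned:
        raise ValueError("an unsigned float format is not handled in this sketch")
    return {"data_type": kind, "bits": int(bits), "signed": not unsigned}


formats = {name: parse_format(name) for name in ("int8", "uint16", "float16")}
```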
  • the format information may indicate the file format of the point cloud data (for example, ply file, pcd file, las file, txt file, csv file, etc.) before the conversion processing is performed, or may indicate the file format of the point cloud data after the conversion processing is performed.
  • the format information includes at least one of the file format of the point cloud data before the conversion process is executed and the file format of the point cloud data after the conversion process is executed as SEI, which is the extended information.
  • the SEI may include header information included in the respective file format.
  • the file format of the point cloud data includes, for example, the ply format which is a polygon file format, the las format which is a format of LiDAR data by laser survey, and the pcd format which is a point cloud file format.
  • At least one of these file formats and header information may be included in user_data.
  • the 3D data encoding device can encode the attribute information using the format information extracted by the conversion unit 10660.
  • the three-dimensional data decoding device can inversely transform the decoded attribute information using the format information, and can reconstruct header information or the like that is not to be encoded.
  • transform processing such as offset and scale may also be performed on the position information before it is encoded, and inverse transform processing may be performed after the position information is decoded.
  • conversion information or format information may be stored in additional information such as SPS.
  • the three-dimensional data encoding device may have a conversion unit that performs conversion processing on one or both (that is, at least one) of the position information and the attribute information.
  • the three-dimensional data decoding device may have an inverse transformation unit that performs inverse transformation processing on one or both (that is, at least one) of the position information and the attribute information. In such a case, one or both (that is, at least one) of the conversion information of the position information and the conversion information of the attribute information may be included in the additional information.
  • offset, scaling, and quantization have been described as methods for transforming the input point cloud data, but the methods are not limited to these methods, and other transform processing methods may be used.
  • the transformation process may use predetermined linear transformation or non-linear transformation means, such as transformation using a predetermined function or approximation.
  • the additional information may include not only information indicating the format of the attribute information, but also information indicating the order of the point cloud data, information indicating the sorting order, timestamp information, and the like.
  • FIG. 44 is a diagram showing an example of SEI syntax.
  • the SEI may include at least part of the header information of the format according to the format_id.
  • ply_format_info( ) contains ply format header information.
  • las_format_info( ) contains las format header information.
  • the SEI may also include header information for other commonly known point cloud formats.
  • FIG. 45 is a diagram showing a syntax example of ply_format_info().
  • ply_format_info( ) contains the information contained in the ply format header or part of it.
  • ply_format_info( ) includes format_info, element_info, property_num, property_info, property_type, scale, offset, is_property, fill_property, sps_id, and component_id.
  • format_info indicates whether the data is in binary format or text format. Note that if the data is in binary format, ply_format_info( ) may include information indicating whether the data is big endian or little endian.
  • element_info indicates the type of data (for example, vertices, which are the points constituting a three-dimensional point cloud, or three-dimensional polygon data). Note that ply_format_info( ) may include information indicating the number of data items.
  • property_num indicates the number of properties (also called components) included in the ply format.
  • ply_format_info( ) includes property_info, property_type, scale, offset, is_property, fill_property, sps_id, and component_id as information for each property.
  • property_info indicates the type of property and includes the property name or identifier.
  • the types of properties include each element of position information (x, y, and z coordinates in a Cartesian coordinate system, or distance, horizontal angle, and elevation angle in a polar coordinate system), each element of color information (R, G, B), reflectance, or other information elements.
  • property_type indicates a data type such as float, int, or unsigned int.
  • scale indicates a scale value used for scaling processing.
  • offset indicates an offset value used for offset processing.
  • ply_format_info( ) contains information for specifying the compressed data of the encoded data. This information is, for example, sps_id, which is an SPS identifier, and component_id, which is a property (component) identifier.
  • the property is associated with the component_id-th attribute information among the plurality of attribute information described in the SPS (sequence parameter set) having the specified sps_id.
  • ply_format_info() includes fill_property.
  • fill_property indicates a value to be restored as the data value of the property upon decoding.
  • ply_format_info( ) may include fill_property for the number of data.
  • the property data is composed of the decoded data specified by sps_id and component_id. If the property is not to be decoded, the data of the property is configured using the value indicated by fill_property. Also, in the syntax, fill_property indicates an example of a value commonly used for all points included in the point cloud, but it may indicate a different value for each point.
  • ply_format_info() may contain information about other properties as needed.
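  • The reconstruction of property data described above can be sketched as follows. The sketch assumes that is_property marks a property whose data was compressed (by analogy with is_component in point_data_records( )), represents each property as a dictionary, and omits scale and offset processing; the function name and the data layout are hypothetical.

```python
def restore_property(prop, decoded, num_points):
    """Rebuild the data of one ply property after decoding.

    decoded maps (sps_id, component_id) to the list of decoded values.
    """
    if prop["is_property"]:
        # Assumed meaning: the property was compressed, so take the decoded data.
        return list(decoded[(prop["sps_id"], prop["component_id"])])
    fill = prop["fill_property"]
    if isinstance(fill, list):
        return list(fill)            # a different value for each point
    return [fill] * num_points       # one value shared by all points


decoded = {(0, 0): [1.0, 2.0, 3.0]}
props = [
    {"name": "x", "is_property": True, "sps_id": 0, "component_id": 0},
    {"name": "intensity", "is_property": False, "fill_property": 0},
]
restored = {p["name"]: restore_property(p, decoded, 3) for p in props}
```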
  • FIG. 46 is a diagram showing a syntax example of las_format_info().
  • las_format_info( ) contains the information contained in the las format header.
  • las_format_info( ) includes public_header_block(), variable_length_records(), point_data_records(), and extended_variable_length_records().
  • public_header_block() contains header information about data stored in Point Data Records in las format.
  • variable_length_records() contains arbitrary variable length header information stored in Variable Length Records in las format.
  • point_data_records( ) contains information about the position information or attribute information of the point cloud stored in Point Data Records in las format.
  • extended_variable_length_records( ) contains extension information stored in the Extended Variable Length Records at the end of the las format.
  • FIG. 47 is a diagram showing a syntax example of public_header_block( ).
  • public_header_block() may contain the information contained in the las format Public Header Block as it is, or may contain a part of it.
  • FIG. 48 is a diagram showing a syntax example of variable_length_records().
  • variable_length_records( ) may contain the information contained in the las format Variable Length Record Header as it is, or may contain a part of it.
  • FIG. 49 is a diagram showing a syntax example of point_data_records().
  • the point_data_records may contain information about the position information or attribute information of the point cloud contained in the las-format Point Data Records.
  • point_data_records() includes component_num, component_info, scale, offset, is_component, and fill_component.
  • component_num indicates the number of properties (components) at one point included in the las format.
  • point_data_records( ) includes information (component_info, scale, offset, is_component, fill_component) conforming to the Point Data Record Format defined in the las format as information for each property. Note that point_data_records( ) may include information about all properties, or may include information about some properties.
  • component_info is information indicating the correspondence between properties and actual data, and specifically includes property names or property IDs (identifiers).
  • point_data_records( ) may contain information for each property, the number of which is indicated by component_num. For example, this information includes scale, offset, is_component, and fill_component.
  • scale indicates a scale value used for scaling processing.
  • offset indicates an offset value used for offset processing.
  • is_component indicates whether the target property is a compression target. If the property of interest is not to be compressed, point_data_records() will contain the fill_component. fill_component indicates a value to be restored as the data value of the property upon decoding. Also, if restoration with different values for each point is required, point_data_records( ) may include fill_components as many as the number of data.
  • point_data_records() may contain information about other properties as needed.
  • FIG. 50 is a diagram showing a syntax example of extended_variable_length_records( ).
  • extended_variable_length_records( ) may contain the information contained in Extended Variable Length Records in las format as it is, or may contain a part of it.
  • FIG. 51 is a block diagram showing the configuration of a three-dimensional data encoding device 13900 according to the first example. Note that this figure mainly shows the processing units related to encoding of attribute information and omits the processing units related to encoding of position information.
  • the three-dimensional data encoding device 13900 includes a transform section 13901 and an encoding section 13902 .
  • the conversion unit 13901 generates post-conversion attribute information by converting the attribute information included in the point cloud data.
  • the conversion section 13901 includes a scale section 13903 and an attribute information accuracy determination section 13904 .
  • the attribute information accuracy determination unit 13904 determines the accuracy of attribute information to be encoded. Specifically, the attribute information accuracy determination unit 13904 determines valid bits from a plurality of bits included in attribute information, and determines a scale value used for division, multiplication, or bit shift for extracting the valid bits.
  • a scaling unit 13903 generates post-conversion attribute information by scaling (multiplying or dividing) the attribute information using the scale value determined by the attribute information accuracy determination unit 13904 .
  • Transformation section 13901 also outputs transformation information including scale values to encoding section 13902 .
  • the encoding unit 13902 includes an attribute information encoding unit 13905 and an additional information encoding unit 13906.
  • the attribute information encoding unit 13905 generates encoded attribute information by encoding post-conversion attribute information.
  • Additional information encoding section 13906 generates encoded additional information by encoding transform information including scale values as additional information (also referred to as metadata or control information).
  • the encoded attribute information and encoded additional information are included in a bitstream (also referred to as an encoded bitstream or encoded data) output by the 3D data encoding device 13900 .
  • FIG. 52 is a block diagram showing the configuration of the three-dimensional data decoding device 13910 according to the first example. Note that this figure mainly shows the processing units related to decoding of attribute information and omits the processing units related to decoding of position information.
  • This three-dimensional data decoding device 13910 decodes a bitstream including encoded attribute information and encoded additional information generated by the three-dimensional data encoding device 13900 shown in FIG. 51, for example.
  • the three-dimensional data decoding device 13910 includes a decoding section 13911 and an inverse transformation section 13912 .
  • the decoding unit 13911 includes an attribute information decoding unit 13913 and an additional information decoding unit 13914.
  • the attribute information decoding unit 13913 generates decoded attribute information by decoding the encoded attribute information.
  • Additional information decoding section 13914 generates transform information including a scale value by decoding the encoded additional information.
  • the inverse transform unit 13912 includes an inverse scale unit 13915.
  • the inverse scaling unit 13915 generates attribute information by inverse scaling the decoded attribute information using the scale value.
  • the conversion unit 13901 examines the positions of valid bits of all attribute information to be encoded (attribute information of points 0 (Point 0) to point n (Point n)).
  • FIGS. 53 and 54 are diagrams showing examples of processing by the conversion unit 13901.
  • the bit depth of attribute information is 16 bits.
  • the attribute information is represented by 16 bits, with the upper 8 bits being valid bits and the lower 8 bits being invalid bits.
  • bit depth may also be expressed as bit width or bit precision.
  • the bit depth of the attribute information is 16 bits
  • the lower 8 bits are valid bits
  • the upper 8 bits are invalid bits.
  • the low-order 8 bits are validated, and post-conversion attribute information with a bit depth of 8 bits is generated.
  • the conversion unit 13901 may generate attribute information with a bit depth of 8 bits by clipping the upper 8 bits and validating the lower 8 bits without performing scaling processing.
  • the conversion unit 13901 converts attribute information with a bit depth of 16 bits into post-conversion attribute information with 8 bits. Therefore, since the encoding section 13902 can handle values smaller than 16-bit attribute information, compression efficiency can be improved.
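  • One way to investigate the valid bit positions and derive a scale value is sketched below. It is an illustration under assumptions: attribute values are non-negative integers, invalid low-order bits are zero in every point, and the function names are hypothetical.

```python
def common_trailing_zero_bits(values):
    """Count the low-order bits that are zero in every value."""
    combined = 0
    for v in values:
        combined |= v                # a bit is set if any point uses it
    if combined == 0:
        return 0                     # all values are zero: nothing to remove
    shift = 0
    while (combined & 1) == 0:
        combined >>= 1
        shift += 1
    return shift


def to_post_conversion(values):
    """Return (post-conversion values, scale value, bit depth actually needed)."""
    shift = common_trailing_zero_bits(values)
    converted = [v >> shift for v in values]
    return converted, 1 << shift, max(converted).bit_length()


# Upper 8 bits valid, lower 8 bits invalid.
upper = to_post_conversion([0x1200, 0xFF00, 0x0100])
# Lower 8 bits valid, upper 8 bits invalid: no scaling is needed.
lower = to_post_conversion([0x0012, 0x00FF])
```

  • In both cases the post-conversion values fit in 8 bits; in the first case the scale value 256 is what would be carried in the conversion information.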
  • the conversion unit 13901 does not need to investigate the accuracy of attribute information. For example, when the position of the effective bit of the attribute information is determined by the format standard of the attribute information to be encoded, the conversion unit 13901 may determine the scale value (scale) according to that rule.
  • the conversion unit 13901 may output, as post-conversion attribute information, 16-bit attribute information in which the upper 8 bits are invalid (filled with 0), without changing the bit depth to 8 bits.
  • the conversion unit 13901 may further include an offset unit that performs offset processing.
  • the offset processing may be performed after the scaling processing, or may be performed before the scaling processing.
  • the transform unit 13901 may change the scale value for each encoding group unit such as slice, tile, or frame. Also, the conversion unit 13901 may change the scale value for each type of attribute information. This allows the conversion unit 13901 to use different scale values between units or types.
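  • A per-slice choice of scale value can be sketched as follows, assuming non-negative integer attribute values and taking as the scale value the largest power of 2 that divides every value in the slice. The slice names and the function name are hypothetical.

```python
from functools import reduce
from operator import or_


def slice_scale(values):
    """Largest power of 2 dividing every value (1 if all values are zero)."""
    combined = reduce(or_, values, 0)
    # combined & -combined isolates the lowest set bit of the OR of all values.
    return (combined & -combined) or 1


slices = {"slice0": [0x1200, 0x3400], "slice1": [0x0010, 0x0030]}
scales = {name: slice_scale(vals) for name, vals in slices.items()}
converted = {name: [v // scales[name] for v in vals] for name, vals in slices.items()}
```

  • Each slice then carries its own scale value, so a slice with coarser data can be reduced further than one with finer data.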
  • FIGS. 55 and 56 are diagrams showing examples of processing by the inverse transformation unit 13912.
  • the bit depth of the decoded attribute information is 8 bits.
  • the bit depth of the decoded attribute information is 8 bits.
  • the inverse transforming unit 13912 may generate attribute information with a bit depth of 16 bits by validating the lower 8 bits without performing scaling processing.
  • the 3D data decoding device 13910 can decode the bitstream generated by the 3D data encoding device 13900 described above to restore the original attribute information. That is, the 3D data decoding device 13910 can decode a bitstream with improved compression efficiency.
  • Because the inverse transform unit 13912 included in the 3D data decoding device 13910 performs the inverse transform processing based on the transform information included in the bitstream, the attribute information as it was before being converted by the transform unit 13901 included in the 3D data encoding device 13900 can be restored.
  • the three-dimensional data decoding device 13910 does not necessarily have to perform the inverse transform process, and may select whether or not to perform the inverse transform process based on the application or use case.
  • the inverse transformation unit 13912 may further include an inverse offset unit that performs inverse offset processing.
  • the inverse scaling process may be performed after or before the inverse offset process.
  • the inverse transform unit 13912 may perform inverse transform by changing the scale value for each encoding group unit such as slice, tile, or frame. Also, the inverse transformation unit 13912 may change the scale value for each type of attribute information. This allows the inverse transform unit 13912 to use different scale values between units or types.
  • FIG. 57 is a diagram showing a syntax example of SPS (sequence parameter set).
  • the SPS includes sps_idx, common_information( ), attribute_type, instance_id, num_dimension, num_attribute_parameter, attribute_parameter(i), and attribute_info( ).
  • sps_idx is an SPS identifier.
  • common_information( ) is additional information related to the entire sequence.
  • attribute_type, instance_id, num_dimension, num_attribute_parameter, and attribute_parameter(i) are set for each attribute information.
  • attribute_type is an identifier of attribute information.
  • instance_id is an identifier for identifying instances of the same attribute type.
  • num_dimension indicates the number of dimensions of attribute information.
  • attribute_info( ) is other additional information related to attribute information.
  • attribute_parameter( ) includes various types of additional information (metadata) for each attribute information and has a general format.
  • num_attribute_parameter indicates the number of attribute_parameter( ).
  • FIG. 58 is a diagram showing a syntax example of attribute_parameter(i).
  • attr_param_type indicates the type of attribute_parameter.
  • attribute_source_offset indicates an offset value for restoring the decoded attribute information to the original (source) attribute information.
  • attribute_source_scale indicates a scale value for restoring the decoded attribute information to the original attribute information.
  • attribute_scale_frac_bits indicates the number of bits for expressing the value below the decimal point of the scale value.
  • a scale value is calculated using this attribute_scale_frac_bits and the following (formula a1).
  • FIG. 59 is a flow chart of three-dimensional data encoding processing by the three-dimensional data encoding device 13900.
  • the 3D data encoding device 13900 determines whether or not to convert attribute information (S13901). For example, the 3D data encoding device 13900 determines whether or not to convert attribute information based on an instruction or setting from the outside. Note that the three-dimensional data encoding device 13900 may determine whether to convert the attribute information based on the point cloud data to be encoded, such as the attribute information to be encoded.
  • the 3D data encoding device 13900 investigates the accuracy and valid bit positions of the attribute information of all 3D points included in the point cloud data to be encoded (S13902). Next, the 3D data encoding device 13900 determines scale values based on the valid bit positions (S13903). Next, the three-dimensional data encoding device 13900 generates post-conversion attribute information by converting the attribute information using the determined scale value (S13904). Next, the 3D data encoding device 13900 stores the transform information including the scale value in the additional information (S13905).
  • the scale value is stored in attribute_source_scale. Specifically, the scale value is indicated using attribute_source_scale_num_bits, attribute_source_scale, and attribute_scale_frac_bits.
  • the 3D data encoding device 13900 encodes the additional information including the scale value and the post-transform attribute information to generate a bitstream (encoded data) (S13906).
  • FIG. 60 is a flowchart of 3D data decoding processing by the 3D data decoding device 13910.
  • the 3D data decoding device 13910 decodes the encoded additional information included in the bitstream and analyzes the obtained additional information (S13911).
  • the 3D data decoding device 13910 generates decoded attribute information by decoding the encoded attribute information included in the bitstream (S13912).
  • If it is determined that the attribute information should be inversely transformed (Yes in S13913), the 3D data decoding device 13910 generates attribute information by inversely transforming the decoded attribute information based on the scale value included in the additional information (S13914). On the other hand, when it is determined not to inversely transform the attribute information (No in S13913), the 3D data decoding device 13910 does not perform the inverse transform and outputs the decoded attribute information as attribute information.
  • FIG. 61 is a block diagram showing the configuration of 3D data encoding apparatus 13900A according to the second example. Note that this figure mainly shows the processing units related to encoding of attribute information and omits the processing units related to encoding of position information.
  • The three-dimensional data encoding device 13900A shown in FIG. 61 differs from the three-dimensional data encoding device 13900 shown in FIG. 51 in the configuration of the conversion unit 13901A. Specifically, the conversion unit 13901A includes an attribute information resolution determination unit 13904A instead of the attribute information accuracy determination unit 13904.
  • the attribute information resolution determination unit 13904A determines resolution of attribute information to be encoded.
  • For example, the attribute information resolution determination unit 13904A determines what multiple of a reference signal the values of the attribute information are, and determines the scale value used for division, multiplication, or bit shifting to extract the reference signal.
  • a scaling unit 13903 generates post-conversion attribute information by scaling (multiplying or dividing) the attribute information using the scale value determined by the attribute information resolution determination unit 13904A.
  • The conversion unit 13901A also outputs conversion information including the scale value to the encoding unit 13902.
  • the encoding unit 13902 includes an attribute information encoding unit 13905 and an additional information encoding unit 13906.
  • the attribute information encoding unit 13905 generates encoded attribute information by encoding post-conversion attribute information.
  • Additional information encoding section 13906 generates encoded additional information by encoding transform information including scale values as additional information.
  • the encoded attribute information and encoded additional information are included in the bitstream output by the three-dimensional data encoding device 13900A.
  • the configuration of the 3D data decoding device 13910 that decodes the bitstream generated by the 3D data encoding device 13900A is the same as the configuration shown in FIG.
  • FIG. 62 is a diagram showing an example of processing by conversion section 13901A.
  • the bit depth of the attribute information is 16 bits, and the attribute information is in increments of 255, that is, multiples of 255.
  • the conversion unit 13901A can convert attribute information with a bit depth of 16 bits into post-conversion attribute information with 8 bits. Therefore, since the encoding section 13902 can handle values smaller than 16-bit attribute information, compression efficiency can be improved.
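  • As an illustrative sketch of the second example (not the method defined in this description), the following Python code derives a scale value as the greatest common divisor of the attribute values and divides by it. The helper names `find_scale` and `scale_down`, the sample values, and the use of a GCD are assumptions made for illustration. The sample values are also assumed to be at most 255 × 255 so that the scaled results fit in 8 bits.

```python
from functools import reduce
from math import gcd


def find_scale(values):
    """Hypothetical helper: largest step shared by all values (1 if all are 0)."""
    return reduce(gcd, values, 0) or 1


def scale_down(values, scale):
    """Divide every value by the scale; exact when all values are multiples of it."""
    return [v // scale for v in values]


# Hypothetical 16-bit attribute values, all multiples of 255.
attrs = [0, 255, 510, 32640, 65025]

scale = find_scale(attrs)                   # scale value stored as additional information
converted = scale_down(attrs, scale)        # post-conversion attribute information
restored = [v * scale for v in converted]   # inverse scaling on the decoder side
```

  • In this sketch the scaling is lossless because every value is an exact multiple of the scale value, so the decoder recovers the original values by multiplication alone.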
  • the conversion unit 13901A does not need to investigate the accuracy of attribute information. For example, when the resolution of the attribute information is known according to the format standard of the attribute information to be encoded, the conversion section 13901A may determine the scale value (scale) according to the rule.
  • Alternatively, the conversion unit 13901A may leave the bit depth at 16 bits rather than changing it to 8 bits, and may output, as the post-conversion attribute information, 16-bit attribute information whose upper 8 bits are invalid (filled with 0).
  • the conversion section 13901A may further include an offset section that performs offset processing.
  • the offset processing may be performed after the scaling processing, or may be performed before the scaling processing.
  • the transformation unit 13901A may change the scale value for each coding group unit such as slice, tile, or frame. Also, the conversion unit 13901A may change the scale value for each type of attribute information. This allows the conversion unit 13901A to use different scale values between units or types.
  • the 3D data decoding device 13910 can decode the bitstream generated by the 3D data encoding device 13900A described above to restore the original attribute information. That is, the 3D data decoding device 13910 can decode a bitstream with improved compression efficiency.
  • The inverse transform unit 13912 included in the 3D data decoding device 13910 performs inverse transform processing based on the transform information included in the bitstream, and can thereby restore the attribute information as it was before conversion by the conversion unit 13901A included in the 3D data encoding device 13900A.
  • the three-dimensional data decoding device 13910 does not necessarily have to perform the inverse transform processing, and may select whether or not to perform the inverse transform processing based on the application or use case.
  • the inverse transformation unit 13912 may further include an inverse offset unit that performs inverse offset processing.
  • the inverse scaling process may be performed after or before the inverse offset process.
  • the inverse transform unit 13912 may perform inverse transform by changing the scale value for each encoding group unit such as slice, tile, or frame. Also, the inverse transformation unit 13912 may change the scale value for each type of attribute information. This allows the inverse transform unit 13912 to use different scale values between units or types.
  • the 3D data encoding device 13900A may store the value by which the inverse transformation unit 13912 divides the attribute information in the additional information.
  • the procedure of processing by the three-dimensional data encoding device 13900A of the second example is generally the same as the flowchart of the first example shown in FIG.
  • the 3D data encoding device 13900A checks the resolution of the attribute information of all 3D points included in the point cloud data to be encoded.
  • the procedure of processing by the three-dimensional data decoding device 13910 of the second example is the same as the flowchart of the first example shown in FIG.
  • FIG. 64 is a block diagram showing the configuration of a three-dimensional data encoding device 13900B according to the third example. Note that this figure mainly shows the processing units related to encoding of attribute information and omits the processing units related to encoding of position information.
  • The three-dimensional data encoding device 13900B shown in FIG. 64 differs from the three-dimensional data encoding device 13900 shown in FIG. Specifically, the conversion unit 13901B includes an attribute information range determination unit 13904B instead of the attribute information accuracy determination unit 13904. In addition, the conversion unit 13901B further includes an offset unit 13907.
  • the attribute information range determination unit 13904B determines the range of values of attribute information to be encoded.
  • the attribute information range determination unit 13904B determines to perform at least one of scaling processing and offset processing when the attribute information value range exceeds the allowable value of the system or when it is determined that the efficiency is poor.
  • In that case, the attribute information range determination unit 13904B determines a scale value (scale) used for scaling and/or an offset value (offset) used for addition or subtraction.
  • the case where the efficiency is poor is the case where the data amount can be reduced by scaling or offset processing, for example, the case where the data amount can be reduced by offset processing as shown in FIG. 67 which will be described later.
  • a scaling unit 13903 generates post-scaling attribute information by scaling (multiplying or dividing) the attribute information using the scale value determined by the attribute information range determination unit 13904B.
  • the offset unit 13907 generates post-conversion attribute information by offsetting (adding or subtracting) post-scaling attribute information using the offset value determined by the attribute information range determination unit 13904B.
  • the encoding unit 13902 includes an attribute information encoding unit 13905 and an additional information encoding unit 13906.
  • the attribute information encoding unit 13905 generates encoded attribute information by encoding post-conversion attribute information.
  • Additional information encoding section 13906 generates encoded additional information by encoding transform information including scale values and offset values as additional information. This encoded attribute information and encoded additional information are included in the bitstream output by the three-dimensional data encoding device 13900B.
  • the configuration of the 3D data decoding device 13910 that decodes the bitstream generated by the 3D data encoding device 13900B is roughly the same as the configuration shown in FIG. However, the inverse transformation unit 13912 generates attribute information by performing inverse offset processing in addition to inverse scaling processing on the decoded attribute information.
  • the conversion unit 13901B investigates the range of values of all attribute information to be encoded (attribute information of point 0 (Point 0) to point n (Point n)).
  • FIGS. 65, 66, and 67 are diagrams showing examples of processing by the conversion unit 13901B.
  • In one example, the bit depth that the system can handle is 16 bits and the maximum bit depth of the attribute information is 17 bits. In this case, the bit depth of the attribute information overflows the range supported by the system by 1 bit.
  • In another example, the bit depth that the system can support is unsigned 16 bits and the minimum attribute information value is -5027.
  • In a further example, the values that the system can handle are unsigned integer values and the attribute information is 64-bit floating point.
  • the three-dimensional data encoding device 13900B can convert attribute information having a value exceeding the range that the system can handle into post-conversion attribute information having a value within the range by scaling processing or offset processing.
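  • The range-fitting idea can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure defined in this description: the function names, the sample values, and the choice to apply the scale as a right shift before the offset are all hypothetical.

```python
U16_MAX = (1 << 16) - 1  # largest value an unsigned 16-bit system can hold


def to_system_range(values, shift=0, offset=0):
    """Scale by a right shift (division by 2**shift), then add an offset."""
    return [(v >> shift) + offset for v in values]


def from_system_range(values, shift=0, offset=0):
    """Inverse: remove the offset, then undo the shift (low bits are not recovered)."""
    return [(v - offset) << shift for v in values]


# Case 1 (assumed values): 17-bit data on a 16-bit system, fitted by a 1-bit shift.
wide = [0, 70000, 131071]
narrow = to_system_range(wide, shift=1)

# Case 2 (assumed values): minimum value -5027 on an unsigned system, fitted by an offset.
signed = [-5027, 0, 1200]
unsigned = to_system_range(signed, offset=5027)
```

  • The offset-only case is reversible, whereas the shift discards the least significant bit, so `from_system_range(narrow, shift=1)` returns an approximation of the original values.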
  • the three-dimensional data encoding device 13900B may perform scaling processing for the purpose of quantization for reducing the amount of data.
  • the larger the scale value the larger the quantization error in the inverse transform, but the amount of data can be reduced accordingly.
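  • The trade-off between scale value and quantization error can be illustrated with a small Python sketch. It assumes the scale value is used as a divisor on the encoder side and as a multiplier on the decoder side. The sample data and function names are hypothetical.

```python
def quantize(values, scale):
    """Encoder side: divide by the scale value and round to an integer."""
    return [round(v / scale) for v in values]


def dequantize(values, scale):
    """Decoder side: multiply by the same scale value."""
    return [v * scale for v in values]


def max_error(values, scale):
    """Largest absolute difference between original and restored values."""
    restored = dequantize(quantize(values, scale), scale)
    return max(abs(a - b) for a, b in zip(values, restored))


samples = list(range(0, 1000, 7))  # hypothetical attribute values
errors = {s: max_error(samples, s) for s in (2, 8, 32)}
bits = {s: max(quantize(samples, s)).bit_length() for s in (2, 8, 32)}
```

  • For these samples, `errors` grows and `bits` shrinks as the scale value increases, which is the behavior described above.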
  • The conversion unit 13901B need not investigate the range of the attribute information. For example, when the range of the attribute information is defined by the format standard of the attribute information to be encoded, the conversion unit 13901B may determine the scale value (scale) and the offset value (offset) according to that standard.
  • The conversion unit 13901B may perform rounding addition when performing processing equivalent to division in the scaling processing.
  • Rounding addition here means rounding, that is, rounding the trailing (least significant) bits down or up.
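  • One common way to realize such rounding is to add half of the divisor before dividing. The following Python sketch shows this for both a bit shift and a general integer division. It assumes non-negative integer values, and the function names are hypothetical rather than taken from this description.

```python
def shift_round_down(value, shift):
    """Plain right shift: the discarded low bits are always rounded down."""
    return value >> shift


def shift_round_nearest(value, shift):
    """Add half of 2**shift before shifting so the result rounds to nearest."""
    if shift == 0:
        return value
    return (value + (1 << (shift - 1))) >> shift


def div_round_nearest(value, divisor):
    """Same idea for an arbitrary positive divisor."""
    return (value + divisor // 2) // divisor
```

  • For example, 23 shifted right by 3 bits gives 2 with plain truncation and 3 with the rounding addition, since 23 / 8 = 2.875.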
  • the transformation unit 13901B may change the scale value and the offset value for each encoding group unit such as slice, tile, or frame. Also, the conversion unit 13901B may change the scale value and the offset value for each type of attribute information. This allows the conversion unit 13901B to use different scale values between units or types.
  • FIGS. 68, 69, and 70 are diagrams showing examples of processing by the inverse transform unit 13912.
  • In the example of FIG. 68, the inverse transform unit 13912 inversely transforms the decoded attribute information, which has a bit depth of 16 bits, using the scale value included in the additional information.
  • In another example, the bit depth of the decoded attribute information is unsigned 16 bits.
  • In a further example, the bit depth of the decoded attribute information is unsigned 8 bits.
  • the three-dimensional data encoding device 13900B can convert attribute information having a value exceeding the range that the system can handle into post-conversion attribute information having a value within the range by scaling processing or offset processing. Also, the 3D data decoding device 13910 can decode the compressed data generated in this way and restore the original attribute information.
  • scaling processing may be performed for the purpose of quantization to reduce the amount of data.
  • the larger the scale value the larger the quantization error of the attribute information restored by inverse transformation, but the amount of data can be reduced accordingly.
  • The inverse transform unit 13912 included in the three-dimensional data decoding device 13910 performs inverse transform processing based on the transform information included in the bitstream, and can thereby restore the attribute information as it was before conversion by the conversion unit 13901B included in the three-dimensional data encoding device 13900B.
  • the three-dimensional data decoding device 13910 does not necessarily have to perform the inverse transform processing, and may select whether or not to perform the inverse transform processing based on the application or use case.
  • the three-dimensional data decoding device 13910 may perform rounding addition when performing processing equivalent to division in the inverse scaling processing.
  • the inverse scaling process may be performed after or before the inverse offset process.
  • the inverse transform unit 13912 may perform inverse transform by changing the scale value and the offset value for each encoding group unit such as slice, tile, or frame. Also, the inverse transformation unit 13912 may change the scale value and the offset value for each type of attribute information. This allows the inverse transform unit 13912 to use different scale values and offset values between units or types.
  • FIG. 71 is a block diagram showing the configuration of a 3D data encoding device 13900C according to the fourth example. Note that this figure mainly shows the processing units related to encoding of attribute information and omits the processing units related to encoding of position information.
  • a three-dimensional data encoding device 13900C shown in FIG. 71 differs from the three-dimensional data encoding device 13900 shown in FIG. Specifically, conversion section 13901C further includes scale value generation section 13908 .
  • The scale value generation unit 13908 generates, from the decimal-precision scale value determined by the attribute information accuracy determination unit 13904, the value of the integer part of the scale value, the value of the decimal part of the scale value, and the offset value of the decimal part of the scale value.
  • The additional information encoding unit 13906 encodes, as additional information, the transform information including the value of the integer part of the scale value, the value of the decimal part of the scale value, and the offset value of the decimal part of the scale value generated by the scale value generation unit 13908. This additional information is included in the bitstream and sent to the 3D data decoding device.
  • FIG. 72 is a block diagram showing the configuration of a 3D data decoding device 13910C according to the fourth example. Note that this figure mainly shows the processing units related to decoding of attribute information and omits the processing units related to decoding of position information.
  • a three-dimensional data decoding device 13910C shown in FIG. 72 differs from the three-dimensional data decoding device 13910 shown in FIG.
  • The side information decoding unit 13914 decodes the encoded side information included in the bitstream to generate the value of the integer part of the scale value, the value of the decimal part of the scale value, and the offset value of the decimal part of the scale value.
  • the inverse scale unit 13915C calculates the scale value using the value of the integer part of the scale value, the value of the decimal part of the scale value, and the offset value of the decimal part of the scale value.
  • the inverse scaling unit 13915C generates attribute information by inverse scaling the decoded attribute information using the calculated scale value.
  • The three-dimensional data encoding device 13900C and the three-dimensional data decoding device 13910C may always use one of (formula a2) and (formula a3), or may switch between (formula a2) and (formula a3) selectively.
  • In this way, the three-dimensional data encoding device 13900C can handle attribute information with a bit depth of 16 bits as attribute information with a bit depth of 8 bits. Therefore, compression efficiency can be improved. Also, the 3D data encoding device 13900C can perform scaling that requires precision after the decimal point.
  • the 3D data decoding device 13910C can decode the bitstream generated by the 3D data encoding device 13900C to restore the original attribute information.
  • The inverse transform unit 13912C included in the three-dimensional data decoding device 13910C performs inverse transform processing on the decoded attribute information based on the transform information included in the bitstream, and can thereby restore the attribute information as it was before conversion by the conversion unit 13901C. Note that the three-dimensional data decoding device 13910C does not necessarily have to perform the inverse transform processing, and may select whether or not to perform it based on the application or use case.
  • The above describes an example in which the three-dimensional data encoding device 13900C includes the attribute information accuracy determination unit 13904 that determines the accuracy of the attribute information to be encoded, as in the first example.
  • However, the three-dimensional data encoding device 13900C may instead include the attribute information resolution determination unit 13904A for determining the resolution of the attribute information or the attribute information range determination unit 13904B for determining the range of the attribute information.
  • the conversion unit 13901C does not have to investigate the accuracy of the attribute information. For example, when the position of the effective bit of the attribute information is determined by the format standard of the attribute information to be encoded, the conversion unit 13901C may determine the scale value (scale) according to that rule.
  • Alternatively, the conversion unit 13901C may leave the bit depth at 16 bits rather than changing it to 8 bits, and may output, as the post-conversion attribute information, 16-bit attribute information whose upper 8 bits are invalid (filled with 0).
  • the conversion unit 13901C may further include an offset unit that performs offset processing.
  • the offset processing may be performed after the scaling processing, or may be performed before the scaling processing.
  • the conversion unit 13901C may change the scale value for each encoding group unit such as slice, tile, or frame. Also, the conversion unit 13901C may change the scale value for each type of attribute information. This allows the conversion unit 13901C to use different scale values between units or types.
  • the inverse transformation unit 13912C may further include an inverse offset unit that performs inverse offset processing.
  • the inverse scaling process may be performed after or before the inverse offset process.
  • the inverse transform unit 13912C may perform inverse transform by changing the scale value for each encoding group unit such as slice, tile, or frame. Also, the inverse transformation unit 13912C may change the scale value for each type of attribute information. This allows the inverse transform unit 13912C to use different scale values between units or types.
  • any two or more of the above first to fourth examples may be combined. Any one of the first to fourth examples, or a combination of two or more of them, may be applied to the position information as well as the attribute information. In that case, conversion information, format information, or the like may be stored in additional information such as SPS.
  • the conversion unit may include an offset calculation unit.
  • This offset calculation unit calculates the offset component of the entire attribute information.
  • the conversion unit may offset the attribute information using the calculated offset component (offset value).
  • the offset calculation unit may calculate the offset component using the attribute information of the entire sequence, or may calculate the offset component for each frame using the attribute information within the frame.
  • FIG. 73 is a diagram showing an example of this conversion processing.
  • In this example, the attribute information is 64-bit time information (time), and the conversion unit calculates the offset component for each frame.
  • One such piece of time information is gps_time.
  • The time may be the time when the point was acquired, or may be difference information with respect to any reference.
  • The conversion unit calculates the component 2312000 common to all points as the offset component, and offsets the scaled time information by this offset component.
  • As a result, the attribute information can be converted into 8-bit positive information, and the code amount can be reduced.
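  • The offset calculation for time information can be sketched in Python as follows. Only the common component 2312000 comes from the example above. The per-point time values, the helper name `common_offset`, and the rule of rounding the minimum down to a multiple of 1000 are assumptions made for illustration.

```python
def common_offset(values, step):
    """Hypothetical rule: largest multiple of `step` not exceeding the minimum value."""
    return (min(values) // step) * step


# Hypothetical scaled time values of the points in one frame.
frame_times = [2312003, 2312047, 2312120, 2312255]

offset = common_offset(frame_times, 1000)        # component common to all points
residuals = [t - offset for t in frame_times]    # values that are actually encoded
restored = [r + offset for r in residuals]       # decoder adds the offset back
```

  • With these assumed inputs every residual lies in the range 0 to 255, so it can be represented as an 8-bit positive value, and the offset is transmitted once per frame as additional information.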
  • When the value of the attribute information monotonically increases or decreases, the three-dimensional data encoding device may encode the difference value from the previous point. Further, when all the difference values are the same, the three-dimensional data encoding device may include the difference value in the additional information and transmit it, without encoding the attribute information for each point. That is, the 3D data encoding device does not have to include the attribute information for each point in the bitstream.
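  • A minimal Python sketch of this difference coding is shown below. The dictionary layout standing in for the bitstream and additional information, and the function names, are hypothetical. The input list is assumed to be non-empty.

```python
def to_deltas(values):
    """Differences between each point and the previous point."""
    return [b - a for a, b in zip(values, values[1:])]


def encode(values):
    deltas = to_deltas(values)
    if deltas and len(set(deltas)) == 1:
        # All differences are equal: send one delta as additional information
        # and omit the per-point attribute values entirely.
        return {"first": values[0], "count": len(values), "delta": deltas[0]}
    return {"first": values[0], "deltas": deltas}


def decode(payload):
    if "delta" in payload:
        return [payload["first"] + i * payload["delta"] for i in range(payload["count"])]
    out = [payload["first"]]
    for d in payload["deltas"]:
        out.append(out[-1] + d)
    return out
```

  • For instance, `encode([100, 110, 120, 130])` keeps only the first value, the point count, and the single difference 10, and `decode` reconstructs all four values from them.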
  • Attribute_parameter(i) which is transform information including the scale value and the offset value described above, may be stored in the SPS or may be stored outside the SPS.
  • The conversion information may also be stored in an APS (Attribute Parameter Set), which is a parameter set related to encoding of attribute information, or in other additional information such as a slice header or SEI (Supplemental Enhancement Information).
  • FIG. 74 is a diagram showing a syntax example of SPS and SEI.
  • SEI includes sps_idx and attribute_idx. This allows the three-dimensional data decoding device to identify the corresponding attribute information described in the SPS.
  • attribute_idx indicates that the SEI corresponds to the attribute information indicated in the attribute_idx-th position among the plurality of attribute information indicated in the numAttribute loop in the SPS.
  • That is, it can be determined that the attribute information to be processed that corresponds to the SEI is the attribute_idx-th attribute information among the plurality of pieces of attribute information indicated in the numAttribute loop in the SPS having the same sps_idx as the sps_idx included in the SEI.
  • When conversion information is included in the SPS, the conversion information is applied to the values of the attribute information of the entire sequence. When conversion information exists in the frame-level (per frame) SEI provided for each frame, the conversion information may be applied to the values of the attribute information of that frame.
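  • The lookup from an SEI to its attribute information can be sketched as below. Only the field names sps_idx and attribute_idx come from the syntax described above. The dictionary layout, the `attributes` list standing in for the numAttribute loop, and the function name are assumptions made for illustration.

```python
def find_attribute(sps_list, sei):
    """Return the attribute entry that the SEI refers to."""
    for sps in sps_list:
        if sps["sps_idx"] == sei["sps_idx"]:
            # attribute_idx selects the entry within the SPS attribute loop.
            return sps["attributes"][sei["attribute_idx"]]
    raise LookupError("no SPS matches the sps_idx carried in the SEI")


# Hypothetical parameter sets; "attributes" mirrors the numAttribute loop.
sps_list = [
    {"sps_idx": 0, "attributes": ["color", "reflectance"]},
    {"sps_idx": 1, "attributes": ["color", "normal", "time"]},
]
sei = {"sps_idx": 1, "attribute_idx": 2, "scale": 4, "offset": 10}

target = find_attribute(sps_list, sei)
```

  • Here the SEI points at the third attribute of the SPS with sps_idx 1, so its scale and offset would be applied to the attribute named "time" in this hypothetical setup.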
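  • When conversion information exists in both the SPS and the frame-level SEI, the conversion information indicated in the frame-level SEI may be applied.
  • Alternatively, both the conversion information indicated in the SPS and the conversion information indicated in the frame-level SEI may be applied. Note that the frame-level SEI including conversion information need not be provided for all frames and may be provided for only some frames.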
  • the 3D data encoding apparatus may store in the bitstream information specifying which of the SPS and the frame-level SEI is to be prioritized.
  • the three-dimensional data encoding device may set the transform coefficient of the SPS to an average value of multiple transform coefficients of multiple SEIs.
  • In this way, when the SEI is lost due to a communication error or the like, the three-dimensional data decoding device can perform processing using the conversion information included in the SPS instead. Also, the difference in processing results in this case can be reduced.
  • a PS may be provided with levels, such as a frame-level PS (parameter set), a sequence-level PS, and a PCC sequence-level PS containing multiple sequences.
  • Here, the PCC sequence level is the highest level, the sequence level is the next highest level, and the frame level is the lowest level.
  • the transformation information may be stored in the following manner.
  • the default conversion information is stored in the higher PS.
  • Default transform information is transform information that is used when no transform information is used in the lower PS, and is transform information that may be used in a plurality of lower levels. Also, if the transform information used at the lower level is different from the transform information contained in the higher PS, the transform information is stored in the lower PS.
  • the conversion information may not be stored in the upper PS, and the conversion information may be stored in the lower PS.
  • the information included in the lower PS may be included in the higher PS.
  • When the information in the lower PS and the information in the higher PS overlap, one of them need not be included in the bitstream.
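  • The default-and-override storage across PS levels can be sketched as follows. This Python illustration assumes that a decoder searches from the lowest level upward and uses the first conversion information it finds. The level names, dictionary layout, and function name are hypothetical.

```python
# Levels ordered from lowest to highest, as described above.
LEVELS = ("frame", "sequence", "pcc_sequence")


def resolve_transform(parameter_sets):
    """Return the conversion information from the lowest level that stores one."""
    for level in LEVELS:
        info = parameter_sets.get(level)
        if info is not None:
            return info
    return None  # no conversion information at any level


default = {"scale": 2, "offset": 0}     # stored once in a higher PS
override = {"scale": 4, "offset": 16}   # stored in a lower PS only when it differs
```

  • With this scheme a frame that carries no conversion information of its own inherits the default from the higher PS, and a frame-level entry takes effect only where the values differ.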
  • FIG. 75 is a diagram showing a storage example of conversion information.
  • the example shown in (A) of FIG. 75 is an example in which conversion information exists in the SPS and no conversion information exists in the SEI.
  • Here, val1 is the decoded value (decoded attribute information) of the attribute information belonging to frame 1, and val2 is the decoded value (decoded attribute information) of the attribute information belonging to frame 2.
  • The post-conversion attribute information val1′ and val2′ can be derived as follows using the SPS conversion information (offset@sps and scale@sps).
  • val1′ = val1 × scale@sps + offset@sps
  • val2′ = val2 × scale@sps + offset@sps
  • (B) of FIG. 75 is an example in which there is no conversion information in the SPS and there is conversion information in the frame-level SEI.
  • The post-conversion attribute information val1′ and val2′ can be derived as follows using the conversion information of SEI1 of frame 1 (offset@SEI1 and scale@SEI1) and the conversion information of SEI2 of frame 2 (offset@SEI2 and scale@SEI2).
  • val1′ = val1 × scale@SEI1 + offset@SEI1
  • val2′ = val2 × scale@SEI2 + offset@SEI2
  • The post-conversion attribute information val1′ and val2′ can be derived as follows using both the SPS conversion information and the SEI conversion information.
  • val1′ = val1 × scale@SPS × scale@SEI1 + offset@SPS + offset@SEI1
  • val2′ = val2 × scale@SPS × scale@SEI2 + offset@SPS + offset@SEI2
  • the SPS contains sequence-level transform information, and the presence or absence of frame-level transform information changes for each frame.
  • the frame-level transform information is applied to the value of the attribute information of the frame.
  • sequence-level transform information is applied to frames for which frame-level transform information does not exist.
  • The post-conversion attribute information val1′ of frame 1, for which no frame-level SEI exists, can be derived as follows using the SPS conversion information.
  • val1′ = val1 × scale@SPS + offset@SPS
  • The post-conversion attribute information val2′ of frame 2, for which a frame-level SEI exists, can be derived as follows using the SEI conversion information.
  • val2′ = val2 × scale@SEI2 + offset@SEI2
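  • The derivations above can be gathered into one Python sketch. The function name `inverse_transform`, the representation of conversion information as `(scale, offset)` tuples, and the `combine` flag are assumptions made for illustration. The arithmetic follows the formulas given for FIG. 75.

```python
def inverse_transform(val, sps=None, sei=None, combine=False):
    """Derive the post-conversion value from a decoded value.

    sps and sei are (scale, offset) tuples, or None when that level
    carries no conversion information.
    """
    if combine and sps is not None and sei is not None:
        # Both levels applied: scales multiply and offsets add.
        return val * sps[0] * sei[0] + sps[1] + sei[1]
    # Otherwise the frame-level SEI takes priority over the SPS.
    chosen = sei if sei is not None else sps
    if chosen is None:
        return val
    scale, offset = chosen
    return val * scale + offset
```

  • For example, with a hypothetical SPS entry `(2, 5)` and no SEI, a decoded value of 10 becomes 25. If the frame also has an SEI entry `(3, 1)`, the result is 31 when the SEI takes priority and 66 when both are applied.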
  • Suppose it is specified that the encoding unit and the decoding unit for position information or attribute information support encoding in a predetermined format (e.g., a positive integer with a predetermined number of bits or less), while the point cloud data uses one or more defined point cloud formats.
  • In this case, the 3D data encoding device notifies the 3D data decoding device of format information using SEI, which is extended information.
  • the three-dimensional data decoding device can use this format information to convert the decoded point cloud data into point cloud data in the point cloud format before encoding.
  • The 3D data encoding device stores format information for each point cloud format in the SEI, so that the 3D data decoding device can restore the point cloud data of each point cloud format.
  • For example, when the attribute information to be encoded is a normal vector, that is, three-dimensional data in the range of -1.000 to 1.000 expressed in floating point, the conversion unit 13901B and the inverse conversion unit 13912 of the third example are used to convert the attribute information into positive integers and to inversely convert the post-conversion attribute information.
  • This makes it possible to realize point cloud data compression processing by the encoding unit 13902 and the decoding unit 13911, which support processing of positive integers of a predetermined number of bits or less.
  • In other cases, the conversion unit 13901 and the inverse conversion unit 13912 of the first example are used, whereby point cloud data compression processing using the encoding unit 13902 and the decoding unit 13911 can be realized.
  • When the attribute information to be encoded consists of multiples of a predetermined value, such as reflectance in which all values of the attribute information are multiples of 255, using the conversion unit 13901A of the second example and the inverse conversion unit 13912 makes it possible to realize point cloud data compression processing using the encoding unit 13902 and the decoding unit 13911.
  • Compression processing of point cloud data using the encoding unit and the decoding unit can thereby be realized.
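  • The normal-vector case mentioned above (floating-point components in the range of -1.000 to 1.000) can be sketched in Python as follows. The choice of applying an offset of 1.0 before a scale of 1000, and the function names, are assumptions made for illustration and are not values specified in this description.

```python
def normal_to_int(component, frac_digits=3):
    """Map a component in [-1.0, 1.0] to a non-negative integer.

    The offset (+1.0) is applied before the scale (10**frac_digits).
    """
    scale = 10 ** frac_digits
    return round((component + 1.0) * scale)


def int_to_normal(code, frac_digits=3):
    """Inverse conversion: undo the scale, then undo the offset."""
    scale = 10 ** frac_digits
    return code / scale - 1.0


normal = (0.0, -1.0, 1.0)                        # hypothetical normal vector
codes = [normal_to_int(c) for c in normal]       # integers passed to the encoder
restored = [int_to_normal(c) for c in codes]     # values after inverse conversion
```

  • With three fractional digits the converted values range from 0 to 2000, which fits in 11 bits, and the inverse conversion recovers each component to within the chosen precision.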
  • the three-dimensional data encoding device can reduce the amount of information by offsetting the reference value and encoding the difference from the reference value.
  • the present embodiment is useful for applications that convert various point cloud data formats and encode and decode using general-purpose encoding and decoding methods.
  • For example, in AR (Augmented Reality) or VR (Virtual Reality) applications, a plurality of pieces of color information and normal vectors may be added to points as attribute information. Therefore, by using the above method, efficient encoding and decoding can be realized.
  • When the decoded point cloud data is presented using a presentation device such as a display device, the three-dimensional data decoding device may output the decoded point cloud data as is to the presentation device without inverse transformation.
  • Although the above describes examples in which format information or conversion information is stored in additional information (various parameter sets, SEI, inventory, etc.) included in a bitstream, the method of storing format information or conversion information is not limited to this.
  • the format information or transform information may not be stored in the bitstream, but sent to the 3D data decoding device in some other form.
  • For example, the three-dimensional data encoding device may store the conversion information in data or a file in a data format defined separately by another standard, separate from the bitstream, and may send this data or file together with the bitstream to the three-dimensional data decoding device. In that case, the transmission means used for the bitstream and for the data containing the conversion information may be the same or different.
  • the 3D data encoding device does not have to send the transform information to the 3D data decoding device.
  • the 3D data encoding device may send an identifier instead of transform information to the 3D data decoding device.
  • the three-dimensional data encoding device performs the processing shown in FIG.
  • The 3D data encoding device transforms the attribute information of the 3D points of at least one of the frames constituting a sequence (S13921), and encodes the transformed attribute information to generate a bitstream (S13922).
  • The bitstream further includes at least one first parameter of the transform provided for the sequence (for example, offset@sps and scale@sps in FIG. 75) and at least one second parameter of the transform provided for each of the at least one frame (for example, offset@SEI1, scale@SEI1, offset@SEI2, and scale@SEI2 in FIG. 75). That is, the 3D data encoding device stores the first parameter and the second parameter in the bitstream.
  • the bitstream includes first control information (for example, SPS) in sequence units and second control information (frame-level SEI) in frame units.
  • the first control information includes at least one first parameter and the second control information includes at least one second parameter. That is, the 3D data encoding device stores the first parameter in the first control information and stores the second parameter in the second control information.
  • the 3D data encoding device can, for example, selectively control the switching of transformation parameters both in units of sequences and in units of frames. As a result, appropriate conversion processing can be performed, and coding efficiency can be improved.
  • the first control information including at least one first parameter and the second control information including at least one second parameter may be set for each property (component).
  • the 3D data encoding device stores the second parameter, which is the parameter used for conversion processing of the attribute information of the frame to be processed, in the bitstream.
  • the three-dimensional data encoding device may perform transformation on a part of a plurality of frames and may not perform transformation on other frames.
  • the three-dimensional data encoding device can reduce the amount of code by performing the conversion on the frame for which the amount of code would be large if the conversion is not performed.
  • the three-dimensional data encoding device can reduce the amount of processing and improve the processing speed by not performing conversion on frames that are less effective in reducing the amount of code.
  • the 3D data encoding device can improve the processing speed while improving the compression efficiency.
  • the 3D data encoding device stores the first parameter, which is the parameter used in the conversion process of the attribute information of the frame to be processed, in the bitstream.
  • the transform involves multiplying or dividing the attribute information by a first value (eg, scale) and/or adding or subtracting a second value (eg, offset).
  • the bitstream includes multiple types of information of 3D points including attribute information, and the bitstream further includes first information (e.g., is_property) indicating whether each of the multiple types of information is to be compressed.
  • the 3D data encoding device stores information of a plurality of types of 3D points and the first information in the bitstream.
  • multiple types of information include multiple types of attribute information of three-dimensional points (for example, multiple types of color information, normal vectors, etc.).
  • the multiple types of information include positional information of three-dimensional points.
  • the three-dimensional data encoding device may determine whether to convert the attribute information according to the type of the attribute information.
  • the three-dimensional data encoding device may determine whether to convert the attribute information according to the data size of the attribute information.
  • the multiple types of attribute information include first attribute information and second attribute information, and the 3D data encoding device converts the first attribute information and does not convert the second attribute information.
  • the three-dimensional data encoding device can adaptively switch whether or not to convert the attribute information according to the type of the attribute information.
  • the three-dimensional data encoding device determines whether to compress the attribute information according to the type of the attribute information.
  • the three-dimensional data encoding device may determine whether to compress the attribute information according to the data size of the attribute information.
  • the multiple types of attribute information include first attribute information and second attribute information, and the 3D data encoding device compresses the first attribute information and does not compress the second attribute information.
  • the three-dimensional data encoding device can adaptively switch between compressing and not compressing the attribute information according to the type of the attribute information.
  • the bitstream includes first control information (eg, SPS) for the sequence, the first control information includes information indicating a list of multiple types of information, and the first information specifies the first control information.
  • the bitstream further includes second information (for example, format_id) indicating the format type of the point cloud data including the attribute information. That is, the 3D data encoding device stores the second information in the bitstream.
  • a three-dimensional data encoding device includes a processor and memory, and the processor uses the memory to perform the above processing.
  • the three-dimensional data decoding device performs the processing shown in FIG.
  • the 3D data decoding device decodes the bitstream to generate decoded attribute information (S13931), and inversely transforms the decoded attribute information to generate attribute information of the three-dimensional points of at least one of the plurality of frames constituting the sequence (S13932).
  • the bitstream further includes at least one first parameter of the inverse transform provided for the sequence (e.g., offset, scale@sps in FIG. 75) and at least one second parameter of the inverse transform provided for each of the at least one frame (e.g., offset, scale@SEI1, offset, scale@SEI2 in FIG. 75). That is, the 3D data decoding device acquires the first parameter and the second parameter from the bitstream. For example, the 3D data decoding device performs the inverse transform using the at least one first parameter or the at least one second parameter.
  • the bitstream includes first control information (for example, SPS) in sequence units and second control information (frame-level SEI) in frame units.
  • the first control information includes at least one first parameter and the second control information includes at least one second parameter. That is, the three-dimensional data decoding device acquires the first parameter from the first control information and acquires the second parameter from the second control information.
  • the 3D data decoding device can decode the attribute information from the bitstream with improved coding efficiency.
  • the 3D data decoding device uses the second parameter to inverse transform the decoding attribute information of the frame to be processed.
  • the 3D data decoding device uses the first parameter to inverse transform the decoding attribute information of the frame to be processed.
  • the inverse transform includes at least one of multiplication or division by a first value (e.g., scale) and addition or subtraction of a second value (e.g., offset), and each of the at least one first parameter and the at least one second parameter indicates at least one of the first value or the second value.
  • the bitstream includes multiple types of information of 3D points including attribute information, and the bitstream further includes first information (e.g., is_property) indicating whether each of the multiple types of information is to be compressed.
  • the 3D data decoding device uses the first information to decode or acquire at least one of a plurality of types of 3D point information from the bitstream.
  • multiple types of information include multiple types of attribute information of three-dimensional points (for example, multiple types of color information, normal vectors, etc.).
  • the multiple types of information include positional information of three-dimensional points.
  • the bitstream includes first control information (eg, SPS) for the sequence, the first control information includes information indicating a list of multiple types of information, and the first information specifies the first control information.
  • the bitstream further includes second information (for example, format_id) indicating the format type of the point cloud data including the attribute information.
  • a three-dimensional data decoding device includes a processor and memory, and the processor uses the memory to perform the above processing.
  • each processing unit included in the three-dimensional data encoding device, the three-dimensional data decoding device, etc. according to the above embodiments is typically realized as an LSI, which is an integrated circuit. These may be made into one chip individually, or may be made into one chip so as to include part or all of them.
  • circuit integration is not limited to LSIs, and may be realized with dedicated circuits or general-purpose processors.
  • an FPGA (Field Programmable Gate Array), or a reconfigurable processor that can reconfigure the connections and settings of the circuit cells inside the LSI, may be used.
  • each component may be configured with dedicated hardware or implemented by executing a software program suitable for each component.
  • Each component may be realized by reading and executing a software program recorded in a recording medium such as a hard disk or a semiconductor memory by a program execution unit such as a CPU or processor.
  • the present disclosure may be implemented as a three-dimensional data encoding method, a three-dimensional data decoding method, or the like executed by a three-dimensional data encoding device, a three-dimensional data decoding device, or the like.
  • the division of functional blocks in the block diagram is an example, and a plurality of functional blocks can be realized as one functional block, one functional block can be divided into a plurality of functional blocks, and some functions can be moved to other functional blocks.
  • single hardware or software may process the functions of a plurality of functional blocks having similar functions in parallel or in a time division manner.
  • the order in which each step in the flowchart is executed is for illustrative purposes in order to specifically describe the present disclosure, and orders other than the above may be used. Also, some of the above steps may be executed concurrently (in parallel) with other steps.
  • the present disclosure can be applied to a 3D data encoding device and a 3D data decoding device.
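
The scale/offset transform and its sequence-level and frame-level parameters described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the `Bitstream` and `TransformParams` containers, the function names, the choice of `value * scale + offset` as the forward direction, and the rule that a frame-level (SEI) parameter takes precedence over the sequence-level (SPS) parameter are all assumptions made for illustration, and entropy coding of the attribute values is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TransformParams:
    """Scale/offset pair, named after the 'scale' and 'offset' examples in the text."""
    scale: float = 1.0
    offset: float = 0.0


@dataclass
class Bitstream:
    """Hypothetical container; it does not reflect the actual bitstream syntax."""
    sps_params: TransformParams  # first parameter (sequence-level control information)
    frame_sei: Dict[int, TransformParams] = field(default_factory=dict)  # second parameters
    frames: List[List[float]] = field(default_factory=list)  # stand-in for coded attributes


def params_for_frame(bs: Bitstream, idx: int) -> TransformParams:
    # Assumption: a frame-level parameter, when present, overrides the sequence-level one.
    return bs.frame_sei.get(idx, bs.sps_params)


def forward(values: List[float], p: TransformParams) -> List[float]:
    # Assumed forward direction: multiply by scale, then add offset.
    return [v * p.scale + p.offset for v in values]


def inverse(values: List[float], p: TransformParams) -> List[float]:
    # Exact inverse of forward(): subtract offset, then divide by scale.
    return [(v - p.offset) / p.scale for v in values]


def encode(frames: List[List[float]], sps_params: TransformParams,
           frame_params: Dict[int, TransformParams]) -> Bitstream:
    # Store both parameter levels, then transform each frame's attribute values.
    bs = Bitstream(sps_params=sps_params, frame_sei=dict(frame_params))
    for idx, attrs in enumerate(frames):
        bs.frames.append(forward(attrs, params_for_frame(bs, idx)))
    return bs


def decode(bs: Bitstream) -> List[List[float]]:
    # Read the applicable parameters per frame and apply the inverse transform.
    return [inverse(coded, params_for_frame(bs, idx))
            for idx, coded in enumerate(bs.frames)]


if __name__ == "__main__":
    original = [[1.0, 2.5], [4.0]]
    stream = encode(original, TransformParams(2.0, 10.0), {1: TransformParams(4.0, -1.0)})
    print(stream.frames)
    print(decode(stream))
```

In this sketch frame 0 falls back to the sequence-level parameters, while frame 1 uses its own frame-level parameters, which mirrors the selective switching between sequence units and frame units described above.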

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
PCT/JP2022/009732 2021-03-09 2022-03-07 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置 Ceased WO2022191132A1 (ja)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2023505546A JP7785744B2 (ja) 2021-03-09 2022-03-07 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
CN202280019188.7A CN116917943A (zh) 2021-03-09 2022-03-07 三维数据编码方法、三维数据解码方法、三维数据编码装置、以及三维数据解码装置
US18/242,729 US12432382B2 (en) 2021-03-09 2023-09-06 Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US19/320,638 US20260006246A1 (en) 2021-03-09 2025-09-05 Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
JP2025227277A JP2026031682A (ja) 2021-03-09 2025-12-03 三次元データ変換方法、三次元データ逆変換方法、三次元データ変換装置及び三次元データ逆変換装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163158607P 2021-03-09 2021-03-09
US63/158,607 2021-03-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/242,729 Continuation US12432382B2 (en) 2021-03-09 2023-09-06 Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device

Publications (1)

Publication Number Publication Date
WO2022191132A1 true WO2022191132A1 (ja) 2022-09-15

Family

ID=83226703

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/009732 Ceased WO2022191132A1 (ja) 2021-03-09 2022-03-07 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置

Country Status (4)

Country Link
US (2) US12432382B2
JP (2) JP7785744B2
CN (1) CN116917943A
WO (1) WO2022191132A1

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024244900A1 (zh) * 2023-05-31 2024-12-05 腾讯科技(深圳)有限公司 点云处理方法、装置及计算机设备、存储介质
WO2025187250A1 (ja) * 2024-03-06 2025-09-12 ソニーセミコンダクタソリューションズ株式会社 符号化方法、復号化方法及び情報処理システム
WO2026018778A1 (ja) * 2024-07-19 2026-01-22 ソニーセミコンダクタソリューションズ株式会社 符号化方法および復号化方法、ならびに、情報処理システム

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021261499A1 (ja) * 2020-06-22 2021-12-30 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
WO2025217923A1 (zh) * 2024-04-19 2025-10-23 上海交通大学 编解码方法、码流、编解码器以及存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020101021A1 (ja) * 2018-11-16 2020-05-22 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
WO2020162495A1 (ja) * 2019-02-05 2020-08-13 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
WO2020189709A1 (ja) * 2019-03-18 2020-09-24 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
WO2021002444A1 (ja) * 2019-07-02 2021-01-07 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2013282645B2 (en) * 2012-06-26 2016-05-19 Nec Corporation Video encoding device, video decoding device, video encoding method, video decoding method, and program
CN104246831B (zh) 2012-07-30 2016-12-28 三菱电机株式会社 地图显示装置
US20170214943A1 (en) * 2016-01-22 2017-07-27 Mitsubishi Electric Research Laboratories, Inc. Point Cloud Compression using Prediction and Shape-Adaptive Transforms
US10223810B2 (en) * 2016-05-28 2019-03-05 Microsoft Technology Licensing, Llc Region-adaptive hierarchical transform and entropy coding for point cloud compression, and corresponding decompression
CN107403456B (zh) * 2017-07-28 2019-06-18 北京大学深圳研究生院 一种基于kd树和优化图变换的点云属性压缩方法
EP3937132B1 (en) * 2018-04-09 2025-05-28 BlackBerry Limited Methods and devices for binary entropy coding of point clouds
US10984541B2 (en) * 2018-04-12 2021-04-20 Samsung Electronics Co., Ltd. 3D point cloud compression systems for delivery and access of a subset of a compressed 3D point cloud
US10964102B2 (en) * 2018-07-09 2021-03-30 Sony Corporation Adaptive sub-band based coding of hierarchical transform coefficients of three-dimensional point cloud
US10979730B2 (en) * 2019-03-20 2021-04-13 Tencent America LLC Techniques and apparatus for interframe point cloud attribute coding
US11069023B2 (en) * 2019-05-24 2021-07-20 Nvidia Corporation Techniques for efficiently accessing memory and avoiding unnecessary computations
US12026922B2 (en) * 2020-06-26 2024-07-02 Qualcomm Incorporated Attribute parameter coding for geometry-based point cloud compression

Also Published As

Publication number Publication date
US12432382B2 (en) 2025-09-30
US20230421814A1 (en) 2023-12-28
JP7785744B2 (ja) 2025-12-15
US20260006246A1 (en) 2026-01-01
CN116917943A (zh) 2023-10-20
JP2026031682A (ja) 2026-02-24
JPWO2022191132A1 2022-09-15

Similar Documents

Publication Publication Date Title
JP7717924B2 (ja) データ符号化方法、及びデータ復号方法
JP7711155B2 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
JP7612798B2 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
JP7785744B2 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
JP7775378B2 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
JP7758839B2 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
JP7608649B2 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
WO2021141094A1 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
JP7747852B2 (ja) 三次元データ格納方法、三次元データ取得方法、三次元データ格納装置、及び三次元データ取得装置
WO2020032248A1 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置
EP4429250A1 (en) Point cloud data transmission device and method, and point cloud data reception device and method
WO2025205261A1 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置及び三次元データ復号装置
CN119013699A (zh) 解码方法、编码方法、解码装置以及编码装置
WO2025205259A1 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置及び三次元データ復号装置
WO2025205248A1 (ja) 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置及び三次元データ復号装置
WO2023105953A1 (ja) 三次元データ復号方法、三次元データ符号化方法、三次元データ復号装置、及び三次元データ符号化装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22767088

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023505546

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 202280019188.7

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22767088

Country of ref document: EP

Kind code of ref document: A1