WO2019235366A1 - Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device - Google Patents
Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
- Publication number: WO2019235366A1
- Application: PCT/JP2019/021636
- Authority: WIPO (PCT)
- Prior art keywords: encoding, data, information, dimensional, dimensional data
- Prior art date
Classifications
- H04N19/597: predictive coding specially adapted for multi-view video sequence encoding
- G06T17/20: finite element generation, e.g. wire-frame surface description, tessellation
- G06T3/40: scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T9/00: image coding
- G06T9/001: model-based coding, e.g. wire frame
- G06T9/004: predictors, e.g. intraframe, interframe coding
- G06T9/005: statistical coding, e.g. Huffman, run length coding
- H04N19/124: quantisation
- H04N19/13: adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
- H04N19/184: adaptive coding characterised by the coding unit, the unit being bits of the compressed video stream
- H04N19/70: syntax aspects related to video coding, e.g. related to compression standards
- H04N19/91: entropy coding, e.g. variable length coding [VLC] or arithmetic coding
- G06T9/40: tree coding, e.g. quadtree, octree
Definitions
- FIG. 55 is a diagram showing an example of the occupancy code of the parent node according to the eighth embodiment.
- FIG. 56 is a block diagram of the three-dimensional data encoding device according to the eighth embodiment.
- FIG. 57 is a block diagram of the three-dimensional data decoding device according to the eighth embodiment.
- FIG. 58 is a flowchart of the three-dimensional data encoding process according to the eighth embodiment.
- FIG. 59 is a flowchart of the three-dimensional data decoding process according to the eighth embodiment.
- FIG. 60 is a diagram showing an example of coding table switching according to the eighth embodiment.
- FIG. 61 is a diagram showing a reference relationship in the space area according to the first modification of the eighth embodiment.
- A different coding table may be used for each bit of the binary data.
- When the prediction residual is smaller than a threshold, the binary data is generated by binarizing the prediction residual with a fixed number of bits. When the prediction residual is greater than or equal to the threshold, the binary data is generated to include a first code of the fixed number of bits indicating the threshold, and a second code obtained by binarizing, with an exponential Golomb code, the value obtained by subtracting the threshold from the prediction residual.
- Different arithmetic coding methods may be used for the first code and the second code.
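The two-part binarization described above can be sketched as follows. This is a minimal illustration, not the patent's normative bit layout; the threshold and bit-width values in the usage are hypothetical.

```python
def exp_golomb(value: int) -> str:
    """Order-0 exponential-Golomb code for a non-negative integer."""
    v = value + 1
    bits = bin(v)[2:]                      # binary representation of value + 1
    return "0" * (len(bits) - 1) + bits    # prefix of zeros, then the bits

def binarize_residual(residual: int, threshold: int, fixed_bits: int) -> str:
    """Below the threshold: a fixed-length code. At or above it: a
    fixed-length first code indicating the threshold, followed by an
    exp-Golomb second code of the excess over the threshold."""
    if residual < threshold:
        return format(residual, f"0{fixed_bits}b")
    first = format(threshold, f"0{fixed_bits}b")   # first code: the threshold
    second = exp_golomb(residual - threshold)      # second code: the excess
    return first + second
```

The first (fixed-length) and second (exp-Golomb) parts could then be fed to different arithmetic coding methods, as the description allows.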
- the three-dimensional data decoding method can appropriately decode a bit stream with improved encoding efficiency.
- The three-dimensional space is divided into spaces (SPC), which correspond to pictures in video coding, and the three-dimensional data is encoded in units of spaces.
- A space is further divided into volumes (VLM), which correspond to macroblocks or the like in video encoding, and prediction and transform are performed in units of VLMs.
- A volume includes a plurality of voxels (VXL), the minimum units with which position coordinates are associated.
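The space > volume > voxel hierarchy can be sketched as a simple data structure. The field names here are illustrative, not taken from the specification:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Voxel:
    """VXL: minimum unit with which position coordinates are associated."""
    position: Tuple[int, int, int]
    color: Tuple[int, int, int] = (0, 0, 0)

@dataclass
class Volume:
    """VLM: unit of prediction and transform (analogous to a macroblock)."""
    voxels: List[Voxel] = field(default_factory=list)

@dataclass
class Space:
    """SPC: unit of encoding (analogous to a picture in video coding)."""
    volumes: List[Volume] = field(default_factory=list)
```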
- Prediction is similar to prediction in a two-dimensional image: other processing units are referred to in order to generate predicted three-dimensional data similar to the processing unit to be processed, and the difference between the predicted three-dimensional data and the processing unit to be processed is encoded.
- This prediction includes not only spatial prediction that refers to other prediction units at the same time but also temporal prediction that refers to prediction units at different times.
- the encoding device may encode the static object and the dynamic object independently of each other and superimpose the dynamic object on the world composed of the static objects.
- the dynamic object is composed of one or more SPCs, and each SPC is associated with one or more SPCs constituting a static object on which the SPC is superimposed.
- a dynamic object may be represented by one or more VLMs or VXLs instead of SPCs.
- The GOS size is fixed, and the encoding device stores the size as meta information. The GOS size may also be switched depending on, for example, whether the area is urban, or whether it is indoor or outdoor. That is, the GOS size may be switched according to the amount or nature of the objects that are valuable as information. Alternatively, the encoding device may adaptively switch the GOS size or the I-SPC interval in the GOS according to the object density or the like within the same world. For example, the higher the object density, the smaller the GOS size and the shorter the I-SPC interval in the GOS.
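The density-adaptive rule above can be sketched as follows. The density thresholds, base size, and base I-SPC interval are hypothetical values for illustration, not taken from the specification:

```python
def choose_gos_params(object_density: float,
                      base_size: int = 64,
                      base_i_spc_interval: int = 8) -> tuple:
    """Return a (GOS size, I-SPC interval) pair: the higher the object
    density, the smaller the GOS and the shorter the I-SPC interval."""
    if object_density > 0.5:        # dense, e.g. an urban area
        return base_size // 4, max(1, base_i_spc_interval // 4)
    if object_density > 0.1:        # moderate density
        return base_size // 2, max(1, base_i_spc_interval // 2)
    return base_size, base_i_spc_interval   # sparse: keep defaults
```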
- This meta information is generated by the three-dimensional data encoding apparatus 100 and is included in the encoded three-dimensional data 112 (211).
- The encoding device or the decoding device may encode or decode the GOSs or SPCs included in a space specified based on GPS, route information, zoom magnification, or the like.
- the decoding device may perform decoding in order from a space close to the self-position or the travel route.
- the encoding device or the decoding device may encode or decode a space far from its own position or travel route with a lower priority than a close space.
- Lowering the priority means lowering the processing order, lowering the resolution (processing with thinning out), or lowering the image quality (for example, increasing encoding efficiency by increasing the quantization step).
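A sketch of the distance-based prioritization described above. The distance bands and the parameter values are hypothetical; a real implementation would tune them per deployment:

```python
def decode_priority(distance: float, near: float = 50.0, far: float = 200.0) -> dict:
    """Map distance from the self-position or travel route to decoding
    parameters: farther spaces get a later processing order, coarser
    thinning (lower resolution), and a larger quantization step."""
    if distance <= near:
        return {"order": 0, "thinning": 1, "quant_step": 1}
    if distance <= far:
        return {"order": 1, "thinning": 2, "quant_step": 2}
    return {"order": 2, "thinning": 4, "quant_step": 4}
```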
- The encoding device may individually encode the point clouds representing indoor and outdoor space shapes. For example, by separating a GOS representing the indoor space (indoor GOS) from a GOS representing the outdoor space (outdoor GOS), the decoding device can select the GOS to be decoded according to the viewpoint position when using the encoded data.
- The encoding device may switch the GOS or SPC size between the indoor GOS and the outdoor GOS. For example, the encoding device sets the GOS size smaller indoors than outdoors. The encoding device may also change, between the indoor GOS and the outdoor GOS, the accuracy of extracting feature points from the point cloud, the accuracy of object detection, and so on.
- The encoding device or the decoding device may determine whether to encode or decode dynamic objects and static objects as different SPCs or GOSs according to the appearance frequency of dynamic objects or the ratio of static objects to dynamic objects. For example, when the appearance frequency or ratio of dynamic objects exceeds a threshold, SPCs or GOSs in which dynamic objects and static objects are mixed are allowed; when it does not exceed the threshold, such mixed SPCs or GOSs are not allowed.
- GOS has a layer structure in at least one direction in the world, and the encoding device and the decoding device perform encoding or decoding from a lower layer.
- a randomly accessible GOS belongs to the lowest layer.
- GOS belonging to an upper layer refers to GOS belonging to the same layer or lower. That is, GOS includes a plurality of layers that are spatially divided in a predetermined direction and each include one or more SPCs.
- the encoding device and the decoding device encode or decode each SPC with reference to the SPC included in the same layer as the SPC or a layer lower than the SPC.
- The SHOT feature is obtained by dividing the periphery of a VXL into regions, computing the inner product between the normal vector at the reference point and the normal vector of each divided region, and forming a histogram.
- This SHOT feature has the characteristics of high dimensionality and high feature expressiveness.
- SIFT: Scale-Invariant Feature Transform
- SURF: Speeded Up Robust Features
- HOG: Histogram of Oriented Gradients
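The SHOT-style construction described above (inner products of normals binned into a histogram) can be sketched roughly as follows. This is a deliberately simplified illustration; a real SHOT descriptor partitions the neighborhood into a 3D grid and is considerably more elaborate:

```python
def shot_like_histogram(ref_normal, region_normals, bins=8):
    """For each divided region around a voxel, take the inner product
    (cosine) between the reference normal and the region's normal, then
    histogram the values over [-1, 1]."""
    hist = [0] * bins
    for n in region_normals:
        dot = sum(a * b for a, b in zip(ref_normal, n))
        dot = max(-1.0, min(1.0, dot))             # clamp numerical noise
        # map [-1, 1] to a bin index in [0, bins - 1]
        idx = min(bins - 1, int((dot + 1.0) / 2.0 * bins))
        hist[idx] += 1
    return hist
```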
- FIG. 16 is a block diagram of 3D data encoding apparatus 400 according to the present embodiment.
- FIG. 17 is a flowchart of three-dimensional data encoding processing by the three-dimensional data encoding device 400.
- the decoding method used when the WLD decoding unit 503 decodes WLD may be different from the decoding method used when the SWLD decoding unit 504 decodes SWLD.
- For example, in the decoding method used for SWLD, inter prediction may be prioritized over intra prediction, as compared with the decoding method used for WLD.
- Update information indicating changes in people, construction work, or street trees (for trucks) is uploaded to the server as point clouds or metadata. Based on the upload, the server updates the WLD, and then updates the SWLD using the updated WLD.
- Information for distinguishing WLD from SWLD is added as header information of the encoded stream. For example, when there are multiple types of worlds, such as a mesh world or a lane world, information distinguishing them may be added to the header information. In addition, when there are many SWLDs with different feature amounts, information distinguishing each of them may be added to the header information.
- the three-dimensional data encoding device 400 generates encoded three-dimensional data 414 in which data having a feature amount equal to or greater than a threshold value is encoded. Thereby, the amount of data can be reduced as compared with the case where the input three-dimensional data 411 is encoded as it is. Therefore, the three-dimensional data encoding device 400 can reduce the amount of data to be transmitted.
- the three-dimensional data encoding device 400 further generates encoded three-dimensional data 413 (second encoded three-dimensional data) by encoding the input three-dimensional data 411.
- the three-dimensional data encoding device 400 can selectively transmit the encoded three-dimensional data 413 and the encoded three-dimensional data 414, for example, according to the usage.
- the three-dimensional data encoding device 400 can use a more suitable three-dimensional position representation method for three-dimensional data having different numbers of data (number of VXL or FVXL).
- the three-dimensional data encoding apparatus 400 further transmits one of the encoded three-dimensional data 413 and 414 to the client in response to a request from the client.
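The feature-thresholded extraction described above (producing the smaller data set from the full input) can be sketched as follows. `feature_of` is a hypothetical per-point feature function (e.g. a SHOT-like descriptor score), not an identifier from the specification:

```python
def extract_swld(points, feature_of, threshold):
    """Keep only the points whose feature amount is at or above the
    threshold, yielding the sparse extracted data from the full input."""
    return [p for p in points if feature_of(p) >= threshold]

# The encoder can then produce both streams, one from the full input data
# and one from the extracted data, and transmit either according to usage
# or a client request.
```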
- The extraction unit 643 processes the fifth three-dimensional data 652 into the sixth three-dimensional data 654 by extracting, from the fifth three-dimensional data 652, the three-dimensional data of the required range indicated by the required range information 633.
- the encoding unit 644 generates encoded three-dimensional data 634 that is an encoded stream by encoding the sixth three-dimensional data 654. Then, the transmission unit 645 transmits the encoded three-dimensional data 634 to the host vehicle.
- Each vehicle may have the functions of both the three-dimensional data generation device 620 and the three-dimensional data transmission device 640.
- FIG. 26 is a block diagram illustrating a configuration example of the three-dimensional information processing apparatus 700 according to the present embodiment.
- The three-dimensional information processing apparatus 700 is mounted on a moving body such as an automobile, for example. As shown in FIG. 26, the three-dimensional information processing apparatus 700 includes a three-dimensional map acquisition unit 701, a vehicle detection data acquisition unit 702, an abnormal case determination unit 703, a coping operation determination unit 704, and an operation control unit 705.
- the 3D map acquisition unit 701 acquires a 3D map 711 in the vicinity of the travel route.
- the three-dimensional map acquisition unit 701 acquires the three-dimensional map 711 through a mobile communication network, vehicle-to-vehicle communication, or road-to-vehicle communication.
- The three-dimensional information processing apparatus 700 generates second three-dimensional position information (own-vehicle detection three-dimensional data 712) from information detected by the sensor. Next, the three-dimensional information processing apparatus 700 performs abnormality determination processing on the first three-dimensional position information or the second three-dimensional position information, thereby determining whether the first three-dimensional position information or the second three-dimensional position information is abnormal.
- the communication unit 812 communicates with the traffic monitoring cloud or the preceding vehicle, and transmits a data transmission request or the like to the traffic monitoring cloud or the preceding vehicle.
- the format conversion unit 814 generates three-dimensional data 832 by performing format conversion or the like on the three-dimensional data 831 received by the data reception unit 811. Further, the format conversion unit 814 performs expansion or decoding processing when the three-dimensional data 831 is compressed or encoded.
- The sensors 815 are a sensor group that acquires information outside the vehicle, such as LiDAR, a visible light camera, or an infrared camera, and generates sensor information 833.
- The sensor information 833 is three-dimensional data such as a point cloud (point cloud data) when the sensor 815 is a laser sensor such as LiDAR. Note that there need not be a plurality of sensors 815.
- The three-dimensional image processing unit 1017 performs self-position estimation processing for the host vehicle using the received three-dimensional map 1032, such as a point cloud, and the three-dimensional data 1034 around the host vehicle generated from the sensor information 1033. Note that the three-dimensional image processing unit 1017 may generate three-dimensional data 1035 around the host vehicle by combining the three-dimensional map 1032 and the three-dimensional data 1034, and perform the self-position estimation processing using the generated three-dimensional data 1035.
- the format conversion unit 1019 generates sensor information 1037 by converting the sensor information 1033 into a format supported by the receiving side.
- the format conversion unit 1019 may reduce the data amount by compressing or encoding the sensor information 1037. Further, the format conversion unit 1019 may omit the process when the format conversion is not necessary. Further, the format conversion unit 1019 may control the amount of data to be transmitted according to the designation of the transmission range.
- the server 901 includes a data reception unit 1111, a communication unit 1112, a reception control unit 1113, a format conversion unit 1114, a 3D data creation unit 1116, a 3D data synthesis unit 1117, and a 3D data storage unit 1118.
- The server 901 creates three-dimensional data 1134 near the position of the client device 902 using the sensor information 1037 received from the client device 902. Next, the server 901 calculates the difference between the three-dimensional data 1134 and the three-dimensional map 1135 by matching the created three-dimensional data 1134 with the three-dimensional map 1135 of the same area managed by the server 901. If the difference is greater than or equal to a predetermined threshold, the server 901 determines that some abnormality has occurred around the client device 902. For example, when land subsidence occurs due to a natural disaster such as an earthquake, a large difference may occur between the three-dimensional map 1135 managed by the server 901 and the three-dimensional data 1134 created based on the sensor information 1037.
- The server 901 may determine the sensor specification information from the vehicle type. In this case, the server 901 may acquire the vehicle type information in advance, or the sensor information may include it. The server 901 may switch the degree of correction applied to the three-dimensional data 1134 created using the sensor information 1037 according to the acquired sensor information 1037. For example, when the sensor performance is high accuracy (class 1), the server 901 does not correct the three-dimensional data 1134. When the sensor performance is low accuracy (class 3), the server 901 applies a correction that accords with the accuracy of the sensor to the three-dimensional data 1134. For example, the lower the accuracy of the sensor, the more the server 901 increases the degree (strength) of the correction.
- the server 901 may simultaneously send sensor information transmission requests to a plurality of client devices 902 in a certain space.
- When the server 901 receives a plurality of pieces of sensor information from a plurality of client devices 902, it need not use all of the sensor information to create the three-dimensional data 1134.
- For example, the sensor information to be used may be selected according to the sensor performance.
- For example, the server 901 may select highly accurate sensor information (class 1) from the received pieces of sensor information and create the three-dimensional data 1134 using the selected sensor information.
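The class-based selection and correction policy described above can be sketched as follows. The class-to-strength mapping and the report structure are hypothetical, for illustration only:

```python
def select_and_weight(sensor_reports, preferred_class=1):
    """Prefer class-1 (high accuracy) sensor information when available;
    otherwise fall back to all reports. Each chosen report is paired with
    a correction strength that grows as sensor accuracy drops."""
    strength = {1: 0.0, 2: 0.5, 3: 1.0}   # class 1 needs no correction
    chosen = [r for r in sensor_reports if r["class"] == preferred_class]
    if not chosen:
        chosen = sensor_reports           # no high-accuracy data: use all
    return [(r, strength.get(r["class"], 1.0)) for r in chosen]
```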
- FIG. 36 is a block diagram illustrating the functional configuration of the server 901 and the client device 902.
- the server 901 includes, for example, a three-dimensional map compression / decoding processing unit 1201 that compresses and decodes a three-dimensional map, and a sensor information compression / decoding processing unit 1202 that compresses and decodes sensor information.
- The client device 902 transmits the sensor information 1033 to the server 901 or the like. This may reduce the amount of transmitted data compared with transmitting three-dimensional data. Further, since the client device 902 does not need to perform processing such as compressing or encoding three-dimensional data, its processing amount can be reduced. Therefore, the client device 902 can reduce the amount of data transmitted or simplify the configuration of the device.
- the sensor information 1033 includes at least one of information obtained by a laser sensor, a luminance image, an infrared image, a depth image, sensor position information, and sensor speed information.
- The server 901 can communicate with the client device 902 mounted on the moving body, and receives, from the client device 902, sensor information 1037 indicating the surrounding state of the moving body obtained by the sensor 1015 mounted on the moving body. The server 901 creates three-dimensional data 1134 around the moving body from the received sensor information 1037.
- the server 901 further corrects the three-dimensional data according to the performance of the sensor. According to this, the three-dimensional data creation method can improve the quality of the three-dimensional data.
- the server 901 decodes or expands the received sensor information 1037, and creates three-dimensional data 1134 from the sensor information 1132 after decoding or expansion. According to this, the server 901 can reduce the amount of data to be transmitted.
- The conversion unit 1303 matches the scan order of the prediction residuals with the scan order (breadth-first, depth-first, or the like) of the octree in the volume. This eliminates the need to add information indicating the scan order of the prediction residuals to the bitstream, reducing overhead.
- the conversion unit 1303 may apply a scan order different from the scan order of the octree.
- the 3D data encoding apparatus 1300 adds information indicating the scan order of prediction residuals to the bitstream.
- the three-dimensional data encoding device 1300 can efficiently encode the prediction residual.
- the conversion unit 1303 may convert not only the prediction residual of color information but also other attribute information possessed by the voxel.
- the conversion unit 1303 may convert and encode information such as reflectivity obtained when a point cloud is acquired using LiDAR or the like.
- The inverse quantization unit 1305 generates inverse-quantized coefficients of the prediction residual by inversely quantizing, using the quantization control parameter, the quantized coefficients generated by the quantization unit 1304, and outputs the generated inverse-quantized coefficients to the inverse transform unit 1306.
- FIG. 44 is a diagram for explaining the operation of the intra prediction unit 1309.
- the volume idx is identifier information added to the volume in the space, and a different value is assigned to each volume.
- the order of volume idx allocation may be the same as the encoding order, or may be an order different from the encoding order.
- a prediction residual is generated by subtracting the predicted value of the color information from the color information of each voxel included in the encoding target volume.
- the processing after the conversion unit 1303 is performed on this prediction residual.
- the 3D data encoding apparatus 1300 adds the adjacent volume information and the prediction mode information to the bitstream.
- the adjacent volume information is information indicating the adjacent volume used for prediction, and indicates, for example, the volume idx of the adjacent volume used for prediction.
- the prediction mode information indicates a mode used for generating a predicted volume.
- The mode is, for example, an average value mode that generates a predicted value from the average value of the voxels in the adjacent volume, or a median value mode that generates a predicted value from the median value of the voxels in the adjacent volume.
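The two intra prediction modes mentioned above, and the residual that is passed on to the transform stage, can be sketched as follows. Function names here are illustrative, not taken from the specification:

```python
def predict_voxel_value(neighbor_values, mode="average"):
    """Predict a voxel value from the voxel values of an adjacent volume,
    using either the average value mode or the median value mode."""
    values = sorted(neighbor_values)
    if mode == "average":
        return sum(values) / len(values)
    if mode == "median":
        mid = len(values) // 2
        if len(values) % 2:
            return values[mid]
        return (values[mid - 1] + values[mid]) / 2
    raise ValueError(f"unknown prediction mode: {mode}")

def prediction_residuals(target_values, predicted):
    # the residual (target minus prediction) is what is transformed,
    # quantized, and entropy-coded downstream
    return [v - predicted for v in target_values]
```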
- FIG. 45 is a diagram schematically illustrating the inter prediction process according to the present embodiment.
- the inter prediction unit 1311 encodes (inter prediction) a space (SPC) at a certain time T_Cur by using encoded spaces at different times T_LX.
- the inter prediction unit 1311 performs encoding processing by applying rotation and translation processing to encoded spaces at different times T_LX.
- The 3D data encoding apparatus 1300 determines whether to apply rotation and translation for each reference space. At this time, the 3D data encoding apparatus 1300 may add information (an RT application flag or the like) indicating whether rotation and translation are applied for each reference space to the header information of the bitstream. For example, the three-dimensional data encoding apparatus 1300 calculates RT information and an ICP error value using the ICP (Iterative Closest Point) algorithm for each reference space referred to from the encoding target space. If the ICP error value is equal to or less than a predetermined value, the 3D data encoding apparatus 1300 determines that rotation and translation are unnecessary and sets the RT application flag to OFF. On the other hand, when the ICP error value is larger than the predetermined value, the 3D data encoding apparatus 1300 sets the RT application flag to ON and adds the RT information to the bitstream.
- R_l1[i] and T_l1[i] are the RT information of the reference space i in the reference list L1.
- R_l1[i] is the rotation information of the reference space i in the reference list L1.
- the rotation information indicates the content of the applied rotation process, and is, for example, a rotation matrix or a quaternion.
- T_l1[i] is the translation information of the reference space i in the reference list L1.
- the translation information indicates the content of the applied translation process, and is, for example, a translation vector.
- the inter prediction unit 1311 applies the rotation and translation processing to the reference space to bring the overall positional relationship between the encoding target space and the reference space closer, and then generates the predicted volume using the reference space information. This improves the accuracy of the predicted volume, suppresses the prediction residual, and thus reduces the code amount.
- the inter prediction unit 1311 may obtain the RT information by performing ICP using at least one of an encoding target space and a reference space in which the number of voxels or point clouds has been thinned out.
- the entropy encoding unit 1313 generates an encoded signal (encoded bitstream) by variable-length encoding the quantization coefficient that is an input from the quantization unit 1304. Specifically, the entropy encoding unit 1313 binarizes the quantization coefficient, for example, and arithmetically encodes the obtained binary signal.
- the inverse quantization unit 1402 generates an inverse quantization coefficient by inversely quantizing the quantization coefficient input from the entropy decoding unit 1401 using a quantization parameter added to a bit stream or the like.
- the prediction control unit 1409 controls whether the decoding target volume is decoded by intra prediction or by inter prediction. For example, the prediction control unit 1409 selects intra prediction or inter prediction according to information indicating a prediction mode to be used, which is added to the bitstream. Note that the prediction control unit 1409 may always select intra prediction when it is determined in advance that the space to be decoded is to be decoded in the intra space.
- the 3D data encoding apparatus 1300 generates RT information in units of encoding volumes, and adds the generated RT information to a bitstream header or the like. Furthermore, the above may be combined. That is, the 3D data encoding apparatus 1300 may apply rotation and translation in large units, and then apply rotation and translation in fine units. For example, the 3D data encoding apparatus 1300 may apply rotation and translation in units of space, and may apply different rotation and translation to each of a plurality of volumes included in the obtained space.
- the three-dimensional data encoding device 1300 may apply first rotation and translation processing in a first unit (for example, a space) to the position information of the three-dimensional points included in the reference three-dimensional data, and then generate the predicted position information by applying second rotation and translation processing in a second unit (for example, a volume) finer than the first unit to the position information of the three-dimensional points obtained by the first processing.
- the three-dimensional data encoding apparatus may use a fixed alternative value such as 1 (occupied) or 0 (unoccupied).
- the occupancy code of the target node shown in FIG. 66 represents, for example, whether each node within the target node that is adjacent to the target node 2 is occupied. The 3D data encoding apparatus can therefore switch the encoding table for the occupancy code of the target node 2 in accordance with the finer shape of the target node, so that the encoding efficiency can be improved.
- the three-dimensional data encoding apparatus may calculate an encoding table for entropy encoding the occupancy code of the target node 2 using, for example, the following equation.
- the three-dimensional data encoding apparatus permits reference to the information (for example, occupancy information) of a child node of the first node among the plurality of adjacent nodes, as shown in FIGS. 66 and 67.
- the 3D data encoding apparatus can refer to an appropriate adjacent node according to the spatial position in the parent node of the target node.
- a three-dimensional data encoding device includes a processor and a memory, and the processor performs the above processing using the memory.
- the 3D data decoding apparatus permits reference to the information (for example, the occupancy code) of the parent node, and prohibits reference to the information (for example, the occupancy codes) of other nodes in the same layer as the parent node (parent adjacent nodes).
- the 3D data decoding apparatus can improve the encoding efficiency by referring to the information of the first node having the same parent node as the target node among the plurality of adjacent nodes spatially adjacent to the target node.
- the 3D data decoding apparatus can reduce the processing amount by not referring to the information of the second node having a different parent node from the target node among the plurality of adjacent nodes. As described above, the three-dimensional data decoding apparatus can improve the encoding efficiency and reduce the processing amount.
- the 3D data decoding apparatus can appropriately perform the decoding process using the prohibition switching information.
- the three-dimensional data decoding apparatus, in the decoding, selects a coding table based on whether or not a three-dimensional point exists in the first node, and decodes the information (for example, the occupancy code) of the target node using the selected coding table.
- the three-dimensional data decoding apparatus permits reference to information (for example, occupancy information) of a child node of the first node among a plurality of adjacent nodes as shown in FIGS. 66 and 67.
- since the three-dimensional data decoding apparatus can refer to more detailed information of the adjacent nodes, the encoding efficiency can be improved.
- the three-dimensional data decoding apparatus switches the adjacent node to be referred to from among a plurality of adjacent nodes in the decoding, according to the spatial position in the parent node of the target node.
- the information of the three-dimensional point group includes position information (geometry) and attribute information (attribute).
- the position information includes coordinates (x coordinate, y coordinate, z coordinate) based on a certain point.
- when encoding the position information, instead of directly encoding the coordinates of each three-dimensional point, a method is used that reduces the amount of information by representing the position of each three-dimensional point in an octree representation and encoding the octree information.
- the attribute information includes information indicating color information (RGB, YUV, etc.), reflectance, and normal vector of each three-dimensional point.
- the three-dimensional data encoding apparatus can encode the attribute information using an encoding method different from the position information.
- an attribute information encoding method will be described.
- in the following description, an integer value is used as the value of the attribute information.
- for example, when each color component of the color information RGB or YUV has 8-bit accuracy, each color component takes an integer value of 0 to 255.
- when the reflectance value has 10-bit accuracy, the reflectance value takes an integer value of 0 to 1023.
- when the value of the attribute information is not an integer, the three-dimensional data encoding device may multiply the value by a scale value and then round it so that the value of the attribute information becomes an integer value.
- the three-dimensional data encoding apparatus may add this scale value to the header of the bitstream or the like.
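A minimal sketch of the scale-and-round step just described; the function names are illustrative assumptions, and the scale value itself would travel in the bitstream header so the decoder can undo the scaling.

```python
# Encoder side: make a non-integer attribute value an integer by
# multiplying by a scale value and rounding; the scale value is then
# written to the bitstream header.
def attribute_to_integer(value, scale):
    return round(value * scale)

# Decoder side: recover an approximation of the original value using
# the scale value read from the header.
def attribute_from_integer(ivalue, scale):
    return ivalue / scale
```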
- the reference three-dimensional point is a three-dimensional point within a predetermined distance range from the target three-dimensional point.
- the three-dimensional data encoding apparatus calculates the Euclidean distance d(p, q) between the target three-dimensional point p and a three-dimensional point q, for example using (Expression A1).
- FIG. 68 is a diagram showing an example of a three-dimensional point.
- when the distance d(p, q) between the target three-dimensional point p and a three-dimensional point q is smaller than the threshold THd, the three-dimensional data encoding apparatus determines the three-dimensional point q as a reference three-dimensional point of the target three-dimensional point p, and determines that the value of the attribute information Aq of the point q is used to generate the predicted value Pp of the attribute information Ap of the target three-dimensional point p.
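The reference-point test described above reduces to a Euclidean distance check against the threshold THd; a minimal sketch (function names are illustrative):

```python
import math

def euclidean_distance(p, q):
    """(Expression A1): d(p, q) = sqrt of the sum of squared coordinate differences."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def select_reference_points(target, candidates, thd):
    """Return the candidate points whose distance to the target point is
    smaller than the threshold THd; these become reference 3D points."""
    return [q for q in candidates if euclidean_distance(target, q) < thd]
```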
- the three-dimensional data encoding device selects the initial point a0 and assigns it to LoD0.
- the three-dimensional data encoding device extracts a point a1 whose distance from the point a0 is larger than the threshold Thres_LoD[0] of LoD0 and assigns it to LoD0.
- the three-dimensional data encoding device extracts a point a2 whose distance from the point a1 is larger than the threshold Thres_LoD[0] of LoD0 and assigns it to LoD0.
- the three-dimensional data encoding apparatus configures LoD0 such that the distance between each point in LoD0 is greater than the threshold Thres_LoD[0].
- the three-dimensional data encoding device selects a point b0 that is not yet assigned LoD and assigns it to LoD1.
- the three-dimensional data encoding apparatus extracts a point b1 whose distance from the point b0 is larger than the threshold Thres_LoD[1] of LoD1 and to which no LoD is assigned, and assigns it to LoD1.
- the three-dimensional data encoding device extracts a point b2 whose distance from the point b1 is larger than the threshold Thres_LoD[1] of LoD1 and to which no LoD is assigned, and assigns it to LoD1.
- the three-dimensional data encoding apparatus configures LoD1 such that the distance between each point in LoD1 is greater than the threshold Thres_LoD[1].
- the 3D data encoding apparatus selects a point c0 to which LoD is not yet assigned and assigns it to LoD2.
- the three-dimensional data encoding device extracts a point c1 whose distance from the point c0 is larger than the threshold Thres_LoD[2] of LoD2 and to which no LoD is assigned, and assigns it to LoD2.
- the three-dimensional data encoding apparatus extracts a point c2 whose distance from the point c1 is larger than the threshold Thres_LoD[2] of LoD2 and to which no LoD is assigned, and assigns it to LoD2.
- the 3D data encoding apparatus may add information indicating the threshold of each LoD to the header of the bitstream or the like. For example, in the case of the example illustrated in FIG. 70, the 3D data encoding apparatus may add the thresholds Thres_LoD[0], Thres_LoD[1], and Thres_LoD[2] to the header.
- the upper layer (the layer closer to LoD0) becomes a sparse point group (sparse) in which the distances between the three-dimensional points are larger.
- the lower layer becomes a dense point group (dense) in which the distances between the three-dimensional points are smaller.
- LoD0 is the highest layer.
- the method of selecting an initial three-dimensional point when setting each LoD may depend on the coding order at the time of position information coding. For example, the three-dimensional data encoding device selects the three-dimensional point encoded first at the time of position information encoding as the initial point a0 of LoD0, and configures LoD0 by selecting the points a1 and a2 based on the initial point a0. Then, the 3D data encoding apparatus may select, as the initial point b0 of LoD1, the 3D point whose position information was encoded earliest among the 3D points that do not belong to LoD0.
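The LoD assignment steps above can be sketched as follows, under the reading that a point joins layer i only when it is farther than Thres_LoD[i] from every point already placed in that layer; leftover points fall through to the lowest layer, whose threshold is treated as 0 (consistent with the header estimate described later). Names are illustrative.

```python
import math

def dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign_lods(points, thresholds):
    """points: list of 3D tuples in encoding order; thresholds: Thres_LoD[i]
    for the upper layers. Returns a list of layers (lists of points)."""
    layers = [[] for _ in range(len(thresholds) + 1)]
    unassigned = list(points)
    for i, th in enumerate(thresholds):
        remaining = []
        for p in unassigned:
            if all(dist(p, q) > th for q in layers[i]):
                layers[i].append(p)        # sparse upper layer
            else:
                remaining.append(p)        # try again in a lower (denser) layer
        unassigned = remaining
    layers[-1].extend(unassigned)          # lowest layer: threshold treated as 0
    return layers
```

The first unassigned point in encoding order automatically becomes each layer's initial point, matching the selection rule described above.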
- the three-dimensional data encoding apparatus generates the predicted value of the attribute information of a three-dimensional point by calculating the average of the attribute values of N or fewer three-dimensional points among the encoded three-dimensional points around the target three-dimensional point to be encoded.
- the three-dimensional data encoding apparatus may add the value N to a bitstream header or the like. Note that the three-dimensional data encoding apparatus may change the value of N for each three-dimensional point and add the value of N for each three-dimensional point. Thereby, since appropriate N can be selected for every three-dimensional point, the precision of a predicted value can be improved. Therefore, the prediction residual can be reduced.
- the three-dimensional data encoding apparatus may add a value of N to the header of the bit stream and fix the value of N in the bit stream. This eliminates the need to encode or decode the value of N for each three-dimensional point, thereby reducing the amount of processing. Further, the three-dimensional data encoding device may encode the value of N separately for each LoD. Thereby, encoding efficiency can be improved by selecting appropriate N for every LoD.
- the three-dimensional data encoding device may calculate the predicted value of the attribute information of the three-dimensional point based on the weighted average value of the attribute information of the N encoded three-dimensional points around. For example, the three-dimensional data encoding apparatus calculates weights using distance information between the target three-dimensional point and the surrounding N three-dimensional points.
- the predicted value is calculated by a distance-dependent weighted average.
- the predicted value a2p of the point a2 is calculated by the weighted average of the attribute information of the points a0 and a1, as shown in (Expression A2) and (Expression A3).
- Ai is the value of the attribute information of the point ai.
- the predicted value b2p of the point b2 is calculated by the weighted average of the attribute information of the points a0, a1, a2, b0, b1, as shown in (Expression A4) to (Expression A6).
- Bi is the value of the attribute information of the point bi.
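A sketch of the distance-dependent weighted average used for prediction. The inverse-distance weight below is an assumption standing in for (Expression A2) to (Expression A6), which are not reproduced in this text; the function names are illustrative.

```python
import math

def dist(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def predict_attribute(target, neighbors):
    """neighbors: list of (position, attribute_value) for up to N encoded
    surrounding points. Returns the distance-weighted average predicted value,
    with weights proportional to 1/distance (normalized to sum to 1)."""
    weights = [1.0 / dist(target, pos) for pos, _ in neighbors]
    total = sum(weights)
    return sum(w * a for w, (_, a) in zip(weights, neighbors)) / total
```

With equidistant neighbors the weights are equal, so the prediction reduces to the plain average described for the N-point case.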
- the 3D data encoding apparatus calculates a difference value (prediction residual) between the value of the attribute information of the 3D point and the predicted value generated from the surrounding points, and quantizes the calculated prediction residual.
- the three-dimensional data encoding apparatus performs quantization by dividing the prediction residual by a quantization scale (also referred to as a quantization step).
- the three-dimensional data encoding device may change the quantization scale to be used for each LoD.
- for example, the three-dimensional data encoding apparatus decreases the quantization scale in the upper layers and increases it in the lower layers. Since the attribute information values of three-dimensional points belonging to an upper layer may be used as predicted values for the attribute information of three-dimensional points belonging to lower layers, decreasing the quantization scale of the upper layers suppresses the quantization error that can occur there, increases the accuracy of the predicted values, and thereby improves the coding efficiency.
- the three-dimensional data encoding apparatus may add a quantization scale used for each LoD to the header or the like. Thereby, since the three-dimensional data decoding apparatus can correctly decode the quantization scale, the bit stream can be appropriately decoded.
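The quantization step and its per-LoD scale can be sketched as follows; the scale values listed are illustrative assumptions (smaller for upper layers whose decoded values seed predictions lower down), not values from the patent.

```python
def quantize(residual, qscale):
    """Quantize a prediction residual by dividing by the quantization scale."""
    return round(residual / qscale)

def dequantize(qvalue, qscale):
    """Inverse quantization: multiply the quantized value back by the scale."""
    return qvalue * qscale

# Illustrative per-LoD quantization scales, upper layer -> lower layer;
# these would be written to the header so the decoder can invert them.
lod_scales = [1.0, 2.0, 4.0]
```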
- the prediction residual is calculated by subtracting the prediction value from the original value.
- the prediction residual a2r of the point a2 is calculated, as shown in (Expression A7), by subtracting the predicted value a2p of the point a2 from the value A2 of the attribute information of the point a2.
- the 3D data encoding device may switch the threshold R_TH for each LoD and add the threshold R_TH for each LoD to the header or the like. That is, the 3D data encoding apparatus may switch the binarization method for each LoD. For example, since the distances between the three-dimensional points are large in the upper layers, the prediction accuracy may be poor and, as a result, the prediction residual may be large. Therefore, the three-dimensional data encoding apparatus prevents a rapid increase in the bit length of the binarized data by setting a small threshold R_TH for the upper layers. Conversely, since the distances between the three-dimensional points are small in the lower layers, the prediction accuracy is high and, as a result, the prediction residual tends to be small. Therefore, the 3D data encoding apparatus improves the encoding efficiency by setting a large threshold R_TH for the lower layers.
- FIG. 73 is a diagram for explaining processing when the remaining code is an exponential Golomb code, for example.
- the remaining code, which is the portion binarized using the exponential Golomb code, includes a prefix part and a suffix part.
- the three-dimensional data encoding apparatus switches the encoding table between the prefix part and the suffix part. That is, the three-dimensional data encoding apparatus arithmetically encodes each bit included in the prefix part using the encoding table for prefixes, and arithmetically encodes each bit included in the suffix part using the encoding table for suffixes.
- the three-dimensional data encoding apparatus may update the occurrence probability of 0 and 1 in each encoding table according to the value of the actually generated binarized data.
- the three-dimensional data encoding device may fix the occurrence probability of 0 and 1 in either encoding table. Thereby, the number of updates of the occurrence probability can be suppressed, and the processing amount can be reduced.
- the three-dimensional data encoding apparatus may update the occurrence probability with respect to the prefix portion and fix the occurrence probability with respect to the suffix portion.
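The binarization scheme described above (a fixed n-bit code below the threshold R_TH, and an exponential Golomb code for the excess, whose prefix and suffix parts get separate coding tables) can be sketched as follows. This is an illustrative encoder-side sketch using a 0th-order exponential Golomb code, with hypothetical function names; the arithmetic coding of the resulting bits is omitted.

```python
def exp_golomb(value):
    """0th-order exponential Golomb code: write (value + 1) in binary,
    preceded by one fewer leading zeros than its bit length. The leading
    zeros plus the first 1 form the prefix; the rest is the suffix."""
    bits = bin(value + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def binarize_residual(residual, r_th, nbits):
    """Binarize a non-negative quantized prediction residual.
    Below R_TH: a fixed n-bit code. At or above R_TH: the n-bit code holds
    R_TH itself, followed by the exponential Golomb code of the excess."""
    if residual < r_th:
        return format(residual, "0{}b".format(nbits))
    fixed = format(r_th, "0{}b".format(nbits))      # first code: the threshold
    return fixed + exp_golomb(residual - r_th)      # second code: the excess
```

For example, with R_TH = 63 and a 6-bit code, a residual of 5 is just "000101", while a residual of 63 becomes "111111" followed by the Golomb code of 0.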
- 3D point number information indicates the number of 3D points belonging to layer i.
- the 3D data encoding apparatus may add 3D point total number information (AllNumOfPoint) indicating the total number of 3D points to another header.
- the 3D data encoding apparatus may not add NumOfPoint [NumLoD-1] indicating the number of 3D points belonging to the lowest layer to the header.
- the three-dimensional data decoding apparatus can calculate NumOfPoint [NumLoD-1] by (Expression A15). Thereby, the code amount of the header can be reduced.
- the hierarchy threshold (Thres_LoD[i]) is a threshold used for setting the hierarchy i.
- the three-dimensional data encoding device and the three-dimensional data decoding device configure LoDi so that the distance between each point in LoDi is greater than the threshold Thres_LoD[i]. Further, the three-dimensional data encoding apparatus may not add the value of Thres_LoD[NumLoD-1] (the lowermost layer) to the header. In this case, the 3D data decoding apparatus estimates the value of Thres_LoD[NumLoD-1] as 0. Thereby, the code amount of the header can be reduced.
- Surrounding point information indicates an upper limit value of surrounding points used for generating predicted values of three-dimensional points belonging to the hierarchy i.
- the three-dimensional data encoding apparatus may calculate a predicted value using M surrounding points.
- the 3D data encoding device may add a single piece of surrounding point information (NumNeighborPoint) used for all LoDs to the header.
- the prediction threshold (THd[i]) indicates an upper limit value of the distance between a surrounding three-dimensional point and the target three-dimensional point used for prediction of the target three-dimensional point to be encoded or decoded in the hierarchy i.
- the three-dimensional data encoding device and the three-dimensional data decoding device do not use a three-dimensional point whose distance from the target three-dimensional point is more than THd[i] for prediction.
- the 3D data encoding device may add a single prediction threshold (THd) used for all LoDs to the header when it is not necessary to use a separate value of THd[i] for each LoD.
- R_TH[i] may be the maximum value that can be expressed in n bits.
- for example, R_TH is 63 for 6 bits and 255 for 8 bits.
- the three-dimensional data encoding device may encode the number of bits instead of encoding the maximum value that can be expressed by n bits as the binarization threshold.
- the three-dimensional data encoding device may define a minimum value (minimum bit number) of the number of bits representing R_TH[i] and add the number of bits relative to that minimum value to the header.
- the n-bit code is the encoded data of the prediction residual of the attribute information value, or a part thereof.
- the bit length of the n-bit code depends on the value of R_TH[i]. For example, when the value indicated by R_TH[i] is 63, the n-bit code is 6 bits, and when the value indicated by R_TH[i] is 255, the n-bit code is 8 bits.
- the remaining code is the portion of the encoded data of the prediction residual of the attribute information value that is encoded with the exponential Golomb code. The remaining code is encoded or decoded when the n-bit code has the same value as R_TH[i].
- the three-dimensional data decoding apparatus decodes the prediction residual by adding the value of the n-bit code and the value of the remaining code. When the n-bit code does not have the same value as R_TH[i], the remaining code may not be encoded or decoded.
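A decoder-side counterpart sketch of the n-bit / remaining-code rule just described (function names are hypothetical): read the n-bit code, and only when it equals R_TH is a remaining exponential Golomb code present; the residual is the sum of the two values.

```python
def decode_exp_golomb(bits, pos):
    """Decode one 0th-order exponential Golomb code from a bit string,
    starting at pos; returns (value, new_pos)."""
    zeros = 0
    while bits[pos] == "0":                       # count the prefix zeros
        zeros += 1
        pos += 1
    value = int(bits[pos:pos + zeros + 1], 2) - 1  # prefix '1' plus suffix bits
    return value, pos + zeros + 1

def decode_residual(bits, pos, r_th, nbits):
    """Returns (residual, new_pos) from a binarized string."""
    n_code = int(bits[pos:pos + nbits], 2)
    pos += nbits
    if n_code != r_th:
        return n_code, pos                        # no remaining code present
    rem, pos = decode_exp_golomb(bits, pos)
    return r_th + rem, pos                        # add n-bit value and remaining value
```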
- FIG. 76 is a flowchart of 3D data encoding processing by the 3D data encoding device.
- the three-dimensional data encoding device encodes the position information (geometry) (S3001). For example, the three-dimensional data encoding device performs the encoding using an octree representation.
- FIG. 77 is a flowchart of the attribute information encoding process (S3003).
- the 3D data encoding apparatus sets LoD (S3011). That is, the three-dimensional data encoding apparatus assigns each three-dimensional point to one of a plurality of LoDs.
- the 3D data encoding apparatus starts a loop of 3D point units (S3013). That is, the 3D data encoding apparatus repeatedly performs the processing of steps S3014 to S3020 for each 3D point.
- the three-dimensional data encoding device searches for a plurality of surrounding points, which are three-dimensional points existing around the target three-dimensional point, used to calculate the predicted value of the target three-dimensional point to be processed (S3014).
- the three-dimensional data encoding apparatus calculates a weighted average of the attribute information values of a plurality of surrounding points, and sets the obtained value as the predicted value P (S3015).
- the three-dimensional data encoding device calculates a prediction residual that is a difference between the attribute information of the target three-dimensional point and the predicted value (S3016).
- the 3D data encoding apparatus calculates a quantized value by quantizing the prediction residual (S3017).
- the three-dimensional data encoding apparatus arithmetically encodes the quantized value (S3018).
- the three-dimensional data encoding device calculates an inverse quantization value by inverse quantization of the quantization value (S3019). Next, the three-dimensional data encoding device generates a decoded value by adding the predicted value to the inverse quantized value (S3020). Next, the three-dimensional data encoding device ends the loop of the three-dimensional point unit (S3021). Also, the three-dimensional data encoding device ends the loop of LoD units (S3022).
- the three-dimensional data encoding device calculates the prediction residual of the attribute information, and further binarizes and arithmetically encodes the prediction residual, so that the code amount of the encoded data of the attribute information can be reduced.
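The per-point loop of the flowchart (S3014 to S3020) condenses to a few lines; this sketch assumes the quantization behavior described earlier, uses hypothetical names, and omits the arithmetic coding step (S3018).

```python
def encode_point_attribute(attr, predicted, qscale):
    """One pass of the per-point loop: residual, quantize, reconstruct.
    Returns (quantized value for the entropy coder, decoded value that
    later points will use when computing their predictions)."""
    residual = attr - predicted                  # S3016: prediction residual
    qvalue = round(residual / qscale)            # S3017: quantization (then S3018: coding)
    decoded = predicted + qvalue * qscale        # S3019 + S3020: inverse quantize and add
    return qvalue, decoded
```

Reconstructing the decoded value on the encoder side, rather than reusing the original attribute, keeps the encoder's predictions identical to the decoder's.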
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Image Processing (AREA)
- Compression Of Band Width Or Redundancy In Fax (AREA)
Abstract
Description
First, the data structure of encoded three-dimensional data (hereinafter also referred to as encoded data) according to the present embodiment will be described. FIG. 1 is a diagram showing the configuration of the encoded three-dimensional data according to the present embodiment.
When the encoded data of a point cloud is used in an actual device or service, it is desirable to transmit and receive necessary information according to the application in order to suppress the network bandwidth. However, such a function has not existed in encoding structures for three-dimensional data, and there has been no encoding method for that purpose.
In the present embodiment, a method of transmitting and receiving three-dimensional data between vehicles will be described. For example, three-dimensional data is transmitted and received between the own vehicle and surrounding vehicles.
In the present embodiment, the operation in an abnormal case in self-position estimation based on a three-dimensional map will be described.
In the present embodiment, a method of transmitting three-dimensional data to a following vehicle and the like will be described.
In Embodiment 5, an example was described in which a client device such as a vehicle transmits three-dimensional data to another vehicle or to a server such as a traffic monitoring cloud. In the present embodiment, the client device transmits sensor information obtained by a sensor to a server or another client device.
In the present embodiment, a three-dimensional data encoding method and a three-dimensional data decoding method using inter prediction processing will be described.
In the present embodiment, a method of controlling reference at the time of encoding an occupancy code will be described. In the following, the operation of the three-dimensional data encoding device is mainly described, but similar processing may be performed in the three-dimensional data decoding device.
The information of a three-dimensional point cloud includes position information (geometry) and attribute information (attribute). The position information includes coordinates (x coordinate, y coordinate, z coordinate) relative to a certain point. When encoding the position information, instead of directly encoding the coordinates of each three-dimensional point, a method is used that reduces the amount of information by representing the position of each three-dimensional point in an octree representation and encoding the octree information.
101, 201, 401, 501 acquisition unit
102, 402 encoding region determination unit
103 division unit
104, 644 encoding unit
111 three-dimensional data
112, 211, 413, 414, 511, 634 encoded three-dimensional data
200, 500 three-dimensional data decoding device
202 decoding start GOS determination unit
203 decoding SPC determination unit
204, 625 decoding unit
212, 512, 513 decoded three-dimensional data
403 SWLD extraction unit
404 WLD encoding unit
405 SWLD encoding unit
411 input three-dimensional data
412 extracted three-dimensional data
502 header analysis unit
503 WLD decoding unit
504 SWLD decoding unit
620, 620A three-dimensional data creation device
621, 641 three-dimensional data creation unit
622 request range determination unit
623 search unit
624, 642 reception unit
626 synthesis unit
631, 651 sensor information
632 first three-dimensional data
633 request range information
635 second three-dimensional data
636 third three-dimensional data
640 three-dimensional data transmission device
643 extraction unit
645 transmission unit
652 fifth three-dimensional data
654 sixth three-dimensional data
700 three-dimensional information processing device
701 three-dimensional map acquisition unit
702 own vehicle detection data acquisition unit
703 abnormal case determination unit
704 coping operation determination unit
705 operation control unit
711 three-dimensional map
712 own vehicle detection three-dimensional data
810 three-dimensional data creation device
811 data reception unit
812, 819 communication unit
813 reception control unit
814, 821 format conversion unit
815 sensor
816 three-dimensional data creation unit
817 three-dimensional data synthesis unit
818 three-dimensional data storage unit
820 transmission control unit
822 data transmission unit
831, 832, 834, 835, 836, 837 three-dimensional data
833 sensor information
901 server
902, 902A, 902B, 902C client device
1011, 1111 data reception unit
1012, 1020, 1112, 1120 communication unit
1013, 1113 reception control unit
1014, 1019, 1114, 1119 format conversion unit
1015 sensor
1016, 1116 three-dimensional data creation unit
1017 three-dimensional image processing unit
1018, 1118 three-dimensional data storage unit
1021, 1121 transmission control unit
1022, 1122 data transmission unit
1031, 1032, 1135 three-dimensional map
1033, 1037, 1132 sensor information
1034, 1035, 1134 three-dimensional data
1117 three-dimensional data synthesis unit
1201 three-dimensional map compression/decoding processing unit
1202 sensor information compression/decoding processing unit
1211 three-dimensional map decoding processing unit
1212 sensor information compression processing unit
1300 three-dimensional data encoding device
1301 division unit
1302 subtraction unit
1303 transform unit
1304 quantization unit
1305, 1402 inverse quantization unit
1306, 1403 inverse transform unit
1307, 1404 addition unit
1308, 1405 reference volume memory
1309, 1406 intra prediction unit
1310, 1407 reference space memory
1311, 1408 inter prediction unit
1312, 1409 prediction control unit
1313 entropy encoding unit
1400 three-dimensional data decoding device
1401 entropy decoding unit
2100 three-dimensional data encoding device
2101, 2111 octree generation unit
2102, 2112 geometric information calculation unit
2103, 2113 encoding table selection unit
2104 entropy encoding unit
2110 three-dimensional data decoding device
2114 entropy decoding unit
3000 three-dimensional data encoding device
3001 position information encoding unit
3002 attribute information reassignment unit
3003 attribute information encoding unit
3010 three-dimensional data decoding device
3011 position information decoding unit
3012 attribute information decoding unit
Claims (16)
- A three-dimensional data encoding method for encoding three-dimensional points having attribute information, the method comprising: calculating a predicted value of the attribute information of a three-dimensional point; calculating a prediction residual that is a difference between the attribute information of the three-dimensional point and the predicted value; generating binary data by binarizing the prediction residual; and arithmetically encoding the binary data.
- The three-dimensional data encoding method according to claim 1, wherein in the arithmetic encoding, a different coding table is used for each bit of the binary data.
- The three-dimensional data encoding method according to claim 2, wherein in the arithmetic encoding, a larger number of coding tables is used for lower-order bits of the binary data.
- The three-dimensional data encoding method according to any one of claims 1 to 3, wherein in the arithmetic encoding, a coding table to be used for the arithmetic encoding of a target bit included in the binary data is selected according to the value of a higher-order bit of the target bit.
- The three-dimensional data encoding method according to any one of claims 1 to 4, wherein in the binarization, when the prediction residual is smaller than a threshold, the binary data is generated by binarizing the prediction residual with a fixed number of bits, and when the prediction residual is greater than or equal to the threshold, the binary data is generated so as to include a first code of the fixed number of bits indicating the threshold and a second code obtained by binarizing, with an exponential Golomb code, a value obtained by subtracting the threshold from the prediction residual; and in the arithmetic encoding, different arithmetic encoding methods are used for the first code and the second code.
- The three-dimensional data encoding method according to claim 5, further comprising quantizing the prediction residual, wherein in the binarization, the quantized prediction residual is binarized, and the threshold is changed according to a quantization scale used in the quantization.
- The three-dimensional data encoding method according to claim 5, wherein the second code includes a prefix part and a suffix part, and in the arithmetic encoding, different coding tables are used for the prefix part and the suffix part.
- A three-dimensional data decoding method for decoding three-dimensional points having attribute information, the method comprising: calculating a predicted value of the attribute information of a three-dimensional point; generating binary data by arithmetically decoding encoded data included in a bitstream; generating a prediction residual by debinarizing the binary data; and calculating a decoded value of the attribute information of the three-dimensional point by adding the predicted value and the prediction residual.
- The three-dimensional data decoding method according to claim 8, wherein in the arithmetic decoding, a different coding table is used for each bit of the binary data.
- The three-dimensional data decoding method according to claim 9, wherein in the arithmetic decoding, a larger number of coding tables is used for lower-order bits of the binary data.
- The three-dimensional data decoding method according to any one of claims 8 to 10, wherein in the arithmetic decoding, a coding table to be used for the arithmetic decoding of a target bit included in the binary data is selected according to the value of a higher-order bit of the target bit.
- The three-dimensional data decoding method according to any one of claims 8 to 11, wherein in the debinarization, a first value is generated by debinarizing a first code of a fixed number of bits included in the binary data; when the first value is smaller than a threshold, the first value is determined to be the prediction residual; when the first value is greater than or equal to the threshold, a second value is generated by debinarizing a second code that is an exponential Golomb code included in the binary data, and the prediction residual is generated by adding the first value and the second value; and in the arithmetic decoding, different arithmetic decoding methods are used for the first code and the second code.
- The three-dimensional data decoding method according to claim 12, further comprising inversely quantizing the prediction residual, wherein in the addition, the predicted value and the inversely quantized prediction residual are added, and the threshold is changed according to a quantization scale used in the inverse quantization.
- The three-dimensional data decoding method according to claim 12, wherein the second code includes a prefix part and a suffix part, and in the arithmetic decoding, different coding tables are used for the prefix part and the suffix part.
- A three-dimensional data encoding device for encoding three-dimensional points having attribute information, the device comprising a processor and a memory, wherein the processor, using the memory, calculates a predicted value of the attribute information of a three-dimensional point, calculates a prediction residual that is a difference between the attribute information of the three-dimensional point and the predicted value, generates binary data by binarizing the prediction residual, and arithmetically encodes the binary data.
- A three-dimensional data decoding device for decoding three-dimensional points having attribute information, the device comprising a processor and a memory, wherein the processor, using the memory, calculates a predicted value of the attribute information of a three-dimensional point, generates binary data by arithmetically decoding encoded data included in a bitstream, generates a prediction residual by debinarizing the binary data, and calculates a decoded value of the attribute information of the three-dimensional point by adding the predicted value and the prediction residual.
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2020523071A JP7167144B2 (ja) | 2018-06-06 | 2019-05-30 | 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置 |
BR112020023939-9A BR112020023939A2 (pt) | 2018-06-06 | 2019-05-30 | método de codificação de dados tridimensionais, método de decodificação de dados tridimensionais, dispositivo de codificação de dados tridimensionais e dispositivo de decodificação de dados tridimensionais |
EP19814101.2A EP3806042A4 (en) | 2018-06-06 | 2019-05-30 | METHOD FOR CODING THREE-DIMENSIONAL DATA, METHOD FOR DECODING THREE-DIMENSIONAL DATA, DEVICE FOR CODING THREE-DIMENSIONAL DATA AND DEVICE FOR DECODING THREE-DIMENSIONAL DATA |
CN201980037127.1A CN112219227A (zh) | 2018-06-06 | 2019-05-30 | 三维数据编码方法、三维数据解码方法、三维数据编码装置、以及三维数据解码装置 |
CA3101091A CA3101091A1 (en) | 2018-06-06 | 2019-05-30 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
MX2020012935A MX2020012935A (es) | 2018-06-06 | 2019-05-30 | Metodo de codificacion de datos tridimensionales, metodo de decodificacion de datos tridimensionales, dispositivo codificador de datos tridimensionales y dispositivo decodificador de datos tridimensionales. |
KR1020207034445A KR20210018254A (ko) | 2018-06-06 | 2019-05-30 | 삼차원 데이터 부호화 방법, 삼차원 데이터 복호 방법, 삼차원 데이터 부호화 장치, 및 삼차원 데이터 복호 장치 |
US17/108,496 US20210082153A1 (en) | 2018-06-06 | 2020-12-01 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862681406P | 2018-06-06 | 2018-06-06 | |
US62/681,406 | 2018-06-06 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/108,496 Continuation US20210082153A1 (en) | 2018-06-06 | 2020-12-01 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019235366A1 true WO2019235366A1 (ja) | 2019-12-12 |
Family
ID=68770323
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2019/021636 WO2019235366A1 (ja) | 2019-05-30 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
Country Status (9)
Country | Link |
---|---|
US (1) | US20210082153A1 (ja) |
EP (1) | EP3806042A4 (ja) |
JP (1) | JP7167144B2 (ja) |
KR (1) | KR20210018254A (ja) |
CN (1) | CN112219227A (ja) |
BR (1) | BR112020023939A2 (ja) |
CA (1) | CA3101091A1 (ja) |
MX (1) | MX2020012935A (ja) |
WO (1) | WO2019235366A1 (ja) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPWO2020218593A1 (ja) * | 2019-04-25 | 2020-10-29 | ||
WO2022252237A1 (zh) * | 2021-06-04 | 2022-12-08 | Huawei Technologies Co., Ltd. | Method and apparatus for encoding and decoding a 3D map |
CN113676738B (zh) * | 2021-08-19 | 2024-03-29 | Shanghai Jiao Tong University | Geometry encoding and decoding method and device for three-dimensional point clouds |
CN115720270A (zh) * | 2021-08-24 | 2023-02-28 | Peng Cheng Laboratory | Point cloud encoding method, decoding method, point cloud encoding device, and decoding device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013062644A (ja) * | 2011-09-13 | 2013-04-04 | Kddi Corp | Image encoding device and image decoding device |
WO2014020663A1 (ja) | 2012-07-30 | 2014-02-06 | Mitsubishi Electric Corporation | Map display device |
WO2017104115A1 (ja) * | 2015-12-14 | 2017-06-22 | Panasonic Intellectual Property Corporation of America | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
WO2017217191A1 (ja) * | 2016-06-14 | 2017-12-21 | Panasonic Intellectual Property Corporation of America | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
US20180053324A1 (en) * | 2016-08-19 | 2018-02-22 | Mitsubishi Electric Research Laboratories, Inc. | Method for Predictive Coding of Point Cloud Geometries |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7336711B2 (en) * | 2001-11-16 | 2008-02-26 | Ntt Docomo, Inc. | Image encoding method, image decoding method, image encoder, image decoder, program, computer data signal, and image transmission system |
KR100873636B1 (ko) * | 2005-11-14 | 2008-12-12 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding/decoding an image using a single coding mode |
WO2009027606A1 (fr) * | 2007-08-24 | 2009-03-05 | France Telecom | Encoding/decoding by symbol planes, with dynamic computation of probability tables |
CN102014283A (zh) * | 2010-11-30 | 2011-04-13 | Shanghai University | Encoding method for lossless compression of image data represented by first-order difference prefixes |
EP3349360B1 (en) * | 2011-01-14 | 2019-09-04 | GE Video Compression, LLC | Entropy encoding and decoding scheme |
KR102062283B1 (ko) * | 2011-06-24 | 2020-01-03 | Sun Patent Trust | Image decoding method, image encoding method, image decoding device, image encoding device, and image encoding/decoding device |
US9264706B2 (en) * | 2012-04-11 | 2016-02-16 | Qualcomm Incorporated | Bypass bins for reference index coding in video coding |
US9503760B2 (en) * | 2013-08-15 | 2016-11-22 | Mediatek Inc. | Method and system for symbol binarization and de-binarization |
US20170214943A1 (en) * | 2016-01-22 | 2017-07-27 | Mitsubishi Electric Research Laboratories, Inc. | Point Cloud Compression using Prediction and Shape-Adaptive Transforms |
KR20180007680A (ko) * | 2016-07-13 | 2018-01-23 | Electronics and Telecommunications Research Institute | Method and apparatus for image encoding/decoding |
2019
- 2019-05-30 EP EP19814101.2A patent/EP3806042A4/en active Pending
- 2019-05-30 BR BR112020023939-9A patent/BR112020023939A2/pt unknown
- 2019-05-30 MX MX2020012935A patent/MX2020012935A/es unknown
- 2019-05-30 CA CA3101091A patent/CA3101091A1/en active Pending
- 2019-05-30 JP JP2020523071A patent/JP7167144B2/ja active Active
- 2019-05-30 CN CN201980037127.1A patent/CN112219227A/zh active Pending
- 2019-05-30 KR KR1020207034445A patent/KR20210018254A/ko active Search and Examination
- 2019-05-30 WO PCT/JP2019/021636 patent/WO2019235366A1/ja unknown
2020
- 2020-12-01 US US17/108,496 patent/US20210082153A1/en active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP3806042A4 |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021132595A1 (ja) * | 2019-12-26 | 2021-07-01 | Panasonic Intellectual Property Corporation of America | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
JP2023501640A (ja) * | 2020-07-09 | 2023-01-18 | Tencent America LLC | Point cloud processing method, computer system, program, and computer-readable storage medium |
JP7368623B2 (ja) | 2020-07-09 | 2023-10-24 | Tencent America LLC | Point cloud processing method, computer system, program, and computer-readable storage medium |
US11893691B2 (en) | 2020-07-09 | 2024-02-06 | Tencent America LLC | Point cloud geometry upsampling |
WO2022133752A1 (zh) * | 2020-12-22 | 2022-06-30 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Point cloud encoding method, decoding method, encoder, and decoder |
Also Published As
Publication number | Publication date |
---|---|
CN112219227A (zh) | 2021-01-12 |
BR112020023939A2 (pt) | 2021-02-09 |
US20210082153A1 (en) | 2021-03-18 |
EP3806042A4 (en) | 2021-06-23 |
EP3806042A1 (en) | 2021-04-14 |
JP7167144B2 (ja) | 2022-11-08 |
MX2020012935A (es) | 2021-02-15 |
CA3101091A1 (en) | 2019-12-12 |
JPWO2019235366A1 (ja) | 2021-06-24 |
KR20210018254A (ko) | 2021-02-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7167147B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7330962B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
WO2019240284A1 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7167144B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7389028B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7410879B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7448519B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7478668B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
WO2020075862A1 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
WO2020138352A1 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP2024052911A (ja) | Encoding method | |
WO2020196677A1 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7197575B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
WO2020196680A1 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
JP7453212B2 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device | |
WO2020184444A1 (ja) | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19814101 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 3101091 Country of ref document: CA |
|
ENP | Entry into the national phase |
Ref document number: 2020523071 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112020023939 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 2019814101 Country of ref document: EP Effective date: 20210111 |
|
ENP | Entry into the national phase |
Ref document number: 112020023939 Country of ref document: BR Kind code of ref document: A2 Effective date: 20201124 |