WO2020057338A1 - Point cloud encoding method and encoder - Google Patents

Point cloud encoding method and encoder

Info

Publication number
WO2020057338A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
patch
frame
global
occupancy map
Prior art date
Application number
PCT/CN2019/103124
Other languages
English (en)
French (fr)
Inventor
Dejun ZHANG (张德军)
Tian WANG (王田)
Vladyslav ZAKHARCHENKO (扎克哈成科弗莱德斯拉夫)
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201811121017.7A (granted as CN110944187B)
Application filed by Huawei Technologies Co., Ltd.
Priority to EP19861619.5A (published as EP3849188A4)
Publication of WO2020057338A1
Priority to US17/205,100 (granted as US11875538B2)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: … using adaptive coding
    • H04N19/169: … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: … the unit being an image region, e.g. an object
    • H04N19/172: … the region being a picture, frame or field

Definitions

  • the present application relates to the field of encoding and decoding technologies, and in particular, to point cloud encoding and decoding methods and encoders.
  • A point cloud may be captured by a 3D sensor, such as a 3D scanner.
  • the embodiments of the present application provide a point cloud encoding method and an encoder, which are helpful to improve encoding or compression efficiency.
  • a point cloud encoding method, including: obtaining the global matching patches of the N frames of point clouds in a point cloud group, where the point cloud group includes the N frames of point clouds, N ≥ 2, and N is an integer; determining M union patch occupancy maps corresponding to M sets, where each of the M sets includes N global matching patches, the N global matching patches are patches having a matching relationship across the N frames of point clouds, and the union patch occupancy map corresponding to the m-th set of the M sets is the union of the occupancy maps of the global matching patches in the m-th set, with 1 ≤ m ≤ M, m and M being integers; packing the M union patch occupancy maps to obtain a global occupancy map, where the global occupancy map is used to determine the positions of the M union patch occupancy maps in the global occupancy map; and packing each frame of the N frames of point clouds to obtain the occupancy maps of the N frames of point clouds, where the position (i.e., the first position) of the occupancy map of the m-th global matching patch of the n-th frame point cloud in the occupancy map of the n-th frame point cloud corresponds to the position (i.e., the second position) of the m-th union patch occupancy map in the global occupancy map.
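As a concrete illustration of the union patch occupancy map defined above, the following Python sketch ORs the binary occupancy maps of the global matching patches in one set after padding them to a common size. The data layout (nested 0/1 lists) and the function names `pad_to` and `union_occupancy_map` are illustrative assumptions, not part of the patent.

```python
# Sketch: union of the occupancy maps of the N global matching patches
# in one set, as an element-wise OR over zero-padded binary grids.

def pad_to(grid, rows, cols):
    """Zero-pad a binary grid (list of 0/1 lists) to rows x cols."""
    out = [[0] * cols for _ in range(rows)]
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            out[r][c] = v
    return out

def union_occupancy_map(patch_maps):
    """Element-wise OR of the patch occupancy maps in one set."""
    rows = max(len(g) for g in patch_maps)
    cols = max(len(g[0]) for g in patch_maps)
    padded = [pad_to(g, rows, cols) for g in patch_maps]
    return [[1 if any(g[r][c] for g in padded) else 0 for c in range(cols)]
            for r in range(rows)]

# Occupancy maps of two matching patches from two frames of point clouds:
a = [[1, 1, 0],
     [0, 1, 0]]
b = [[0, 1, 1],
     [0, 0, 1]]
print(union_occupancy_map([a, b]))  # [[1, 1, 1], [0, 1, 1]]
```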
  • a global matching patch in a frame of point cloud refers to a patch in that frame for which a patch with a matching relationship can be found in every other frame of point cloud in the group.
  • a patch having a matching relationship with a given patch is a matching patch of a target patch, where the target patch is either the given patch itself or a matching patch of the given patch.
  • two patches having a matching relationship means that the two patches have similar spatial positions and/or shapes in three-dimensional space.
  • the method for determining whether two patches have a matching relationship is not limited in this application.
  • for example, the patches may be projected onto a two-dimensional plane according to the same projection plane, and the intersection-over-union (IOU) ratio between the occupancy map of the target patch and the occupancy map of each candidate patch is computed.
  • the candidate patch with the largest IOU among all IOUs, provided that this largest IOU exceeds a certain threshold, is the matching patch of the target patch.
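The IOU criterion above can be sketched as follows, assuming the patch occupancy maps are projected onto the same plane as equal-size binary grids. The threshold value and all names are illustrative assumptions.

```python
# Sketch: pick the candidate patch whose occupancy map has the largest
# IOU with the target's, provided it exceeds a threshold.

def iou(map_a, map_b):
    """Intersection-over-union of two same-size binary occupancy maps."""
    inter = sum(x & y for ra, rb in zip(map_a, map_b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(map_a, map_b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0

def find_matching_patch(target_map, candidate_maps, threshold=0.5):
    """Index of the best-matching candidate, or None if the largest
    IOU does not exceed the threshold."""
    best_idx, best_iou = None, 0.0
    for i, cand in enumerate(candidate_maps):
        v = iou(target_map, cand)
        if v > best_iou:
            best_idx, best_iou = i, v
    return best_idx if best_iou > threshold else None

target = [[1, 1], [1, 0]]
cands = [[[0, 0], [0, 1]],   # IOU = 0/4
         [[1, 1], [0, 0]],   # IOU = 2/3
         [[1, 0], [0, 0]]]   # IOU = 1/3
print(find_matching_patch(target, cands))  # 1
```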
  • other methods for finding patches with a matching relationship may also be adopted, which is not limited herein.
  • the number of global matching patches in each frame of point cloud is equal; this number is M, where M is a positive integer.
  • the N global matching patches included in a set are patches that have a matching relationship across the N frames of point clouds. It can be understood that each of the N global matching patches comes from (or belongs to) one frame of point cloud, different global matching patches come from different frames of point clouds, and the N global matching patches have a matching relationship with one another.
  • the N global matching patches are from N frame point clouds, respectively.
  • the first position corresponds to the second position, which can be understood as follows: the coordinate value of the first position is the same as the coordinate value of the second position; or the coordinate value of the first position in its coordinate system is substantially the same as the coordinate value of the second position in its coordinate system; or the two coordinate values are not the same, but the position range where the second position is located covers the position range where the first position is located.
  • the global matching patches in a point cloud group are identified, and during packing the global matching patches across the N frames of point clouds of the group are assigned the same (or similar) positions.
  • the technical solution takes into consideration the temporal and spatial correlations between different point clouds, so that patches with matching relationships in different point clouds are spatially consistent.
  • the occupancy map of the point cloud can be used to guide the generation of the depth map and texture map of the point cloud
  • the depth map and texture map of the point cloud are encoded with video/image coding technology.
  • video/image coding typically transmits the difference data between frames in the bitstream. Therefore, keeping patches with matching relationships spatially consistent across different point clouds helps improve coding or compression efficiency and saves bitstream transmission overhead.
  • packing each frame of the N frames of point clouds to obtain their occupancy maps includes: determining, based on the position of the m-th union patch occupancy map in the global occupancy map, the position of the occupancy map of the m-th global matching patch of the n-th frame point cloud in the occupancy map of the n-th frame point cloud; and packing based on the positions of the occupancy maps of some or all of the global matching patches of the n-th frame point cloud in the occupancy map of the n-th frame point cloud.
  • the some or all of the global matching patches include the m-th global matching patch.
  • packing the occupancy maps of the global matching patches and the non-global matching patches to obtain the occupancy map of the n-th frame point cloud includes: mapping the occupancy maps of some or all of the global matching patches into an initial occupancy map of the n-th frame point cloud, based on their positions in the occupancy map of the n-th frame point cloud; and then, based on the initial occupancy map to which some or all of the patches have been mapped, packing the occupancy maps of the remaining patches to obtain the occupancy map of the n-th frame point cloud. That is, the global matching patches are mapped first, and the non-global matching patches are packed afterwards.
  • the occupancy maps of the non-global matching patches of the n-th frame point cloud occupy a preset position range.
  • the preset position range refers to the position range that belongs to the union patch occupancy map corresponding to the set to which a global matching patch of the n-th frame point cloud belongs, but does not belong to the occupancy map of that global matching patch.
  • in this way, the arrangement of the patch occupancy maps within the occupancy map of the point cloud can be made tighter (denser), so that the occupancy map of the point cloud is smaller.
  • alternatively, the occupancy maps of the non-global matching patches of the n-th frame point cloud do not occupy the preset position range.
  • here, the preset position range likewise refers to the position range that belongs to the union patch occupancy map corresponding to the set to which a global matching patch of the n-th frame point cloud belongs, but does not belong to the occupancy map of that global matching patch. In this way, implementation is simpler.
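The two-stage packing described above (first map the global matching patches at the positions dictated by the global occupancy map, then pack the remaining patches into free space) might be sketched as below. The raster-scan placement, the data layout (patches as `(height, width)` sizes, positions as `(row, col)`), and all names are simplifying assumptions, not the patent's actual packing algorithm.

```python
# Sketch: stage 1 maps global matching patches at fixed positions;
# stage 2 packs non-global matching patches into the first free spot
# found by a raster scan of the canvas.

def fits(canvas, r, c, h, w):
    rows, cols = len(canvas), len(canvas[0])
    if r + h > rows or c + w > cols:
        return False
    return all(canvas[r + i][c + j] == 0 for i in range(h) for j in range(w))

def place(canvas, r, c, h, w):
    for i in range(h):
        for j in range(w):
            canvas[r + i][c + j] = 1

def pack_frame(canvas_size, global_patches, other_patches):
    """global_patches: list of ((row, col), (h, w)) fixed placements.
    other_patches: list of (h, w) sizes packed into remaining space."""
    rows, cols = canvas_size
    canvas = [[0] * cols for _ in range(rows)]
    for (r, c), (h, w) in global_patches:      # stage 1: fixed positions
        place(canvas, r, c, h, w)
    positions = []
    for h, w in other_patches:                 # stage 2: raster-scan fill
        for r in range(rows):
            for c in range(cols):
                if fits(canvas, r, c, h, w):
                    place(canvas, r, c, h, w)
                    positions.append((r, c))
                    break
            else:
                continue
            break
    return canvas, positions

canvas, pos = pack_frame((4, 4), [((0, 0), (2, 2))], [(2, 2), (1, 1)])
print(pos)  # [(0, 2), (2, 0)]
```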
  • packing each frame of the N frames of point clouds includes: when the difference between the maximum size of the pre-occupancy maps of a first part of the point clouds in the N frames and the size of the global occupancy map is within a preset range, packing each frame of the N frames of point clouds according to the global occupancy map.
  • the pre-occupancy maps of the first part of the point clouds are obtained by pre-packing the occupancy maps of the patches of the first part of the point clouds according to a pre-packing algorithm; the pre-packing algorithm is an algorithm that does not use the global occupancy map for packing.
  • packing each frame of the N frames of point clouds includes: pre-packing the occupancy maps of the patches of a first part of the point clouds in the N frames according to a pre-packing algorithm, where the pre-packing algorithm is an algorithm that does not use the global occupancy map for packing; and, when the difference between the maximum size of the pre-occupancy maps of the first part of the point clouds and the size of the global occupancy map is within a preset range, packing each frame of the N frames of point clouds according to the global occupancy map. This helps obtain a larger coding gain.
  • packing each frame of the N frames of point clouds includes: when the difference between the maximum size of the pre-occupancy maps of a first part of the point clouds in the N frames and the maximum size of the pre-occupancy maps of a second part of the point clouds in the N frames is within a preset range, packing each frame of the N frames of point clouds according to the global occupancy map.
  • the pre-occupancy maps of the first part of the point clouds are obtained by pre-packing the occupancy maps of the patches of the first part of the point clouds according to the pre-packing algorithm, where the pre-packing algorithm is an algorithm that does not use the global occupancy map for packing; the pre-occupancy maps of the second part of the point clouds are obtained by pre-packing the occupancy maps of the patches of the second part of the point clouds according to the global occupancy map. This helps obtain a larger coding gain.
  • packing each frame of the N frames of point clouds includes: pre-packing the occupancy maps of the patches of a first part of the point clouds according to a pre-packing algorithm, where the pre-packing algorithm is an algorithm that does not use the global occupancy map for packing; pre-packing the occupancy maps of the patches of a second part of the point clouds according to the global occupancy map to obtain the pre-occupancy maps of the second part of the point clouds; and, when the difference between the maximum size of the pre-occupancy maps of the first part of the point clouds and the maximum size of the pre-occupancy maps of the second part of the point clouds is within a preset range, packing each frame of the N frames of point clouds according to the global occupancy map. This helps obtain a larger coding gain.
  • the preset range may be determined according to the coding gain, and may specifically be an empirical value.
  • the embodiments of the present application are not limited to this.
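The size-comparison switch described in the preceding paragraphs could look like the following sketch, where the preset range is an empirical threshold determined from coding gain. The function names and the use of map heights as the "size" are assumptions for illustration.

```python
# Sketch: pre-pack a first part of the point clouds WITHOUT the global
# occupancy map, then use global-map-guided packing only when the
# resulting sizes are close enough to the global occupancy map's size.

def choose_packing_method(pre_occupancy_heights, global_map_height,
                          preset_range=16):
    """pre_occupancy_heights: heights of the pre-occupancy maps of the
    first part of the point clouds, pre-packed without the global map.
    Returns which packing method to use for the whole group."""
    max_pre = max(pre_occupancy_heights)
    if abs(max_pre - global_map_height) <= preset_range:
        return "global"       # pack every frame using the global occupancy map
    return "pre_packing"      # keep the per-frame pre-packing result

print(choose_packing_method([120, 128, 125], 130))  # global
print(choose_packing_method([120, 128, 125], 200))  # pre_packing
```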
  • the point cloud group is a group of frames (GOF).
  • the number of frames of a point cloud included in a GOF is configurable.
  • for example, a GOF may include 32 frames of point clouds.
  • the point cloud group is a sub-GOF
  • a sub-GOF includes multiple temporally continuous frames of point clouds in one GOF.
  • One GOF can be divided into at least two sub-GOFs, and the number of frames of point clouds included in different sub-GOFs may be equal or unequal.
  • obtaining the global matching patches of each frame of point cloud in the point cloud group includes: obtaining a reference point cloud, where the reference point cloud is any frame of point cloud in the GOF; and, for the i-th patch of the reference point cloud, if a patch matching the target patch exists in every non-reference point cloud of the GOF, determining that the i-th patch and the patches matching the target patch are global matching patches, where the target patch is the i-th patch or a matching patch of the i-th patch, and the i-th patch is any patch of the reference point cloud. It can be understood that the order of the patches in the reference point cloud determines the order of the global matching patches in the point cloud group.
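The reference-frame procedure above might be sketched as follows: for each patch of the reference point cloud, chain a pairwise matching test through every non-reference frame, and keep only the chains that survive all frames. The pairwise test is abstracted as a callable (e.g. the IOU criterion); the function names and the toy label-based test are illustrative assumptions.

```python
# Sketch: find global matching patches by chaining a pairwise matching
# test from the reference point cloud through every non-reference frame.

def find_global_matches(reference_patches, other_frames, match):
    """reference_patches: patches of the reference point cloud.
    other_frames: list of patch lists, one per non-reference frame.
    match(target, candidates) -> index of matching patch or None.
    Returns, per global matching patch, its index in every frame."""
    sets = []
    for i, patch in enumerate(reference_patches):
        chain, target = [i], patch
        for frame in other_frames:
            j = match(target, frame)
            if j is None:       # no match in some frame: not global
                chain = None
                break
            chain.append(j)
            target = frame[j]   # the matched patch becomes the next target
        if chain is not None:
            sets.append(chain)
    return sets

# Toy pairwise test: patches are labels; equal labels match.
match = lambda t, frame: frame.index(t) if t in frame else None
ref = ["arm", "leg", "head"]
frames = [["leg", "arm"], ["arm", "torso", "leg"]]
print(find_global_matches(ref, frames, match))  # [[0, 1, 0], [1, 0, 2]]
```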
  • the reference point cloud is the first frame point cloud in the GOF.
  • in this way, the global matching patches in a GOF are guaranteed to be in the same order as in the first frame, which can bring a certain gain to subsequent video encoding.
  • the reference point cloud is a point cloud in the first frame of the sub-GOF or a point cloud in the previous sub-GOF of the sub-GOF.
  • the reference point cloud is the last frame point cloud in the previous sub-GOF of the sub-GOF.
  • the last frame point cloud is the point cloud obtained after the packing operation has been performed. This helps ensure the temporal continuity of the global matching patches of two neighboring sub-GOFs, thereby improving coding or compression efficiency.
  • an encoder, including a packing module and an encoding module.
  • the packing module is used to perform the following steps: obtain the global matching patches of the N frames of point clouds in a point cloud group, where the point cloud group includes the N frames of point clouds, N ≥ 2, and N is an integer; determine M union patch occupancy maps corresponding to M sets, where each of the M sets includes N global matching patches, the N global matching patches are patches having a matching relationship across the N frames of point clouds, and the union patch occupancy map corresponding to the m-th set of the M sets is the union of the occupancy maps of the global matching patches in the m-th set, with 1 ≤ m ≤ M, m and M being integers; pack the M union patch occupancy maps to obtain a global occupancy map, which is used to determine the positions of the M union patch occupancy maps in the global occupancy map; and pack each frame of the N frames of point clouds to obtain the occupancy maps of the N frames of point clouds.
  • a point cloud encoding method, including: pre-packing the occupancy maps of the patches of a first part of the point clouds in the N frames of point clouds according to a pre-packing algorithm, to obtain the pre-occupancy maps of the first part of the point clouds.
  • the pre-packing algorithm is an algorithm that does not use a global occupancy map.
  • when the difference between the maximum size of the pre-occupancy maps of the first part of the point clouds and the size of the global occupancy map is within a preset range, the packing method is determined to be packing each frame of the N frames of point clouds based on the global occupancy map; otherwise, the packing method is determined to be the pre-packing algorithm. The N frames of point clouds are then encoded according to their occupancy maps.
  • for the global occupancy map, refer to the foregoing description.
  • a point cloud encoding method including: pre-packaging a patch occupancy map of a first part of the point cloud in the N-frame point cloud according to a pre-packing algorithm to obtain a pre-occupation map of the first part of the point cloud;
  • the pre-packing algorithm is an algorithm that does not use a global occupancy map.
  • the occupancy maps of the patches of a second part of the point clouds in the N frames are pre-packed according to the global occupancy map, to obtain the pre-occupancy maps of the second part of the point clouds.
  • when the difference between the maximum size of the pre-occupancy maps of the first part of the point clouds and the maximum size of the pre-occupancy maps of the second part of the point clouds is within a preset range, the packing method is determined to be packing each frame of the N frames of point clouds based on the global occupancy map; otherwise, the packing method is determined to be the pre-packing algorithm.
  • the N-frame point cloud is encoded according to the occupancy map of the N-frame point cloud.
  • an encoder, including a packing module and an encoding module.
  • the packing module is used to: pre-pack the occupancy maps of the patches of a first part of the point clouds in the N frames according to a pre-packing algorithm, to obtain the pre-occupancy maps of the first part of the point clouds, where the pre-packing algorithm is an algorithm that does not use the global occupancy map for packing; and, when the difference between the maximum size of the pre-occupancy maps of the first part of the point clouds and the size of the global occupancy map is within a preset range, determine that the packing method is to pack each frame of the N frames of point clouds based on the global occupancy map; otherwise, determine that the packing method is the pre-packing algorithm.
  • the encoding module is configured to encode the N-frame point cloud according to the occupancy map of the N-frame point cloud.
  • an encoder, including a packing module and an encoding module.
  • the packing module is used to: pre-pack the occupancy maps of the patches of a first part of the point clouds in the N frames according to a pre-packing algorithm, to obtain the pre-occupancy maps of the first part of the point clouds, where the pre-packing algorithm is an algorithm that does not use the global occupancy map for packing; pre-pack the occupancy maps of the patches of a second part of the point clouds in the N frames according to the global occupancy map, to obtain the pre-occupancy maps of the second part of the point clouds; and, when the difference between the maximum size of the pre-occupancy maps of the first part of the point clouds and the maximum size of the pre-occupancy maps of the second part of the point clouds is within a preset range, determine that the packing method is to pack each frame of the N frames of point clouds according to the global occupancy map; otherwise, determine that the packing method is the pre-packing algorithm.
  • the encoding module is configured to encode the N-frame point cloud according to the occupancy map of the N-frame point cloud.
  • a point cloud encoding method, including: obtaining the global matching patches of the N frames of point clouds in a point cloud group; determining M union patch occupancy maps corresponding to M sets, where each of the M sets includes N global matching patches, the N global matching patches are patches having a matching relationship across the N frames of point clouds, and the union patch occupancy map corresponding to the m-th set of the M sets is the union of the occupancy maps of the global matching patches in the m-th set, with 1 ≤ m ≤ M, m and M being integers; packing the M union patch occupancy maps to obtain a global occupancy map, which is used to determine the positions of the M union patch occupancy maps in the global occupancy map; using the global occupancy map to pack the occupancy maps of the global matching patches and non-global matching patches of each frame of the N frames of point clouds, to obtain the occupancy maps of the N frames of point clouds; and encoding the N frames of point clouds according to their occupancy maps.
  • the global matching patches in a point cloud group are identified, the global occupancy map is obtained based on the global matching patches, and the occupancy maps of the global matching patches and non-global matching patches of each frame of point cloud in the group are packed accordingly. This creates the conditions for "exploiting the temporal and spatial correlation between different point clouds and making patches with matching relationships in different point clouds spatially consistent", thus helping to improve coding or compression efficiency and save bitstream transmission overhead.
  • an encoder, including a packing module and an encoding module.
  • the packing module is used to perform the following steps: obtain the global matching patches of the N frames of point clouds in a point cloud group; determine M union patch occupancy maps corresponding to M sets, where each of the M sets includes N global matching patches, the N global matching patches are patches having a matching relationship across the N frames of point clouds, and the union patch occupancy map corresponding to the m-th set is the union of the occupancy maps of the global matching patches in the m-th set, with 1 ≤ m ≤ M, m and M being integers; pack the M union patch occupancy maps to obtain a global occupancy map, which is used to determine the positions of the M union patch occupancy maps in the global occupancy map; and use the global occupancy map to pack the occupancy maps of the global matching patches and non-global matching patches of each frame of the N frames of point clouds, to obtain the occupancy maps of the N frames of point clouds.
  • the encoding module is configured to encode the N frames of point clouds according to their occupancy maps. The encoding module may be implemented by some or all of modules 103 to 112 as described in FIG. 2.
  • the position of the occupancy map of the m-th global matching patch of the n-th frame point cloud (among the N frames of point clouds) in the occupancy map of the n-th frame point cloud corresponds to the position of the m-th union patch occupancy map among the M union patch occupancy maps in the global occupancy map; 1 ≤ n ≤ N, n is an integer.
  • a device for encoding point cloud data may include: a memory for storing point cloud data; and an encoder configured to execute any one of the point cloud encoding methods of the first, third, fourth, or seventh aspect described above.
  • an encoding device, which includes a non-volatile memory and a processor coupled to each other; the processor calls program code stored in the memory to execute some or all of the steps of any method of the first, third, fourth, or seventh aspect.
  • an encoding device includes a memory and a processor.
  • the memory is configured to store program code; the processor is configured to call the program code to execute any one of the point cloud encoding methods of the first, third, fourth, or seventh aspect.
  • a computer-readable storage medium stores program code that, when run on a computer, causes the computer to execute some or all of the steps of any method of the first, third, fourth, or seventh aspect.
  • a computer program product that, when run on a computer, causes the computer to execute some or all of the steps of any method of the first, third, fourth, or seventh aspect.
  • FIG. 1 is a block diagram of a point cloud decoding system that can be used in an example of an embodiment of the present application
  • FIG. 2 is a schematic block diagram of an encoder that can be used in an example of an embodiment of the present application
  • FIG. 3 is a schematic diagram of a point cloud, a point cloud patch, and a point cloud occupancy map applicable to the embodiments of the present application;
  • FIG. 4 is a schematic block diagram of a decoder that can be used in an example of an embodiment of the present application
  • FIG. 5 is a schematic flowchart of a packaging method provided in MPEG point cloud coding technology
  • FIG. 6 is a schematic flowchart of a point cloud encoding method according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the union of two occupancy maps applicable to an embodiment of the present application;
  • FIG. 8 is a schematic diagram of a correspondence relationship between a first position and a second position provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a point cloud occupancy map obtained based on the methods shown in FIG. 5 and FIG. 6 according to an embodiment of the present application.
  • FIG. 10 and FIG. 11 are schematic process diagrams of a packaging method provided in FIG. 6 according to an embodiment of the present application.
  • FIG. 12 is a schematic diagram of an occupation map of each point cloud in two adjacent sub-GOFs obtained according to an embodiment of the present application.
  • FIG. 13 is a schematic flowchart of another point cloud encoding method according to an embodiment of the present application.
  • FIG. 14 is a schematic block diagram of an encoder according to an embodiment of the present application.
  • FIG. 15 is a schematic block diagram of an implementation manner of an encoding device used in an embodiment of the present application.
  • "At least one" in the embodiments of the present application includes one or more. "Multiple" means two or more.
  • at least one of A, B, and C includes: A alone; B alone; both A and B; A and C; B and C; and A, B, and C.
  • the term "and/or" in the embodiments of the present application describes only an association relationship between associated objects and indicates three possible relationships. For example, A and/or B can mean: A alone, both A and B, or B alone.
  • the character "/" in the embodiments of the present application generally indicates an "or" relationship between the objects before and after it. In formulas, the character "/" means division; for example, A/B means A divided by B.
  • "first" and "second" in the embodiments of the present application are used to distinguish different objects and do not limit the order of the objects.
  • FIG. 1 is a schematic block diagram of a point cloud decoding system 1 that can be used in an example of an embodiment of the present application.
  • the terms "point cloud coding” or “coding” may generally refer to point cloud coding or point cloud decoding.
  • the encoder 100 of the point cloud decoding system 1 may encode the point cloud to be encoded according to any one of the point cloud encoding methods proposed in this application.
  • the decoder 200 of the point cloud decoding system 1 may decode the point cloud to be decoded according to the point cloud decoding method corresponding to the point cloud encoding method used by the encoder.
  • the point cloud decoding system 1 includes a source device 10 and a destination device 20.
  • the source device 10 generates encoded point cloud data. Therefore, the source device 10 may be referred to as a point cloud encoding device.
  • the destination device 20 may decode the encoded point cloud data generated by the source device 10. Therefore, the destination device 20 may be referred to as a point cloud decoding device.
  • Various implementations of the source device 10, the destination device 20, or both may include one or more processors and a memory coupled to the one or more processors.
  • the memory may include, but is not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, or any other medium that can be used to store the desired program code in the form of instructions or data structures accessible by a computer, as described herein.
  • the source device 10 and the destination device 20 may include various devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, on-board computers, or the like.
  • the destination device 20 may receive the encoded point cloud data from the source device 10 via the link 30.
  • the link 30 may include one or more media or devices capable of moving the encoded point cloud data from the source device 10 to the destination device 20.
  • the link 30 may include one or more communication media that enable the source device 10 to send the encoded point cloud data directly to the destination device 20 in real time.
  • the source device 10 may modulate the encoded point cloud data according to a communication standard, such as a wireless communication protocol, and may send the modulated point cloud data to the destination device 20.
  • the one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the one or more communication media may form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
  • the one or more communication media may include a router, a switch, a base station, or other devices that facilitate communication from the source device 10 to the destination device 20.
  • the encoded data may be output from the output interface 140 to the storage device 40.
  • the encoded point cloud data can be accessed from the storage device 40 through the input interface 240.
  • the storage device 40 may include any of a variety of distributed or locally accessed data storage media, such as a hard disk drive, a Blu-ray disc, a digital versatile disc (DVD), a compact disc read-only memory (CD-ROM), flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing encoded point cloud data.
  • the storage device 40 may correspond to a file server or another intermediate storage device that may hold the encoded point cloud data generated by the source device 10.
  • the destination device 20 may access the stored point cloud data from the storage device 40 via streaming or download.
  • the file server may be any type of server capable of storing the encoded point cloud data and transmitting the encoded point cloud data to the destination device 20.
  • example file servers include a network server (e.g., for a website), a file transfer protocol (FTP) server, a network attached storage (NAS) device, or a local disk drive.
  • the destination device 20 can access the encoded point cloud data through any standard data connection, including an Internet connection.
  • the transmission of the encoded point cloud data from the storage device 40 may be a streaming transmission, a download transmission, or a combination of the two.
  • the point cloud decoding system 1 illustrated in FIG. 1 is merely an example, and the technology of the present application can be applied to point cloud decoding settings (for example, point cloud encoding or point cloud decoding) that do not necessarily include any data communication between the point cloud encoding device and the point cloud decoding device. In other examples, data is retrieved from local storage, streamed over a network, and so on.
  • the point cloud encoding device may encode the data and store the data to a memory, and / or the point cloud decoding device may retrieve the data from the memory and decode the data. In many instances, encoding and decoding are performed by devices that do not communicate with each other, but only encode data to and / or retrieve data from memory and decode data.
  • the source device 10 includes a data source 120, an encoder 100, and an output interface 140.
  • the output interface 140 may include a modulator / demodulator (modem) and / or a transmitter.
  • the data source 120 may include a point cloud capture device (e.g., a camera), a point cloud archive containing previously captured point cloud data, a point cloud feed interface to receive point cloud data from a point cloud content provider, and / or a computer graphics system for generating point cloud data, or a combination of these sources of point cloud data.
  • the encoder 100 may encode point cloud data from the data source 120.
  • the source device 10 sends the encoded point cloud data directly to the destination device 20 via the output interface 140.
  • the encoded point cloud data may also be stored on the storage device 40 for later access by the destination device 20 for decoding and / or playback.
  • the destination device 20 includes an input interface 240, a decoder 200, and a display device 220.
  • the input interface 240 includes a receiver and / or a modem.
  • the input interface 240 may receive the encoded point cloud data via the link 30 and / or from the storage device 40.
  • the display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. Generally, the display device 220 displays the decoded point cloud data.
  • the display device 220 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or other types of display devices.
  • the encoder 100 and the decoder 200 may each be integrated with an audio encoder and decoder, and may include an appropriate multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software to handle encoding of both audio and video in a common or separate data stream.
  • the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol, or other protocols such as the user datagram protocol (UDP).
  • the encoder 100 and the decoder 200 may each be implemented as any of a variety of circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the present application is implemented partially in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium, and may use one or more processors to execute the instructions in hardware, thereby implementing the technology of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered as one or more processors. Each of the encoder 100 and the decoder 200 may be included in one or more encoders or decoders, any of which may be integrated as part of a combined encoder / decoder (codec) in a corresponding device.
  • This application may generally refer to the encoder 100 as “signaling” or “sending” certain information to another device, such as the decoder 200.
  • the terms “signaling” or “sending” may generally refer to the transmission of syntax elements and / or other data used to decode the compressed point cloud data. This transfer can occur in real time or almost real time. Alternatively, this communication may occur after a period of time, for example, when a syntax element is stored in an encoded bit stream to a computer-readable storage medium at the time of encoding; the decoding device may then retrieve the syntax element at any time after the syntax element is stored to this medium.
  • FIG. 2 it is a schematic block diagram of an encoder 100 that can be used in an example of an embodiment of the present application.
  • FIG. 2 is based on an MPEG (Moving Picture Experts Group) point cloud compression (PCC) coding framework as an example for illustration.
  • the encoder 100 may include a patch information generation module 101, a packing module 102, a depth map generation module 103, a texture map generation module 104, a first filling module 105, an image or video-based encoding module 106, an occupancy map encoding module 107, an auxiliary information encoding module 108, and a multiplexing module 109.
  • In addition, the encoder 100 may further include a point cloud filtering module 110, a second filling module 111, a point cloud reconstruction module 112, and the like. Specifically:
  • the patch information generating module 101 is configured to divide a point cloud of a frame into multiple patches by using a certain method, and obtain related information of the generated patches.
  • patch refers to a set of partial points in a frame of point cloud.
  • one connected area corresponds to one patch.
  • the relevant information of the patch may include, but is not limited to, at least one of the following information: the number of patches into which the point cloud is divided, the position information of each patch in the three-dimensional space, the index of the normal axis of each patch, the depth map generated by projecting each patch from the three-dimensional space to the two-dimensional space, the depth map size of each patch (such as the width and height of each depth map), and the occupancy map generated by projecting each patch from the three-dimensional space to the two-dimensional space.
  • Relevant information such as the number of patches into which the point cloud is divided, the index of the normal axis of each patch, the depth map size of each patch, the position information of each patch in the point cloud, and the size information of the occupancy map of each patch can be sent as auxiliary information to the auxiliary information encoding module 108 for encoding (that is, compression encoding).
  • the occupancy map of each patch can be sent to the packaging module 102 for packaging.
  • Specifically, the patches of the point cloud are arranged in a specific order, for example, in descending (or ascending) order of the width / height of the occupancy map of each patch; then, following this order, the occupancy map of each patch is sequentially inserted into an available area of the occupancy map of the point cloud, to obtain the occupancy map of the point cloud.
  • the specific position information of each patch in the point cloud occupancy map and the depth map of each patch can be sent to the depth map generation module 103.
  • the occupancy map of the point cloud may be filled by the second filling module 111 and sent to the occupancy map encoding module 107 for encoding.
  • the occupancy map of the point cloud can be used to guide the depth map generation module 103 to generate the depth map of the point cloud, and to guide the texture map generation module 104 to generate the texture map of the point cloud.
  • FIG. 3 it is a schematic diagram of a point cloud, a point cloud patch, and a point cloud occupancy map applicable to the embodiment of the present application.
  • (a) in FIG. 3 is a schematic diagram of a point cloud; (b) in FIG. 3 is a schematic diagram of the patches obtained by dividing the point cloud in (a) of FIG. 3; and (c) in FIG. 3 is a schematic diagram of the occupancy map of the point cloud, obtained by packing on a two-dimensional plane the occupancy maps of the patches in (b) of FIG. 3.
  • a depth map generating module 103 is configured to generate a depth map of the point cloud according to the occupancy map of the point cloud, the occupancy map of each patch of the point cloud, and depth information, and to send the generated depth map to the first filling module 105 to fill the blank pixels in the depth map, to obtain the filled depth map.
  • the texture map generating module 104 is configured to generate a texture map of the point cloud according to the occupancy map of the point cloud, the occupancy map of each patch of the point cloud, and texture information, and to send the generated texture map to the first filling module 105 to fill the blank pixels in the texture map, to obtain the filled texture map.
  • the filled depth map and the filled texture map are sent by the first filling module 105 to the image or video-based encoding module 106 to perform image or video-based encoding.
  • the image or video-based encoding module 106, the occupancy map encoding module 107, and the auxiliary information encoding module 108 send the obtained encoding result (that is, the code stream) to the multiplexing module 109 to merge into a code stream.
  • the code stream may be sent to the output interface 140.
  • the encoding result (i.e., the code stream) obtained by the image or video-based encoding module 106 is sent to the point cloud reconstruction module 112 for point cloud reconstruction, to obtain a reconstructed point cloud (that is, reconstructed geometry information of the point cloud).
  • the geometric information of the point cloud refers to the coordinate values of points in the point cloud (for example, each point in the point cloud) in a three-dimensional space.
  • the point cloud reconstruction module 112 may also send the texture information of the point cloud and the reconstructed point cloud geometric information to the coloring module, and the coloring module is used to color the reconstructed point cloud to obtain the reconstructed point cloud. Texture information.
  • the texture map generating module 104 may further generate a texture map of the point cloud based on the information obtained by filtering the reconstructed point cloud geometric information through the point cloud filtering module 110.
  • the decoder 200 may include a demultiplexing module 201, an image or video-based decoding module 202, an occupancy map decoding module 203, an auxiliary information decoding module 204, a point cloud geometric information reconstruction module 205, a point cloud filtering module 206, and a point cloud texture information reconstruction module 207. Specifically:
  • the demultiplexing module 201 is configured to send an input code stream (that is, a combined code stream) to a corresponding decoding module. Specifically, the code stream containing the encoded texture map and the coded depth map are sent to the image or video-based decoding module 202; the code stream containing the encoded occupancy map is sent to the occupancy map decoding module 203 , Sending a code stream containing the encoded auxiliary information to the auxiliary information decoding module 204.
  • the image or video-based decoding module 202 is configured to decode the received encoded texture map and the encoded depth map; and then send the decoded texture map information to the point cloud texture information reconstruction module 207, The decoded depth map information is sent to the geometric information reconstruction module 205 of the point cloud.
  • the occupancy map decoding module 203 is configured to decode the received code stream containing the encoded occupancy map, and send the decoded occupancy map information to the geometric information reconstruction module 205 of the point cloud.
  • the auxiliary information decoding module 204 is configured to decode the received encoded auxiliary information, and send the decoded information indicating the auxiliary information to the geometric information reconstruction module 205 of the point cloud.
  • the point cloud geometric information reconstruction module 205 is configured to reconstruct the point cloud geometric information according to the received occupancy map information and auxiliary information. After the geometric information of the reconstructed point cloud is filtered by the point cloud filtering module 206, it is sent to the texture information reconstruction module 207 of the point cloud.
  • the point cloud texture information reconstruction module 207 is configured to reconstruct the point cloud texture information to obtain a reconstructed point cloud.
  • the decoder 200 shown in FIG. 4 is only an example. In specific implementation, the decoder 200 may include more or fewer modules than those shown in FIG. 4. This embodiment of the present application does not limit this.
  • the encoder first divides the point cloud to be encoded (that is, the current frame or the current frame point cloud) into several patches according to certain criteria, and these patches do not overlap with each other. Then, each patch is projected from a three-dimensional space to a two-dimensional plane to obtain a two-dimensional image (that is, the occupation map of the patch). Next, the occupancy maps of all patches (or the occupancy maps of the reduced resolution patches) are closely arranged on a two-dimensional image according to some rules to obtain the current frame occupancy map. This method of arranging the occupancy graphs of a patch is called packing. Subsequently, the current frame depth map and the current frame texture map are generated in the packing order.
  • the current frame depth map is a two-dimensional image generated by the projection depth of each patch in the packing order.
  • the current frame texture map is a two-dimensional image generated by the patched texture maps in the packing order.
  • the current frame occupancy map is a binary two-dimensional image, which is used to indicate whether each pixel position of the two-dimensional image is occupied by a point in the point cloud.
  • the resolution of the current frame occupancy map is lower than the resolution of the current frame depth map and the current frame texture map.
  • the coordinates of the patch (or the occupation map of the patch) in the current frame occupation map can be expressed as (x, y), x is the minimum coordinate value of each point of the patch occupation map on the X axis, and y is the patch The minimum coordinate value of each point of the occupancy map on the Y axis, of course, the embodiment of the present application is not limited thereto.
  • the coordinate system of the current frame occupancy map is an X-Y coordinate system, the X axis is a coordinate axis in a horizontal direction, and the Y axis is a coordinate axis in a vertical direction.
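As a minimal illustration of this coordinate convention (a sketch only; `patch_position` is a hypothetical helper, and the occupancy-map points are assumed to be given as (X, Y) integer pairs):

```python
import numpy as np

def patch_position(points_2d):
    """(x, y) of a patch in the current frame occupancy map: the minimum
    X coordinate and minimum Y coordinate over the points of the patch
    occupancy map, in an X-Y coordinate system with X horizontal and Y
    vertical."""
    pts = np.asarray(points_2d)
    return int(pts[:, 0].min()), int(pts[:, 1].min())
```

For a patch whose occupancy map covers the points (3, 5), (4, 2), and (6, 7), the recorded position would be (3, 2).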
  • the execution body of the method shown in FIG. 5 may be a packaging module in an encoder.
  • the method shown in FIG. 5 assumes that the encoder divides the current frame into patchCount patches. These patches are usually stored in the form of a patch array.
  • the method shown in FIG. 5 includes:
  • Step 1 The patch array of the current frame is sorted in descending order according to the width (sizeU) or height (sizeV) of the patch occupancy map, or according to the patch index (patchIndex), to obtain a sequence.
  • Step 2 Calculate the initial values of the width and height of the current frame occupancy map.
  • The initial value of the width of the current frame occupancy map is obtained according to minimumImageWidth and patch [0].sizeU0 (for example, by taking the larger of the two), where minimumImageWidth represents the minimum width of the current frame occupancy map, and patch [0].sizeU0 represents the width of the occupancy map of patch [0].
  • Step 3 From left to right and from top to bottom in the current frame occupancy map, look for a position that can hold the occupancy map of patch [i]. Where the current frame occupancy map is already occupied by the occupancy maps of previously placed patches, the occupancy map of a new patch cannot be placed again. It can be understood that the occupancy map of patch [i] in step 3 can be replaced with the occupancy map of patch [i] after the resolution is reduced.
  • If such a position is found, step 4 is performed; otherwise, step 5 is performed.
  • Step 4 Record the storage locations u0 and v0 of the occupation map of patch [i] in the current frame occupation map.
  • Step 5 Double the height occupancySizeV of the current frame occupancy map, and continue to perform Step 3 above based on the updated (i.e., doubled) current frame occupancy map.
  • Steps 3 to 5 above are repeated until the patch array of the current frame has been fully traversed, to obtain the final current frame occupancy map.
  • the process of performing packaging can be considered as an update process of the current frame occupancy map.
  • the current frame occupancy map described below refers to the final current frame occupancy map.
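The packing procedure of steps 1 to 5 can be sketched as follows. This is a simplified illustration, not the exact encoder implementation: each patch occupancy map is assumed to be a binary numpy array, the sort key is the occupancy-map width, and `pack_patches` is a hypothetical name.

```python
import numpy as np

def pack_patches(patches, min_image_width=64):
    """Greedy packing of patch occupancy maps into a frame occupancy map.

    `patches` is a list of 2-D binary numpy arrays (1 = occupied).
    Returns the frame occupancy map and the (u0, v0) position of each patch.
    """
    # Step 1: sort patch indices in descending order of occupancy-map width.
    order = sorted(range(len(patches)),
                   key=lambda i: patches[i].shape[1], reverse=True)

    # Step 2: initial width and height of the current frame occupancy map.
    width = max(min_image_width, patches[order[0]].shape[1])
    height = patches[order[0]].shape[0]
    frame = np.zeros((height, width), dtype=np.uint8)

    positions = [None] * len(patches)
    for i in order:
        p = patches[i]
        placed = False
        while not placed:
            # Step 3: scan top-to-bottom, left-to-right for a free slot.
            for v0 in range(frame.shape[0] - p.shape[0] + 1):
                for u0 in range(frame.shape[1] - p.shape[1] + 1):
                    region = frame[v0:v0 + p.shape[0], u0:u0 + p.shape[1]]
                    if not np.any(region & p):   # no overlap with placed patches
                        region |= p              # Step 4: place and record position
                        positions[i] = (u0, v0)
                        placed = True
                        break
                if placed:
                    break
            if not placed:
                # Step 5: double the height of the frame occupancy map and retry.
                frame = np.vstack([frame, np.zeros_like(frame)])
    return frame, positions
```

Here the while-loop realizes step 5: whenever no free slot exists, the height of the current frame occupancy map is doubled and the scan of step 3 is repeated.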
  • the packing method shown in FIG. 5 directly packs the occupancy maps of all patches after sorting in descending order, which is simple to implement, but may cause the occupancy maps of the same patch to have different positions on the front and rear frame occupancy maps. This will cause a large loss of coding performance based on the video / image coding module, and will require more bits when coding the patch side information.
  • the embodiments of the present application provide a point cloud encoding method and an encoder.
  • the point cloud in the embodiment of the present application is a dynamic point cloud, and consecutive frames of point clouds are correlated in time and space.
  • the correlation may be specifically embodied by the existence of a global matching patch in the point cloud group.
  • FIG. 6 it is a schematic flowchart of a point cloud encoding method according to an embodiment of the present application.
  • the execution body of the method shown in FIG. 6 may be the encoder 100 in FIG. 1.
  • the method shown in FIG. 6 includes:
  • S101 Obtain a global matching patch of an N-frame point cloud of a point cloud group.
  • the point cloud group includes the N frame point clouds; N ⁇ 2, and N is an integer.
  • the N-frame point cloud is a continuous N-frame point cloud in time.
  • the embodiment of the present application is not limited thereto.
  • a global matching patch in a frame of point cloud is a patch for which a patch having a matching relationship with it can be found in every other frame of point cloud of the point cloud group.
  • Such a patch is called a global matching patch.
  • the point cloud group includes 4 frames of point clouds (labeled as point clouds 1 to 4 respectively).
  • For any patch in point cloud 1, if a patch having a matching relationship with it can be found in each of point clouds 2 to 4, the patch is a global matching patch.
  • For example, patch11 in point cloud 1 has a matching relationship with patch21 in point cloud 2, patch31 in point cloud 3, and patch41 in point cloud 4.
  • In this case, patch11, patch21, patch31, and patch41 are all global matching patches;
  • that is, patchw1 is the global matching patch in point cloud w, 1 ≤ w ≤ 4, w is an integer.
  • a patch having a matching relationship with a patch is a matching patch of a target patch, where the target patch is the patch or a matching patch of the patch.
  • the matching patch, in another frame of point cloud, of a patch in one frame of point cloud may be the patch that has the largest intersection-over-union (IoU) with that patch among the patches of the other frame of point cloud, where the IoU is greater than or equal to a preset threshold.
  • the number of global matching patches in the point cloud of each frame is equal, and the number (that is, M in the following) may be a positive integer greater than or equal to 1. If the point cloud group includes N-frame point clouds, then for any global matching patch, the number of patches having a matching relationship with the global matching patch is N-1.
  • the N-1 patches are all matching patches of the patch.
  • At least one of the N-1 patches is a matching patch of the patch, and other patches are in a chain matching relationship.
  • For the chain matching relationship, refer to the following description.
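The IoU-based matching criterion described above can be sketched as follows, assuming candidate patch occupancy maps are binary arrays on a common grid; the function names and the 0.5 threshold are illustrative assumptions only.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two binary occupancy maps on a common grid."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def find_matching_patch(patch, candidates, threshold=0.5):
    """Return the index of the candidate patch with the largest IoU with
    `patch`, or None if even the best IoU is below the preset threshold."""
    best_idx, best_iou = None, 0.0
    for i, cand in enumerate(candidates):
        score = iou(patch, cand)
        if score > best_iou:
            best_idx, best_iou = i, score
    return best_idx if best_iou >= threshold else None
```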
  • S102 Determine M union patch occupancy maps corresponding to M sets.
  • Each of the M sets includes N global matching patches, and the N global matching patches are patches having a matching relationship across the above-mentioned N-frame point cloud. The union patch occupancy map corresponding to the m-th set in the M sets is the union of the occupancy maps of the global matching patches in the m-th set; 1 ≤ m ≤ M, where m and M are integers.
  • the N global matching patches included in a set are patches that have a matching relationship across the N-frame point cloud. It can be understood that the N global matching patches are from the N-frame point cloud, and the N global matching patches have a matching relationship.
  • the N global matching patches are respectively from N frame point clouds. It can be understood that each of the N global matching patches is from (or belongs to) a frame point cloud, and different global matching patches are from different frames. Point cloud. For example, the nth global matching patch in the N global matching patches is from the nth frame point cloud in the N frame point cloud.
  • The statement that the N global matching patches included in a set are patches having a matching relationship across the N-frame point cloud not only describes the characteristics of the N global matching patches included in a set, but also describes the concept of a global matching patch: a global matching patch in a frame of point cloud is a patch for which a patch having a matching relationship with it can be found in every other frame of point cloud of the point cloud group.
  • the N global matching patches have a matching relationship, and may include: any two of the N global matching patches are matched, in other words, the N global matching patches are matched with each other.
  • the N global matching patches have a matching relationship, which may include: the N global matching patches are matched in a chain; for example, the 1st global matching patch of the N global matching patches matches the 2nd global matching patch, the 2nd global matching patch matches the 3rd global matching patch, the 3rd global matching patch matches the 4th global matching patch, and so on.
  • Alternatively, the N global matching patches having a matching relationship may include: some of the N global matching patches are pairwise matched, and the other global matching patches are matched in a chain.
  • the point cloud of each frame includes M global matching patches.
  • the mth global matching patch in the M global matching patches belongs to the mth set among the above M sets.
  • the method of obtaining the union of two-dimensional graphics in the prior art can be used to obtain the union of at least two global matching patch occupancy maps, thereby obtaining the patch occupancy map corresponding to each set.
  • FIG. 7 it is a schematic diagram of a union of occupation maps of two patches applicable to an embodiment of the present application.
  • the rectangle ABCD represents the occupation map of global matching patch1 in point cloud 1
  • the rectangle AEFG represents the occupation map of global matching patch2 in point cloud 2.
  • the union of these two globally matched patch occupancy graphs is the union patch occupancy graph, as shown by the rectangular AEFG in FIG. 7.
  • the area (or range) occupied by the occupancy map of a global matching patch is less than or equal to the area occupied by the union patch occupancy map corresponding to the global matching patch (that is, the union patch occupancy map corresponding to the set to which the global matching patch belongs); in other words, the union patch occupancy map corresponding to the set to which the global matching patch belongs covers the occupancy map of the global matching patch.
  • the area occupied by the occupation map of patch 1 in FIG. 7 is smaller than the area occupied by the union patch occupation map, and the area occupied by the global matching patch 2 of FIG. 7 is equal to the area occupied by the union patch occupation map.
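The union patch occupancy map illustrated in FIG. 7 can be computed, for example, as an element-wise OR of the occupancy maps of the global matching patches in one set, under the assumption that all maps in the set are anchored at a common origin (point A in FIG. 7); `union_patch_occupancy` is a hypothetical name.

```python
import numpy as np

def union_patch_occupancy(occupancy_maps):
    """Union of the occupancy maps of the global matching patches in one set.

    Each map is a binary numpy array anchored at its own top-left origin;
    maps are padded to a common bounding size before the element-wise OR."""
    h = max(m.shape[0] for m in occupancy_maps)
    w = max(m.shape[1] for m in occupancy_maps)
    union = np.zeros((h, w), dtype=np.uint8)
    for m in occupancy_maps:
        union[:m.shape[0], :m.shape[1]] |= m
    return union
```

With a 2×2 map and a 3×4 map that covers it, the union equals the larger map, mirroring how the rectangle AEFG covers the rectangle ABCD in FIG. 7.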
  • S103 Pack the M union patch occupation maps above to obtain a global occupation map.
  • the global occupancy map is used to determine the positions of the M union patch occupancy maps in the global occupancy map (or the positions, in the global occupancy map, of the M union patch occupancy maps after resolution reduction).
  • the packaging method shown in FIG. 5 may be used to package the M union set patch occupancy maps.
  • the positions of the above-mentioned M union set occupation maps in the global occupation map can be characterized by the values of u0 and v0 shown in Table 1.
  • S104 Package each point cloud in the N-frame point cloud to obtain an occupation map of the N-frame point cloud.
  • the position (i.e., the first position) of the occupancy map of the m-th global matching patch of the n-th frame point cloud, in the occupancy map of the n-th frame point cloud, corresponds to the position (i.e., the second position) of the m-th union patch occupancy map of the M union patch occupancy maps in the global occupancy map.
  • the first position corresponds to the second position, which can be understood as follows: the coordinate value of the first position is the same as the coordinate value of the second position; or the coordinate value of the first position in its coordinate system is substantially the same as the coordinate value of the second position in its coordinate system; or the coordinate value of the first position is different from the coordinate value of the second position, but the position range where the second position is located covers the position range where the first position is located.
  • the coordinate value of the first position may be characterized by the values of the position (u0, v0) of the occupancy map of the m-th global matching patch of the n-th frame point cloud in the occupancy map of the n-th frame point cloud.
  • the coordinate value of the second position can be characterized by the values of the positions u0, v0 of the mth union set occupation map in the global occupation map.
  • the position range where the first position is located is the area occupied by the mth global matching patch occupation map, and the position range where the second position is located is the area occupied by the mth union patch occupation map.
  • FIG. 8 it is a schematic diagram of a correspondence relationship between a first position and a second position according to an embodiment of the present application.
  • FIG. 8 is drawn based on FIG. 7.
  • (b1) and (b2) in FIG. 8 each indicate the position (i.e., the first position) of the occupancy map of the m-th global matching patch of the n-th frame point cloud (i.e., the occupancy map shown by the rectangle ABCD) in the occupancy map of the n-th frame point cloud. The coordinate value of the first position in (b1) in FIG. 8 is the same as the coordinate value of the second position, while the coordinate value of the first position in (b2) in FIG. 8 is different from the coordinate value of the second position, but the position range where the second position is located covers the position range where the first position is located.
  • Optionally, the occupancy map of a non-global matching patch in the n-th frame point cloud may occupy the preset position range, where the preset position range refers to the area of the union patch occupancy map corresponding to the set to which a global matching patch in the n-th frame point cloud belongs that does not belong to the occupancy map of that global matching patch. In this way, the occupancy maps of the patches can be arranged more tightly (or densely) in the occupancy map of the point cloud, so that the occupancy map of the point cloud is smaller.
  • The "preset position range" in this optional implementation is for a point cloud; that is, the preset position range includes the preset position range of each (or every) global matching patch in the point cloud. For example, with reference to FIG. 7, when the occupancy maps of the patches in point cloud 1 are packed, for patch 1 (the patch indicated by the rectangle ABCD), the preset position range is the region shown by the black shaded part in FIG. 7. That is, the occupancy map of a non-global matching patch can be mapped (or placed) into the region shown by the black shaded part in FIG. 7.
  • Optionally, the occupancy map of a non-global matching patch in the n-th frame point cloud does not occupy the preset position range; that is, the occupancy map of a non-global matching patch cannot be mapped (or placed) into the region shown by the black shaded part in FIG. 7.
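The constrained packing of S104 can be sketched as follows: the m-th global matching patch of the frame is placed at the position recorded for the m-th union patch occupancy map in the global occupancy map (the first position corresponds to the second position), and the remaining patches are placed by a greedy first-fit scan. The function name and signature are assumptions; for simplicity, a non-global patch that does not fit is skipped instead of growing the map.

```python
import numpy as np

def pack_frame_with_global_map(global_positions, global_patches,
                               other_patches, frame_shape):
    """Pack one frame's patches under the global-occupancy-map constraint."""
    frame = np.zeros(frame_shape, dtype=np.uint8)
    positions = []
    # Global matching patches reuse the positions from the global occupancy map,
    # so matching patches land at the same place in every frame's occupancy map.
    for (u0, v0), p in zip(global_positions, global_patches):
        frame[v0:v0 + p.shape[0], u0:u0 + p.shape[1]] |= p
        positions.append((u0, v0))
    # Non-global patches: greedy first-fit scan over the remaining free area.
    for p in other_patches:
        placed = False
        for v0 in range(frame.shape[0] - p.shape[0] + 1):
            for u0 in range(frame.shape[1] - p.shape[1] + 1):
                region = frame[v0:v0 + p.shape[0], u0:u0 + p.shape[1]]
                if not np.any(region & p):
                    region |= p
                    positions.append((u0, v0))
                    placed = True
                    break
            if placed:
                break
        # For simplicity, a patch that does not fit is skipped here instead of
        # doubling the occupancy-map height as in the method of FIG. 5.
    return frame, positions
```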
  • S105 Encode the N-frame point cloud according to the occupancy map of the N-frame point cloud.
  • For example, the generation of the depth map and texture map of each frame of point cloud is guided according to the occupancy map of that point cloud among the occupancy maps of the N frames of point clouds, and the depth map and texture map of the point cloud are video / image encoded. For details, refer to the description of the principle of the encoder 100 above.
  • In the technical solution provided by this embodiment, the global matching patches (for example, all global matching patches) in a point cloud group are determined, and during the packing process, the global matching patches having a matching relationship in each frame of point cloud of the point cloud group are allocated the same (or similar) positions, so that the global matching patches having a matching relationship are spatially consistent in the generated occupancy map of each frame of point cloud. That is, the technical solution takes into consideration the temporal and spatial correlations between different point clouds, so that patches with matching relationships in different point clouds are spatially consistent.
  • Since the occupancy map of the point cloud can be used to guide the generation of the depth map and texture map of the point cloud, the depth map and texture map of the point cloud are encoded with video / image encoding technology, and what a video / image code stream usually carries is the difference data between frames, making patches with matching relationships in different point clouds spatially consistent helps to improve encoding or compression efficiency and save code stream transmission overhead.
  • FIG. 9 a comparison diagram of an occupancy map of a point cloud obtained based on the methods shown in FIG. 5 and FIG. 6 according to an embodiment of the present application is provided.
  • The upper two figures of FIG. 9 are the occupancy maps of two frames of point clouds obtained based on the method shown in FIG. 5; the middle figure is the global occupancy map obtained based on the method shown in FIG. 6; and the lower two figures are the occupancy maps of the same two frames of point clouds obtained under the constraint of the global occupancy map. It can be seen from FIG. 9 that the spatial positions of the four large patch occupancy maps in the two frame occupancy maps obtained based on the method shown in FIG. 5 are relatively similar, but the positions of the other patch occupancy maps are relatively messy.
  • In contrast, with the method shown in FIG. 6, the global occupancy map is used to constrain the positions of the global matching patches in the occupancy map of each frame of point cloud. This not only makes the four large patches correspond to similar (or the same) spatial positions, but also makes the other global matching patches with relatively small areas correspond to similar (or the same) spatial positions, so that the advantages of inter prediction in video encoding can be fully exploited to improve encoding or compression efficiency.
  • FIG. 10 and FIG. 11 are schematic diagrams of a process based on the packing method provided in FIG. 6.
  • Assume that the point cloud group includes 4 frames of point cloud, each frame of point cloud includes 10 patches, and the global matching patches in each frame of point cloud of the point cloud group obtained in S101 are as follows:
  • patch11, patch21, patch31, and patch41 have a matching relationship, and these patches form set 1.
  • patch12, patch22, patch32, and patch42 have a matching relationship, and these patches form set 2.
  • patch13, patch23, patch33, and patch43 have a matching relationship, and these patches form set 3. That is, based on this example, the M sets in S102 are specifically sets 1 to 3, and each set contains 4 patches with a matching relationship.
  • the global occupancy map obtained by executing S103 may be as shown in FIG. 10.
  • the larger rectangle in FIG. 10 represents the global occupancy map
  • the ellipse, the triangle, and the smaller rectangle in FIG. 10 respectively represent the union patch occupancy maps corresponding to sets 1 to 3.
  • the occupation map of point cloud 1 may be as shown in (a) in FIG. 11, and the occupation map of point cloud 2 may be as shown in (b) in FIG. 11.
  • FIG. 11 only shows the occupancy map of point cloud 1 and a partial occupancy map of point cloud 2, and does not show the occupancy maps of point cloud 3 and point cloud 4. Comparing FIG. 10 and FIG. 11:
  • the position of the occupancy map of patch11 in the occupancy map of point cloud 1 and the position of the occupancy map of patch21 in the occupancy map of point cloud 2 both correspond to the position of the union patch occupancy map of set 1 in the global occupancy map;
  • the position of the occupancy map of patch12 in the occupancy map of point cloud 1 and the position of the occupancy map of patch22 in the occupancy map of point cloud 2 both correspond to the position of the union patch occupancy map of set 2 in the global occupancy map;
  • the position of the occupancy map of patch13 in the occupancy map of point cloud 1 and the position of the occupancy map of patch23 in the occupancy map of point cloud 2 both correspond to the position of the union patch occupancy map of set 3 in the global occupancy map.
  • The point cloud group may be a group of frames (GOF), for example one or more GOFs; herein it generally refers to one GOF.
  • The number of frames of point cloud in a GOF is configurable; the embodiments of the present application do not limit the number of frames of point cloud included in a GOF.
  • For example, a GOF may include 32 frames of point cloud.
  • For the method of determining a GOF, refer to the prior art.
  • the point cloud group may be a sub-GOF, and the sub-GOF may be composed of time-continuous multi-frame point clouds in one GOF.
  • a GOF may include at least two sub-GOFs, and the number of frames of the point cloud included in any two of the at least two sub-GOFs may be equal or unequal.
  • S101 may include the following steps S101A to S101B:
  • S101A Obtain a reference point cloud; the reference point cloud is an arbitrary point cloud in the GOF.
  • S101B For the i-th patch in the reference point cloud, if each non-reference point cloud in the GOF contains a patch matching the target patch, it is determined that the i-th patch and the patches matching the i-th patch are global matching patches.
  • Here, the target patch is the i-th patch, or the target patch is a matching patch of the i-th patch; the i-th patch is any patch in the reference point cloud.
  • Specifically, the non-reference point clouds are searched in sequence for a patch matching the target patch.
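The sequential search in S101A–S101B can be sketched as follows. This is a minimal illustration, not the patent's reference implementation; the `match` predicate and the tuple representation of patches are assumptions standing in for the actual matching criterion and patch data.

```python
def find_global_matching_patches(reference, non_reference_frames, match):
    """For each patch of the reference point cloud, walk through the
    non-reference point clouds in sequence; the patch and its matches are
    global matching patches only if a match is found in every frame.
    `match(a, b)` is a placeholder for the matching criterion."""
    global_sets = []
    for patch in reference:
        chain, target = [patch], patch
        for frame in non_reference_frames:
            found = next((q for q in frame if match(target, q)), None)
            if found is None:
                chain = None  # no match in this frame: not a global match
                break
            chain.append(found)
            target = found  # the patch just found becomes the next target
        if chain is not None:
            global_sets.append(chain)
    return global_sets

# Toy group of 3 frames; patches are (frame_id, shape_label) tuples.
ref = [(0, "a"), (0, "b"), (0, "c")]
others = [[(1, "a"), (1, "c")], [(2, "c"), (2, "a")]]
same_shape = lambda p, q: p[1] == q[1]
sets_ = find_global_matching_patches(ref, others, same_shape)
print(sets_)  # two sets: the "a" patches and the "c" patches
```

Note that only patches matched in every non-reference frame survive; patch `(0, "b")` is dropped because the other frames contain no match for it.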
  • The order of the patches in the reference point cloud determines the order of the global matching patches in the point cloud group; therefore, the order of the patches in the reference frame is particularly important.
  • The reference frame here may use the patch data obtained after a packing operation.
  • The packing operation may change the storage order of the patches in the patch array; for example, when the global matching patches are obtained, the storage order of the patches is updated.
  • The packing operation herein may be a packing method (such as the method shown in S101 to S104) provided in the embodiments of the present application. This helps ensure the temporal continuity of the global matching patches of two adjacent point cloud groups, thereby improving encoding or compression efficiency.
  • Optionally, the reference frame may also use the order of the patches without a packing operation.
  • This optional implementation manner may be applied to the embodiment in which the point cloud group is a GOF, and may also be applied to the scenario where the point cloud group is a sub-GOF. That is, when the point cloud group is a sub-GOF, a point cloud in any frame in the GOF to which the sub-GOF belongs can be used as a reference point cloud.
  • the reference point cloud may be the first frame point cloud in the GOF.
  • the global matching patches in a GOF can be guaranteed to be in the same order as the global matching patches in the first frame, which can bring certain gain to subsequent video encoding.
  • the reference point cloud may be a point cloud in the first frame of the sub-GOF, or an arbitrary point cloud in the previous sub-GOF of the sub-GOF.
  • The sub-GOF and its previous sub-GOF belong to the same GOF.
  • Further, the reference point cloud may be the last frame of point cloud in the previous sub-GOF of the sub-GOF.
  • In this way, the temporal continuity of adjacent sub-GOFs is taken into account, which helps improve encoding or compression efficiency.
  • the point cloud of the last frame is a point cloud obtained after performing a packaging operation.
  • The packing operation herein may be a packing method (such as the method shown in S101 to S104) provided in the embodiments of the present application. This helps ensure the temporal continuity of the global matching patches of two neighboring sub-GOFs, thereby improving encoding or compression efficiency.
  • FIG. 12 shows the occupancy maps of the point clouds in two adjacent sub-GOFs (labeled sub-GOF1 and sub-GOF2) obtained based on this further optional implementation.
  • the numbers 1051 to 1058 in FIG. 12 indicate the index of the point cloud.
  • the last frame may also be the point cloud before the packaging operation is performed.
  • S104 may include the following steps S104A to S104B:
  • S104A Determine, based on the position of the m-th union patch occupancy map in the global occupancy map, the position of the occupancy map of the m-th global matching patch in the n-th frame of point cloud in the occupancy map of the n-th frame of point cloud. These two positions correspond to each other; for related descriptions of the two positions, refer to the foregoing.
  • S104B Pack, based on the positions of the occupancy maps of some or all (usually all) global matching patches in the n-th frame of point cloud, the occupancy maps of the global matching patches and the non-global matching patches to obtain the occupancy map of the n-th frame of point cloud. The some or all global matching patches include the m-th global matching patch.
  • S104B may include the following steps S104B-1 to S104B-2:
  • S104B-1 Based on the positions of the occupancy maps of the some or all global matching patches in the occupancy map of the n-th frame of point cloud, map the occupancy maps of the some or all global matching patches to the initial occupancy map of the n-th frame of point cloud.
  • The width of the initial occupancy map of the n-th frame of point cloud is the initial value of the width of the occupancy map of the n-th frame of point cloud, and its height is the initial value of the height of the occupancy map of the n-th frame of point cloud.
  • For how the initial values of the width/height of the occupancy map of a point cloud are obtained, refer to the foregoing and to the prior art.
  • S104B-2 Based on the initial occupancy map of the n-th frame of point cloud to which the occupancy maps of the some or all global matching patches have been mapped, pack the occupancy maps of the other patches in the n-th frame of point cloud except the some or all global matching patches, to obtain the occupancy map of the n-th frame of point cloud.
  • For the process of packing the occupancy maps of the other patches, refer to the prior art, for example the packing method shown in FIG. 5.
  • The process of packing each patch in a point cloud can be considered as a process of continually updating the occupancy map of the point cloud.
  • Each time the occupancy map of a patch is mapped to a blank area of the point cloud occupancy map, the occupancy map is considered to have been updated once; after the occupancy map of the last patch is mapped to a blank area of the point cloud occupancy map, the final occupancy map of the point cloud is obtained. Therefore, performing S104B-1 can be considered as updating the initial occupancy map of the point cloud to an intermediate occupancy map (this step differs from the prior art), and performing S104B-2 can be considered as updating the intermediate occupancy map to the final occupancy map.
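The two phases of S104B can be sketched on a binary grid as below. This is an illustrative sketch under simplifying assumptions (a fixed-size canvas, pixel-level first fit); real packers work on block-aligned occupancy maps and enlarge the map when patches do not fit.

```python
def place(grid, w, h, u0, v0):
    """Mark a w*h patch occupancy at position (u0, v0); grid is a list of rows."""
    for v in range(v0, v0 + h):
        for u in range(u0, u0 + w):
            grid[v][u] = 1

def fits(grid, w, h, u0, v0):
    """True if a w*h patch fits entirely in blank cells at (u0, v0)."""
    if v0 + h > len(grid) or u0 + w > len(grid[0]):
        return False
    return all(grid[v][u] == 0
               for v in range(v0, v0 + h) for u in range(u0, u0 + w))

def pack_frame(width, height, global_patches, other_patches):
    """Phase 1 (S104B-1): global matching patches are mapped at the fixed
    positions inherited from the global occupancy map (initial ->
    intermediate occupancy map).  Phase 2 (S104B-2): the remaining patches
    are packed first-fit into the blank areas, in the style of FIG. 5
    (intermediate -> final occupancy map)."""
    grid = [[0] * width for _ in range(height)]
    for w, h, u0, v0 in global_patches:          # phase 1: fixed positions
        place(grid, w, h, u0, v0)
    placements = []
    for w, h in other_patches:                   # phase 2: first fit
        pos = next(((u, v) for v in range(height) for u in range(width)
                    if fits(grid, w, h, u, v)), None)
        if pos is not None:
            place(grid, w, h, *pos)
            placements.append(pos)
    return grid, placements

grid, placements = pack_frame(8, 4, [(4, 2, 0, 0)], [(2, 2), (4, 2)])
print(placements)  # first-fit positions found for the non-global patches
```

The key point the sketch illustrates: the global matching patches never move between frames (their positions come from the global occupancy map), while the non-global patches fill whatever space remains.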
  • This optional implementation differs from the packing method shown in FIG. 5 in the following way: the method shown in FIG. 5 uniformly sorts all patches in a frame of point cloud and then maps the occupancy map of each patch to the occupancy map of the point cloud in the obtained order, thereby obtaining the position of the occupancy map of each patch in the occupancy map of the point cloud.
  • The optional implementation, by contrast, first determines the positions of the occupancy maps of the global matching patches in the occupancy map of the point cloud and maps each global matching patch to the occupancy map accordingly; then it packs the non-global matching patches in order to obtain the position of each non-global matching patch in the occupancy map of the point cloud.
  • the above method further includes: determining whether to use a global occupancy map (that is, whether to use the above method provided in the embodiment of the present application) for packaging.
  • If yes, packing is performed with the method provided in the embodiments of the present application, which is likely to obtain a larger coding gain; otherwise, another packing method, such as the one shown in FIG. 5, is used.
  • Method 1 including the following steps A-1 to A-2:
  • Step A-1 Pre-pack the occupancy maps of a first part of the point clouds in the N frames of point cloud according to a pre-packing algorithm to obtain pre-occupancy maps of the first part of the point clouds; the pre-packing algorithm is an algorithm that packs without using the global occupancy map.
  • Step A-2 When the difference between the maximum size (such as height) of the pre-occupancy maps of the first part of the point clouds and the size (such as height) of the global occupancy map is within a preset range, determine that the method for packing each frame of point cloud in the N frames of point cloud is to pack each frame according to the global occupancy map; otherwise, determine that the method for packing each frame of point cloud in the N frames of point cloud is the pre-packing algorithm.
  • the first part of the point cloud is any one or more frames of the N-frame point cloud.
  • the pre-packaging algorithm refers to a packaging method not provided in the embodiments of the present application. For example, it may be a packaging method as shown in FIG. 5.
  • each frame of the multi-frame point cloud corresponds to a pre-occupancy map.
  • The maximum size of the pre-occupancy maps of the first part of the point clouds refers to the size of the largest pre-occupancy map (for example, the one with the maximum height) among the pre-occupancy maps corresponding to the multiple frames of point cloud.
  • the preset range may be determined according to the coding gain, and may specifically be an empirical value.
  • That the difference between the maximum size of the pre-occupancy maps of the first part of the point clouds and the size of the global occupancy map is within a preset range can be understood as: the maximum height of the pre-occupancy maps of the first part of the point clouds is comparable to the height of the global occupancy map.
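Step A-2's comparison can be sketched as follows; the function name, the sample heights, and the preset threshold are illustrative assumptions, not values mandated by the text.

```python
def use_global_packing(pre_heights, global_height, preset_range):
    """Method 1 / step A-2 sketch: pack with the global occupancy map only
    when the largest pre-occupancy-map height (from the pre-packing
    algorithm, which does not use the global map) is within `preset_range`
    of the global occupancy map's height."""
    return abs(max(pre_heights) - global_height) <= preset_range

# Heights of the pre-occupancy maps of the first part of the point clouds.
print(use_global_packing([640, 656], 672, preset_range=32))  # True
print(use_global_packing([512, 528], 672, preset_range=32))  # False
```

When the global map is only slightly taller than the best per-frame packing, the loss in packing density is small and the inter-prediction gain is expected to dominate; otherwise the pre-packing algorithm is kept.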
  • Method 2, including the following steps B-1 to B-3:
  • Step B-1 Pre-pack the occupancy maps of a first part of the point clouds in the N frames of point cloud according to the pre-packing algorithm to obtain pre-occupancy maps of the first part of the point clouds; the pre-packing algorithm is an algorithm that packs without using the global occupancy map.
  • Step B-2 Pre-pack, according to the global occupancy map, the occupancy maps of a second part of the point clouds in the N frames of point cloud to obtain pre-occupancy maps of the second part of the point clouds.
  • Step B-3 When the difference between the maximum size (such as height) of the pre-occupancy maps of the first part of the point clouds and the maximum size (such as height) of the pre-occupancy maps of the second part of the point clouds is within a preset range, determine that the method for packing each frame of point cloud in the N frames of point cloud is to pack each frame according to the global occupancy map; otherwise, determine that the method for packing each frame of point cloud in the N frames of point cloud is the pre-packing algorithm.
  • the first part of the point cloud is any one or more frames of the N-frame point cloud.
  • the pre-packaging algorithm refers to a packaging method not provided in the embodiments of the present application. For example, it may be a packaging method as shown in FIG. 5.
  • the second part of the point cloud can be any one or more frames of the N-frame point cloud.
  • the point cloud of the first part is the same as the point cloud of the second part, which helps to better compare the technical solutions provided by the embodiments of the present application and the prior art.
  • Method 3: Determine, according to a rate-distortion cost criterion, whether to use the global occupancy map to pack each frame of point cloud in the N frames of point cloud.
  • Specifically, pre-pack the occupancy maps of the patches in some or all of the N frames of point cloud according to the pre-packing algorithm to obtain first pre-occupancy maps of the some or all point clouds;
  • pre-pack the occupancy maps of the patches in the some or all point clouds according to the global occupancy map to obtain second pre-occupancy maps of the some or all point clouds;
  • when the rate-distortion cost corresponding to the second pre-occupancy maps is lower, determine that the method for packing each frame of point cloud in the N frames of point cloud is to pack each frame according to the global occupancy map; otherwise, determine that the method for packing each frame of point cloud in the N frames of point cloud is the pre-packing algorithm.
  • FIG. 13 it is a schematic flowchart of a packaging method according to an embodiment of the present application. This embodiment may be considered as a specific example of the packaging method provided in FIG. 6.
  • the method includes the following steps:
  • S201 Store the patches in each point cloud of the point cloud group in an array form.
  • The global matching patches in a point cloud are arranged before the non-global matching patches, that is, the array indices of the global matching patches are smaller than those of the non-global matching patches.
  • the specific implementation is not limited to this.
  • Assume that the number of global matching patches in each frame of a GOF is M; M is denoted GobalPatchCount in the following procedure.
  • S202 Calculate a union patch occupation map (unionPatch) corresponding to the M sets.
  • unionPatch For the description of the global matching patch contained in each set, please refer to the above.
  • S202 may include:
  • For the i-th frame of point cloud, traverse all of its global matching patches.
  • For the j-th global matching patch, compute the j-th union patch occupancy map.
  • the resolution of the occupancy map of the union patch can be 16 * 16.
  • The procedure for obtaining the j-th union patch occupancy map can be as follows:
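As a minimal sketch (assuming binary occupancy maps already aligned to a common origin; real implementations work at the block resolution mentioned above, e.g. 16×16), the j-th union patch occupancy map is the element-wise OR of the occupancy maps of the N global matching patches in the j-th set:

```python
def union_patch_occupancy(maps):
    """maps: binary occupancy maps (lists of rows) of the N global matching
    patches of one set; sizes may differ across frames, so the union is
    taken on a canvas large enough for all of them."""
    height = max(len(m) for m in maps)
    width = max(len(m[0]) for m in maps)
    out = [[0] * width for _ in range(height)]
    for m in maps:
        for v, row in enumerate(m):
            for u, bit in enumerate(row):
                out[v][u] |= bit  # union = logical OR of occupied cells
    return out

a = [[1, 1, 0],
     [1, 0, 0]]
b = [[0, 1, 1],
     [0, 0, 1]]
u = union_patch_occupancy([a, b])
print(u)  # [[1, 1, 1], [1, 0, 1]]
```

Because the union covers the occupancy of every patch in the set, reserving its footprint in the global occupancy map guarantees that the per-frame patches fit at the same position in every frame.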
  • S203 Pack the M union patch occupancy maps of the point cloud group to obtain a global occupancy map (occupancyMap).
  • In this way, the position coordinates of each union patch occupancy map in the global occupancy map can be determined, that is, the values of u0 and v0.
  • S204 Determine whether the difference between the maximum height of some or all frames in the point cloud group and the height of the global occupancy map is within a preset range. If yes, the maximum frame height in the point cloud group is comparable to the height of the global occupancy map, and step S205 is performed. If not, S206 is performed.
  • S204 specifically includes the following steps:
  • S204B Pack the global matching patches in some or all of the frames in the point cloud group according to the method shown in FIG. 5, and obtain the height frame[i].height of the occupancy map of each such point cloud.
  • the calculation method can be:
  • frame[i].Patch[j] represents the j-th patch in the i-th frame.
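A plausible reading of the height computation (the field names `v0` and `height` follow the frame[i].Patch[j] notation above, but the exact patch metadata layout is an assumption):

```python
def occupancy_map_height(patches):
    """Height of a packed occupancy map: the lowest row reached by any
    patch, i.e. the maximum over j of (v0 + height) of frame[i].Patch[j]."""
    return max(p["v0"] + p["height"] for p in patches)

patches = [{"v0": 0, "height": 16}, {"v0": 16, "height": 32}]
print(occupancy_map_height(patches))  # 48
```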
  • Optionally, the global matching patches in step S204B may also be replaced with non-global matching patches, or with both global matching patches and non-global matching patches.
  • the part or all of the frames may be 1 frame, for example, the 0th, ... or N-1th frame in the point cloud group.
  • the part or all of the frames may be 2 frames, for example, the 0th frame and the 1st frame or the N-2th frame and the N-1th frame in the point cloud group.
  • the embodiments of the present application are not limited to this.
  • S205 Use the global occupancy map as the initial occupancy map of each frame of point cloud in the point cloud group, and pack the occupancy maps of all patches of each frame of point cloud. For each frame of point cloud of the point cloud group, perform the following steps S205A to S205B on the i-th frame of point cloud:
  • S205A For the occupancy maps of the first globalPatchCount global matching patches in the patch array of the i-th frame of point cloud, determine their positions in the occupancy map of the i-th frame as follows:
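S205A can be sketched as copying, for each of the first globalPatchCount patches, the (u0, v0) position of the corresponding union patch occupancy map in the global occupancy map (the dict layout of a patch is an illustrative assumption):

```python
def assign_global_positions(frame_patches, union_positions, global_patch_count):
    """The m-th global matching patch of the i-th frame inherits the
    position (u0, v0) of the m-th union patch occupancy map in the
    global occupancy map."""
    for j in range(global_patch_count):
        frame_patches[j]["u0"], frame_patches[j]["v0"] = union_positions[j]
    return frame_patches

patches = [{"u0": None, "v0": None}, {"u0": None, "v0": None},
           {"u0": None, "v0": None}]
union_positions = [(0, 0), (32, 0)]  # from packing the union patches (S203)
assign_global_positions(patches, union_positions, global_patch_count=2)
print(patches[0], patches[2])  # the third patch stays unplaced (see S205B)
```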
  • S205B Pack the remaining patches in the patch array of the i-th frame of point cloud in sequence into the occupancy map of the i-th frame of point cloud using the packing method shown in FIG. 5.
  • the procedure for updating the occupancy map of the i-th point cloud can be as follows:
  • S206 Pack the patch occupancy maps of each frame of point cloud in the point cloud group according to the method shown in FIG. 5.
  • the point cloud group can be a GOF.
  • one GOF may be divided into K sub-GOFs, that is, the point cloud group may specifically be one sub-GOF.
  • K is an integer greater than or equal to 2
  • K is 2, 4, 8, 16, ..., N / 2.
  • In this case, packing is performed with one sub-GOF as the unit.
  • the encoder may be divided into functional modules according to the foregoing method example.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules may be implemented in the form of hardware or software functional modules. It should be noted that the division of the modules in the embodiments of the present application is schematic, and is only a logical function division. In actual implementation, there may be another division manner.
  • FIG. 14 is a schematic block diagram of an encoder 140 according to an embodiment of the present application.
  • the encoder 140 may include a packing module 1401 and an encoding module 1402.
  • the packaging module 1401 may correspond to the packaging module 102 in FIG. 2.
  • the occupation map of the point cloud generated by the packaging module 1401 is sent to the depth map generation module 103, the texture map generation module 104, and the second filling module 111.
  • That is, the module that generates the occupancy map of the point cloud cooperates with the modules connected to or communicating with it to complete the encoding function: the N frames of point cloud are encoded according to the occupancy maps of the N frames of point cloud.
  • The packing module 1401 is configured to perform the following steps: obtain the global matching patches in each frame of point cloud of a point cloud group, where the point cloud group includes N frames of point cloud, N ≥ 2, and N is an integer; determine M union patch occupancy maps corresponding to M sets, where each of the M sets includes N global matching patches, the N global matching patches are patches that have a matching relationship across the N frames of point cloud, the union patch occupancy map corresponding to the m-th set in the M sets is the union of the occupancy maps of the global matching patches in the m-th set, 1 ≤ m ≤ M, and m and M are integers; pack the M union patch occupancy maps to obtain a global occupancy map, where the global occupancy map is used to determine the positions of the M union patch occupancy maps in the global occupancy map; and pack each frame of point cloud in the N frames of point cloud to obtain the occupancy maps of the N frames of point cloud, where the position of the occupancy map of the m-th global matching patch in the n-th frame of point cloud in the occupancy map of the n-th frame of point cloud corresponds to the position of the m-th union patch occupancy map in the global occupancy map, 1 ≤ n ≤ N, and n is an integer.
  • the encoding module 1402 is configured to encode the N-frame point cloud according to the occupancy map of the N-frame point cloud.
  • the packaging module 1401 may be used to execute S101-S104, and the encoding module 1402 may be used to execute S105.
  • Optionally, the packing module 1401 is specifically configured to: determine, based on the position of the m-th union patch occupancy map in the global occupancy map, the position of the occupancy map of the m-th global matching patch in the n-th frame of point cloud in the occupancy map of the n-th frame of point cloud; and pack, based on the positions of the occupancy maps of some or all global matching patches in the n-th frame of point cloud in the occupancy map of the n-th frame of point cloud, the occupancy maps of the global matching patches and the non-global matching patches to obtain the occupancy map of the n-th frame of point cloud; where the some or all global matching patches include the m-th global matching patch.
  • Optionally, the packing module 1401 is specifically configured to: map, based on the positions of the occupancy maps of the some or all global matching patches in the occupancy map of the n-th frame of point cloud, the occupancy maps of the some or all global matching patches to the initial occupancy map of the n-th frame of point cloud; and pack, based on the initial occupancy map of the n-th frame of point cloud to which the occupancy maps of the some or all global matching patches have been mapped, the occupancy maps of the other patches in the n-th frame of point cloud except the some or all global matching patches, to obtain the occupancy map of the n-th frame of point cloud.
  • Optionally, the occupancy map of a non-global matching patch in the n-th frame of point cloud may occupy the preset position range, or may not occupy the preset position range, where the preset position range refers to the position range, in the occupancy map of the n-th frame of point cloud, that corresponds to the region occupied by the union patch occupancy maps in the global occupancy map.
  • Optionally, the packing module 1401 is specifically configured to: when the difference between the maximum size of the pre-occupancy maps of a first part of the point clouds in the N frames of point cloud and the size of the global occupancy map is within a preset range, pack each frame of point cloud in the N frames of point cloud according to the global occupancy map.
  • the pre-occupation map of the first part of the point cloud please refer to the above.
  • Optionally, the packing module 1401 is specifically configured to: when the difference between the maximum size of the pre-occupancy maps of a first part of the point clouds in the N frames of point cloud and the maximum size of the pre-occupancy maps of a second part of the point clouds in the N frames of point cloud is within a preset range, pack each frame of point cloud in the N frames of point cloud according to the global occupancy map.
  • the pre-occupation map of the first part of the point cloud and the pre-occupation map of the second part of the point cloud please refer to the above.
  • the point cloud group is a frame group GOF; or, the point cloud group is a sub-GOF, and the sub-GOF includes a time-continuous multi-frame point cloud in a GOF;
  • the packing module 1401 is specifically configured to: obtain a reference point cloud, the reference point cloud being any frame of point cloud in the GOF; and for the i-th patch in the reference point cloud, if each non-reference point cloud in the GOF contains a patch matching the target patch, determine that the i-th patch and the patches matching the i-th patch are global matching patches; where the target patch is the i-th patch or a matching patch of the i-th patch, and the i-th patch is any patch in the reference point cloud.
  • the reference point cloud is the first frame point cloud in the GOF.
  • Optionally, the reference point cloud is the first frame of point cloud in the sub-GOF, or any frame of point cloud in the previous sub-GOF of the sub-GOF.
  • the reference point cloud is the last frame point cloud in the previous sub-GOF of the sub-GOF.
  • the last frame point cloud is a point cloud obtained after performing a packaging operation.
  • Each module in the encoder of the embodiments of the present application is a functional body that implements the steps of the point cloud encoding method corresponding to the present application; that is, the modules are the bodies that implement each step of the point cloud encoding method as well as the extensions and variants of these steps. For brevity, details are not repeated herein.
  • FIG. 15 is a schematic block diagram of an implementation manner of an encoding device 150 used in an embodiment of the present application.
  • the encoding device 150 may include a processor 1510, a memory 1530, and a bus system 1550.
  • the processor 1510 and the memory 1530 are connected through a bus system 1550.
  • the memory 1530 is configured to store instructions.
  • The processor 1510 is configured to execute the instructions stored in the memory 1530 to perform the various point cloud encoding or decoding methods described in this application. To avoid repetition, details are not described here.
  • The processor 1510 may be a central processing unit (CPU), or another general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 1530 may include a ROM device or a RAM device. Any other suitable type of storage device may also be used as the memory 1530.
  • the memory 1530 may include code and data 1531 accessed by the processor 1510 using the bus 1550.
  • The memory 1530 may further include an operating system 1533 and an application program 1535, where the application program 1535 includes at least one program that allows the processor 1510 to perform the point cloud encoding or decoding method described in this application.
  • the application program 1535 may include applications 1 to N, which further includes a point cloud encoding or decoding application (referred to as a point cloud decoding application) that executes the point cloud encoding or decoding method described in this application.
  • the bus system 1550 may include a power bus, a control bus, and a status signal bus in addition to a data bus. However, for the sake of clarity, various buses are marked as the bus system 1550 in the figure.
  • the encoding device 150 may further include one or more output devices, such as a display 1570.
  • The display 1570 may be a tactile display that combines a display with a tactile unit operatively sensing touch input.
  • the display 1570 may be connected to the processor 1510 via a bus 1550.
  • Computer-readable media may include computer-readable storage media, which corresponds to tangible media, such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another (e.g., according to a communication protocol) .
  • computer-readable media may generally correspond to (1) tangible computer-readable storage media that is non-transitory, or (2) a communication medium such as a signal or carrier wave.
  • a data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and / or data structures used to implement the techniques described in this application.
  • the computer program product may include a computer-readable medium.
  • such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage devices, magnetic disk storage devices or other magnetic storage devices, flash memory, or may be used to store instructions or data structures Any form of desired program code and any other medium accessible by a computer.
  • any connection is properly termed a computer-readable medium.
  • For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or the wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • the computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other temporary media, but are instead directed to non-transitory tangible storage media.
  • As used herein, disks and discs include compact discs (CDs), laser discs, optical discs, DVDs, and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • The instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
  • Additionally, in some aspects, the functions described with reference to the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
  • the techniques can be fully implemented in one or more circuits or logic elements.
  • various illustrative logical blocks, units, and modules in the encoder 100 and the decoder 200 can be understood as corresponding circuit devices or logic elements.
  • The techniques of this application may be implemented in a wide variety of apparatuses or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset).
  • Various components, modules, or units are described in this application to emphasize functional aspects of the apparatus for performing the disclosed techniques, but do not necessarily need to be implemented by different hardware units.
  • the various units may be combined in a codec hardware unit in conjunction with suitable software and/or firmware, or provided by a collection of interoperable hardware units (including one or more processors as described above).

Abstract

本申请公开了点云编码方法和编码器,用以提高编码或压缩效率。该方法包括:获取点云组的N帧点云的全局匹配patch;确定与M个集合对应的M个并集patch占用图;每个集合包括N个全局匹配patch,N个全局匹配patch是跨N帧点云中具有匹配关系的patch;第m个集合对应的并集patch占用图为第m个集合中的各全局匹配patch的占用图的并集;将M个并集patch占用图进行打包,得到全局占用图,全局占用图用于确定M个并集patch占用图在全局占用图中的位置;对N帧点云中的每帧点云进行打包,得到N帧点云的占用图;第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置,对应于第m个并集patch占用图在全局占用图中的位置;根据N帧点云的占用图对N帧点云进行编码。

Description

点云编码方法和编码器
本申请要求于2018年09月19日提交国家知识产权局、申请号为201811120300.8、申请名称为“点云编码方法和编码器”的中国专利申请的优先权,以及,2018年09月25日提交国家知识产权局、申请号为201811121017.7、申请名称为“点云编码方法和编码器”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及编解码技术领域,尤其涉及点云(point cloud)编解码方法和编码器。
背景技术
随着3d传感器(例如3d扫描仪)技术的不断发展,采集点云数据越来越便捷,所采集的点云数据的规模也越来越大,因此,如何有效地对点云数据进行编码,成为迫切需要解决的问题。
发明内容
本申请实施例提供了点云编码方法和编码器,有助于提高编码或者压缩效率。
第一方面,提供了一种点云编码方法,包括:获取点云组的N帧点云的全局匹配patch;该点云组包括该N帧点云;N≥2,N是整数;确定与M个集合对应的M个并集patch占用图;该M个集合中的每个集合包括N个全局匹配patch,该N个全局匹配patch是跨该N帧点云中具有匹配关系的patch;该M个集合中的第m个集合对应的并集patch占用图为第m个集合中的各全局匹配patch的占用图的并集;1≤m≤M,m和M均是整数;对该M个并集patch占用图进行打包,得到全局占用图;全局占用图用于确定该M个并集patch占用图在全局占用图中的位置;对该N帧点云中的每帧点云进行打包,得到该N帧点云的占用图;其中,第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置(即第一位置),对应于该M个并集patch占用图中的第m个并集patch占用图在全局占用图中的位置(即第二位置);1≤n≤N,n是整数;根据该N帧点云的占用图对该N帧点云进行编码。
其中,一帧点云中的全局匹配patch,是指该帧点云中的且在点云组的除该帧点云之外的各帧点云中均能找到具有匹配关系的patch的patch。
其中,与一个patch具有匹配关系的patch是目标patch的匹配patch,其中,目标patch是该patch或者目标patch是该patch的匹配patch。
应当理解的是,具有匹配关系的patch是指两个patch在三维空间上具有相似的空间位置和/或形状。确定两个patch是否具有匹配关系的方法在说明书中不做限定。作为一个示例,可以将各patch按照相同的投影平面投影到二维空间,在二维空间中计算目标patch与其他可能或潜在具有匹配关系的patch的IOU(交并比),所有IOU中取值最大、且该最大IOU超过一定阈值的patch即为目标patch的匹配patch。作为另一个示例,也可以直接在三维空间中计算目标patch与其他可能或潜在具有匹配关系的patch的IOU,所有IOU中取值最大、且该最大IOU超过一定阈值的patch即为目标patch的匹配patch。本申请实施例中,当然也可以采用其他寻找具有匹配关系的patch的方法,这里不作限定。
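上述基于IOU的匹配过程可以用如下Python草图示意(仅为简化假设下的示意性实现,并非本申请的规范实现;函数名occupancy_iou、find_matched_patch以及阈值取值0.2均为说明性假设,占用图以被占像素坐标集合表示):

```python
def occupancy_iou(map_a, map_b):
    """计算两个二维占用图(以被占像素坐标集合表示)的交并比(IOU)。"""
    inter = len(map_a & map_b)
    union = len(map_a | map_b)
    return inter / union if union else 0.0

def find_matched_patch(target, candidates, threshold=0.2):
    """在候选patch中选取与target的IOU最大、且该最大IOU超过阈值的patch;
    找不到时返回None。threshold取值为示例性假设。"""
    best, best_iou = None, 0.0
    for idx, cand in enumerate(candidates):
        iou = occupancy_iou(target, cand)
        if iou > best_iou:
            best, best_iou = idx, iou
    return best if best_iou >= threshold else None
```

例如,候选patch与目标patch重叠一半时,其IOU为0.5,若超过阈值即被选为匹配patch。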
其中,每帧点云中的全局匹配patch的个数相等,该个数是M,M是正整数。
其中,一个集合包括的N个全局匹配patch是跨N帧点云中具有匹配关系的patch,可以理解为:该N个全局匹配patch中的每个全局匹配patch来自于(或所属)一帧点云,且不同全局匹配patch来自于不同点云,且该N个全局匹配patch具有匹配关系。该N个全局匹配patch分别来自于N帧点云。
其中,第一位置对应于第二位置,可以理解为:第一位置的坐标值与第二位置的坐标值相同;或者,第一位置的坐标值在其坐标系下与第二位置的坐标值在其坐标系下实质性相同;或者,第一位置的坐标值与第二位置的坐标值不相同,但第二位置所在的位置范围覆盖第一位置所在的位置范围。
本技术方案中,通过统计一个点云组内的全局匹配patch,并在打包的过程中为该点云组的N帧点云中的具有匹配关系的全局匹配patch分配相同(或相似)的位置,使得所产生的每一帧点云的占用图中具有匹配关系的全局匹配patch在空间上具有一致性。也就是说,本技术方案在考虑了不同点云间的时间和空间上的相关性的基础上,使得不同点云中具有匹配关系的patch在空间上具有一致性。由于点云的占用图可以用于指导该点云的深度图和纹理图的生成,并且对点云的深度图和纹理图的编码技术是视频/图像编码技术,而视频/图像编码技术中经码流传输的通常是帧间的差异数据,因此,不同点云中具有匹配关系的patch在空间上具有一致性,有助于提高编码或者压缩效率,以及节省码流传输开销。
在一种可能的设计中,对该N帧点云中的每帧点云进行打包,得到该N帧点云的占用图,包括:基于第m个并集patch占用图在全局占用图中的位置,确定第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置;基于第n帧点云中的部分或全部全局匹配patch的占用图在第n帧点云的占用图中的位置,对第n帧点云中的全局匹配patch和非全局匹配patch的占用图进行打包,得到第n帧点云的占用图;其中,部分或全部全局匹配patch包括第m个全局匹配patch。
在一种可能的设计中,基于第n帧点云中的部分或全部全局匹配patch的占用图在第n帧点云的占用图中的位置,对第n帧点云中的全局匹配patch和非全局匹配patch的占用图进行打包,得到第n帧点云的占用图,包括:基于部分或全部全局匹配patch的占用图在第n帧点云的占用图中的位置,对部分或全部全局匹配patch的占用图映射到第n帧点云的初始占用图中;基于映射有部分或全部patch的占用图的第n帧点云的初始占用图,对第n帧点云中的除部分或全部全局匹配patch之外的其他patch的占用图进行打包,得到第n帧点云的占用图。也就是说,先映射全局匹配patch,再打包非全局匹配patch。
在一种可能的设计中,第n帧点云中的非全局匹配patch的占用图占用预设位置范围。预设位置范围是指属于第n帧点云中的全局匹配patch的占用图对应的并集patch占用图,且不属于第n帧点云中的全局匹配patch的占用图的位置范围。这样,能够使点云的占用图中的各patch的占用图排列更紧实(或密实),从而使得点云的占用图的尺寸较小。
在一种可能的设计中,第n帧点云中的非全局匹配patch的占用图不占用预设位置范围。预设位置范围是指属于第n帧点云中的全局匹配patch所属的集合对应的并集patch占用图,且不属于第n帧点云中的全局匹配patch的占用图的位置范围。这样,实现较简单。
在一种可能的设计中,对该N帧点云中的每帧点云进行打包,包括:当该N帧点云中的第一部分点云的预占用图的最大尺寸与全局占用图的尺寸之差在预设范围之内时,根据全局占用图对该N帧点云中的每帧点云进行打包。可选的,第一部分点云的预占用图为根据预打包算法对该第一部分点云中的patch的占用图进行预打包而得到的;预打包算法是不采用全局占用图进行打包的算法。
在一种可能的设计中,对该N帧点云中的每帧点云进行打包,包括:根据预打包算法对该N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法;当第一部分点云的预占用图的最大尺寸与全局占用图的尺寸之差在预设范围之内时,根据全局占用图,对该N帧点云中的每帧点云进行打包。这样,有助于获得更大的编码增益。
在一种可能的设计中,对该N帧点云中的每帧点云进行打包,包括:当该N帧点云中的第一部分点云的预占用图的最大尺寸与该N帧点云中的第二部分点云的预占用图的最大尺寸之差在预设范围之内时,根据全局占用图对该N帧点云中的每帧点云进行打包。可选的,第一部分点云的预占用图为根据预打包算法对N帧点云中的第一部分点云中的patch的占用图进行预打包而得到的;预打包算法是不采用全局占用图进行打包的算法;第二部分点云的预占用图为根据全局占用图对N帧点云中的第二部分点云中的patch的占用图进行预打包得到的。这样,有助于获得更大的编码增益。
在一种可能的设计中,对该N帧点云中的每帧点云进行打包,包括:根据预打包算法对该N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法;根据全局占用图,对该N帧点云中的第二部分点云中的patch的占用图进行预打包,得到第二部分点云的预占用图;当第一部分点云的预占用图的最大尺寸与第二部分点云的预占用图的最大尺寸之差在预设范围之内时,根据全局占用图,对该N帧点云中的每帧点云进行打包。这样,有助于获得更大的编码增益。
在上述涉及到预设范围的技术方案中,该预设范围可以依据编码增益来确定,具体可以是一个经验值。当然本申请实施例不限于此。
在一种可能的设计中,该点云组是一个帧组(group of frame,GOF)。一个GOF包括的点云的帧数是可配置的,例如一个GOF包括32帧点云。
在一种可能的设计中,该点云组是一个子GOF,子GOF包括一个GOF中的时间上连续的多帧点云。一个GOF可以被分为至少两个子GOF,不同子GOF包括的点云的帧数可以相等,也可以不相等。
在一种可能的设计中,获取点云组的各帧点云中的全局匹配patch,包括:获取参考点云;参考点云是GOF中的任意一帧点云;对于参考点云中的第i个patch,如果在GOF中的各非参考点云中均存在与目标patch相匹配的patch,则确定第i个patch和与目标patch相匹配的patch是全局匹配patch;其中,目标patch是第i个patch或目标patch是第i个patch的匹配patch,第i个patch是参考点云中的任意一个patch。可以理解的,点云组中的参考点云中patch的顺序决定了该点云组中全局匹配patch的顺序。
在一种可能的设计中,当该点云组是GOF时,参考点云是GOF中的第一帧点云。这样,可以保证一个GOF中全局匹配patch都与第一帧中全局匹配patch的顺序一致,可以为后续视频编码带来一定增益。
在一种可能的设计中,当该点云组是子GOF时,参考点云是子GOF中的第一帧点云或者子GOF的前一个子GOF中的任意一帧点云。
在一种可能的设计中,参考点云是子GOF的前一个子GOF中的最后一帧点云。
在一种可能的设计中,该最后一帧点云是执行打包操作后得到的点云。这样有助于保证相邻两个子GOF的全局匹配patch在时间上的连续性,从而提高编码或者压缩效率。
第二方面,提供一种编码器,包括:打包模块和编码模块。打包模块用于执行以下步骤:获取点云组的N帧点云的全局匹配patch;该点云组包括该N帧点云;N≥2,N是整数;确定与M个集合对应的M个并集patch占用图;该M个集合中的每个集合包括N个全局匹配patch,该N个全局匹配patch是跨该N帧点云中具有匹配关系的patch;该M个集合中的第m个集合对应的并集patch占用图为第m个集合中的各全局匹配patch的占用图的并集;1≤m≤M,m和M均是整数;对该M个并集patch占用图进行打包,得到全局占用图;全局占用图用于确定该M个并集patch占用图在全局占用图中的位置;对该N帧点云中的每帧点云进行打包,得到该N帧点云的占用图;其中,第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置,对应于该M个并集patch占用图中的第m个并集patch占用图在全局占用图中的位置,1≤n≤N,n是整数。编码模块用于根据该N帧点云的占用图对该N帧点云进行编码。编码模块可以通过如图2中描述的模块103~模块112中的部分或全部实现。
关于打包模块所执行的步骤的具体实现方式或相关内容的解释等均可以参考上述第一方面,此处不再赘述。
第三方面,提供了一种点云编码方法,包括:根据预打包算法对N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法;当第一部分点云的预占用图的最大尺寸与全局占用图的尺寸之差在预设范围之内时,确定打包方法是根据全局占用图对N帧点云中的每帧点云进行打包。否则,确定打包方法是该预打包算法。然后,根据打包得到的N帧点云的占用图对该N帧点云进行编码。其中,关于全局占用图的描述可以参考上文。
第四方面,提供了一种点云编码方法,包括:根据预打包算法对N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法;根据全局占用图,对N帧点云中的第二部分点云中的patch的占用图进行预打包,得到第二部分点云的预占用图;当第一部分点云的预占用图的最大尺寸与第二部分点云的预占用图的最大尺寸之差在预设范围之内时,确定打包方法是根据全局占用图对N帧点云中的每帧点云进行打包。否则,确定打包方法是该预打包算法。然后,根据打包得到的N帧点云的占用图对该N帧点云进行编码。其中,关于全局占用图的描述可以参考上文。
第五方面,提供一种编码器,包括:打包模块和编码模块。其中,打包模块用于根据预打包算法对N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法;当第一部分点云的预占用图的最大尺寸与全局占用图的尺寸之差在预设范围之内时,确定打包方法是根据全局占用图对N帧点云中的每帧点云进行打包。否则,确定打包方法是该预打包算法。编码模块用于根据打包得到的N帧点云的占用图对该N帧点云进行编码。
第六方面,提供一种编码器,包括:打包模块和编码模块。其中,打包模块用于根据预打包算法对N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法;根据全局占用图,对N帧点云中的第二部分点云中的patch的占用图进行预打包,得到第二部分点云的预占用图;当第一部分点云的预占用图的最大尺寸与第二部分点云的预占用图的最大尺寸之差在预设范围之内时,确定打包方法是根据全局占用图对N帧点云中的每帧点云进行打包。否则确定打包方法是该预打包算法。编码模块用于根据打包得到的N帧点云的占用图对该N帧点云进行编码。
第七方面,提供了一种点云编码方法,包括:获取点云组的N帧点云的全局匹配patch;确定与M个集合对应的M个并集patch占用图;所述M个集合中的每个集合包括N个全局匹配patch,所述N个全局匹配patch是跨所述N帧点云中具有匹配关系的patch;所述M个集合中的第m个集合对应的并集patch占用图为所述第m个集合中的各全局匹配patch的占用图的并集;1≤m≤M,m和M均是整数;对所述M个并集patch占用图进行打包,得到全局占用图;所述全局占用图用于确定所述M个并集patch占用图在所述全局占用图中的位置;利用全局占用图对所述N帧点云中的每帧点云中的全局匹配patch和非全局匹配patch的占用图进行打包,得到所述N帧点云的占用图;根据所述N帧点云的占用图对所述N帧点云进行编码。本技术方案中,通过统计一个点云组内的全局匹配patch,并基于全局匹配patch得到全局占用图,并对该点云组中的每帧点云中的全局匹配patch和非全局匹配patch的占用图进行打包。这为实现“不同点云间的时间和空间上的相关性的基础上,使得不同点云中具有匹配关系的patch在空间上具有一致性”创造了条件,因此,有助于提高编码或者压缩效率,以及节省码流传输开销。
第八方面,提供一种编码器,包括:打包模块和编码模块。其中,打包模块用于执行以下步骤:获取点云组的N帧点云的全局匹配patch;确定与M个集合对应的M个并集patch占用图;所述M个集合中的每个集合包括N个全局匹配patch,所述N个全局匹配patch是跨所述N帧点云中具有匹配关系的patch;所述M个集合中的第m个集合对应的并集patch占用图为所述第m个集合中的各全局匹配patch的占用图的并集;1≤m≤M,m和M均是整数;对所述M个并集patch占用图进行打包,得到全局占用图;所述全局占用图用于确定所述M个并集patch占用图在所述全局占用图中的位置;利用全局占用图对所述N帧点云中的每帧点云中的全局匹配patch和非全局匹配patch的占用图进行打包,得到所述N帧点云的占用图。编码模块用于根据该N帧点云的占用图对该N帧点云进行编码。编码模块可以通过如图2中描述的模块103~模块112中的部分或全部实现。
基于上述第七或第八方面,在一种可能的设计中,所述N帧点云中的第n帧点云中的第m个全局匹配patch的占用图在所述第n帧点云的占用图中的位置,对应于所述M个并集patch占用图中的第m个并集patch占用图在所述全局占用图中的位置;1≤n≤N,n是整数。该技术方案可以实现“不同点云间的时间和空间上的相关性的基础上,使得不同点云中具有匹配关系的patch在空间上具有一致性”,从而有助于提高编码或者压缩效率,以及节省码流传输开销。
上述第七或第八方面及其可能的实现方式中相关术语或步骤的具体实现方式等均可以参考上述第一或第二方面或其可能的实现方式,此处不再赘述。
第九方面,提供一种用于编码点云数据的设备,该设备可以包括:
存储器,用于存储点云数据。
编码器,用于执行上述第一方面、第三方面、第四方面或第七方面的任一种点云编码方法。
第十方面,提供一种编码设备,包括:相互耦合的非易失性存储器和处理器,处理器调用存储在存储器中的程序代码以执行第一方面、第三方面、第四方面或第七方面的任一种方法的部分或全部步骤。
第十一方面,提供一种编码装置,该装置包括存储器和处理器。所述存储器用于存储程序代码;所述处理器用于调用所述程序代码,以执行第一方面、第三方面、第四方面或第七方面的任一种点云编码方法。
第十二方面,提供一种计算机可读存储介质,该计算机可读存储介质存储了程序代码,当该程序代码在计算机上运行时,使得该计算机执行第一方面、第三方面、第四方面或第七方面的任一种方法的部分或全部步骤。
第十三方面,提供一种计算机程序产品,当该计算机程序产品在计算机上运行时,使得该计算机执行第一方面、第三方面、第四方面或第七方面的任一种方法的部分或全部步骤。
应当理解的是,上述提供的相关装置/设备/计算机可读存储介质/计算机程序产品的有益效果均可以对应参考对应方面提供的方法实施例的有益效果,此处不再赘述。
附图说明
图1为可用于本申请实施例的一种实例的点云译码系统的框图;
图2为可用于本申请实施例的一种实例的编码器的示意性框图;
图3为可适用于本申请实施例的一种点云、点云的patch以及点云的占用图的示意图;
图4为可用于本申请实施例的一种实例的解码器的示意性框图;
图5为MPEG点云编码技术中提供的一种打包方法的流程示意图;
图6为本申请实施例提供的一种点云编码方法的流程示意图;
图7为可适用于本申请一实施例的两个patch的占用图的并集的示意图;
图8为本申请实施例提供的第一位置与第二位置之间的对应关系的示意图;
图9为本申请实施例提供的基于图5与图6所示的方法得到点云的占用图的对比示意图;
图10和图11为本申请实施例基于图6提供的打包方法的过程示意图;
图12为本申请实施例提供的获得的相邻两个子GOF中的各点云的占用图的示意图;
图13为本申请实施例提供的另一种点云编码方法的流程示意图;
图14为本申请实施例提供的一种编码器的示意性框图;
图15为用于本申请实施例的编码设备的一种实现方式的示意性框图。
具体实施方式
本申请实施例中的术语“至少一个(种)”包括一个(种)或多个(种)。“多个(种)”是指两个(种)或两个(种)以上。例如,A、B和C中的至少一种,包括:单独存在A、单独存在B、同时存在A和B、同时存在A和C、同时存在B和C,以及同时存在A、B和C。本申请实施例中的术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。本申请实施例中的术语字符“/”,一般表示前后关联对象是一种“或”的关系。在公式中,字符“/”表示除法运算,如A/B表示A除以B。本申请实施例中的术语“第一”、“第二”等是为了区分不同的对象,并不限定该不同对象的顺序。
图1为可用于本申请实施例的一种实例的点云译码系统1的示意性框图。术语“点云译码”或“译码”可一般地指代点云编码或点云解码。点云译码系统1的编码器100可以根据本申请提出的任一种点云编码方法对待编码点云进行编码。点云译码系统1的解码器200可以根据本申请提出的与编码器使用的点云编码方法相对应的点云解码方法对待解码点云进行解码。
如图1所示,点云译码系统1包含源装置10和目的地装置20。源装置10产生经编码点云数据。因此,源装置10可被称为点云编码装置。目的地装置20可对由源装置10所产生的经编码的点云数据进行解码。因此,目的地装置20可被称为点云解码装置。源装置10、目的地装置20或两个的各种实施方案可包含一或多个处理器以及耦合到所述一或多个处理器的存储器。所述存储器可包含但不限于随机存取存储器(random access memory,RAM)、只读存储器(read-only memory,ROM)、带电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、快闪存储器或可用于以可由计算机存取的指令或数据结构的形式存储所要的程序代码的任何其它媒体,如本文所描述。
源装置10和目的地装置20可以包括各种装置,包含桌上型计算机、移动计算装置、笔记型(例如,膝上型)计算机、平板计算机、机顶盒、例如所谓的“智能”电话等电话手持机、电视机、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机或其类似者。
目的地装置20可经由链路30从源装置10接收经编码点云数据。链路30可包括能够将经编码点云数据从源装置10移动到目的地装置20的一或多个媒体或装置。在 一个实例中,链路30可包括使得源装置10能够实时将经编码点云数据直接发送到目的地装置20的一或多个通信媒体。在此实例中,源装置10可根据通信标准(例如无线通信协议)来调制经编码点云数据,且可将经调制的点云数据发送到目的地装置20。所述一或多个通信媒体可包含无线和/或有线通信媒体,例如射频(radio frequency,RF)频谱或一或多个物理传输线。所述一或多个通信媒体可形成基于分组的网络的一部分,基于分组的网络例如为局域网、广域网或全球网络(例如,因特网)。所述一或多个通信媒体可包含路由器、交换器、基站或促进从源装置10到目的地装置20的通信的其它设备。
在另一实例中,可将经编码数据从输出接口140输出到存储装置40。类似地,可通过输入接口240从存储装置40存取经编码点云数据。存储装置40可包含多种分布式或本地存取的数据存储媒体中的任一者,例如硬盘驱动器、蓝光光盘、数字多功能光盘(digital versatile disc,DVD)、只读光盘(compact disc read-only memory,CD-ROM)、快闪存储器、易失性或非易失性存储器,或用于存储经编码点云数据的任何其它合适的数字存储媒体。
在另一实例中,存储装置40可对应于文件服务器或可保持由源装置10产生的经编码点云数据的另一中间存储装置。目的地装置20可经由流式传输或下载从存储装置40存取所存储的点云数据。文件服务器可为任何类型的能够存储经编码的点云数据并且将经编码的点云数据发送到目的地装置20的服务器。实例文件服务器包含网络服务器(例如,用于网站)、文件传输协议(file transfer protocol,FTP)服务器、网络附属存储(network attached storage,NAS)装置或本地磁盘驱动器。目的地装置20可通过任何标准数据连接(包含因特网连接)来存取经编码点云数据。这可包含无线信道(例如,Wi-Fi连接)、有线连接(例如,数字用户线路(digital subscriber line,DSL)、电缆调制解调器等),或适合于存取存储在文件服务器上的经编码点云数据的两者的组合。经编码点云数据从存储装置40的传输可为流式传输、下载传输或两者的组合。
图1中所说明的点云译码系统1仅为实例,并且本申请的技术可适用于未必包含点云编码装置与点云解码装置之间的任何数据通信的点云译码(例如,点云编码或点云解码)装置。在其它实例中,数据从本地存储器检索、在网络上流式传输等等。点云编码装置可对数据进行编码并且将数据存储到存储器,和/或点云解码装置可从存储器检索数据并且对数据进行解码。在许多实例中,由并不彼此通信而是仅编码数据到存储器和/或从存储器检索数据且解码数据的装置执行编码和解码。
在图1的实例中,源装置10包含数据源120、编码器100和输出接口140。在一些实例中,输出接口140可包含调节器/解调器(调制解调器)和/或发送器(或称为发射器)。数据源120可包括点云捕获装置(例如,摄像机)、含有先前捕获的点云数据的点云存档、用以从点云内容提供者接收点云数据的点云馈入接口,和/或用于产生点云数据的计算机图形系统,或点云数据的这些来源的组合。
编码器100可对来自数据源120的点云数据进行编码。在一些实例中,源装置10经由输出接口140将经编码点云数据直接发送到目的地装置20。在其它实例中,经编码点云数据还可存储到存储装置40上,供目的地装置20以后存取来用于解码和/或播放。
在图1的实例中,目的地装置20包含输入接口240、解码器200和显示装置220。在一些实例中,输入接口240包含接收器和/或调制解调器。输入接口240可经由链路30和/或从存储装置40接收经编码点云数据。显示装置220可与目的地装置20集成或可在目的地装置20外部。一般来说,显示装置220显示经解码点云数据。显示装置220可包括多种显示装置,例如,液晶显示器(liquid crystal display,LCD)、等离子显示器、有机发光二极管(organic light-emitting diode,OLED)显示器或其它类型的显示装置。
尽管图1中未图示,但在一些方面,编码器100和解码器200可各自与音频编码器和解码器集成,且可包含适当的多路复用器-多路分用器(multiplexer-demultiplexer,MUX-DEMUX)单元或其它硬件和软件,以处置共同数据流或单独数据流中的音频和视频两者的编码。在一些实例中,如果适用的话,那么MUX-DEMUX单元可符合ITU H.223多路复用器协议,或例如用户数据报协议(user datagram protocol,UDP)等其它协议。
编码器100和解码器200各自可实施为例如以下各项的多种电路中的任一者:一个或多个微处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)、离散逻辑、硬件或其任何组合。如果部分地以软件来实施本申请,那么装置可将用于软件的指令存储在合适的非易失性计算机可读存储媒体中,且可使用一或多个处理器在硬件中执行所述指令从而实施本申请技术。前述内容(包含硬件、软件、硬件与软件的组合等)中的任一者可被视为一或多个处理器。编码器100和解码器200中的每一者可包含在一或多个编码器或解码器中,所述编码器或解码器中的任一者可集成为相应装置中的组合编码器/解码器(编码解码器)的一部分。
本申请可大体上将编码器100称为将某些信息“发信号通知”或“发送”到例如解码器200的另一装置。术语“发信号通知”或“发送”可大体上指代用以对经压缩点云数据进行解码的语法元素和/或其它数据的传送。此传送可实时或几乎实时地发生。替代地,此通信可经过一段时间后发生,例如可在编码时在经编码位流中将语法元素存储到计算机可读存储媒体时发生,解码装置接着可在所述语法元素存储到此媒体之后的任何时间检索所述语法元素。
如图2所示,为可用于本申请实施例的一种实例的编码器100的示意性框图。图2是以MPEG(Moving Picture Experts Group)点云压缩(Point Cloud Compression,PCC)编码框架为例进行说明的。在图2的实例中,编码器100可以包括patch信息生成模块101、打包模块102、深度图生成模块103、纹理图生成模块104、第一填充模块105、基于图像或视频的编码模块106、占用图编码模块107、辅助信息编码模块108和复用模块109等。另外,编码器100还可以包括点云滤波模块110、第二填充模块111和点云重构模块112等。其中:
patch信息生成模块101,用于采用某种方法将一帧点云分割产生多个patch,以及获得所生成的patch的相关信息等。其中,patch是指一帧点云中部分点构成的集合,通常一个连通区域对应一个patch。patch的相关信息可以包括但不限于以下信息中的至少一项:点云所分成的patch的个数、每个patch在三维空间中的位置信息、每个patch 的法线坐标轴的索引、每个patch从三维空间投影到二维空间产生的深度图、每个patch的深度图大小(例如深度图的宽和高)、每个patch从三维空间投影到二维空间产生的占用图等。该相关信息中的部分,如点云所分成的patch的个数,每个patch的法线坐标轴的索引,每个patch的深度图大小、每个patch在点云中的位置信息、每个patch的占用图的尺寸信息等,可以作为辅助信息被发送到辅助信息编码模块108,以进行编码(即压缩编码)。每个patch的占用图可以被发送到打包模块102进行打包,具体的,将该点云的各patch按照特定的顺序进行排列例如按照各patch的占用图的宽/高降序(或升序)排列;然后,按照排列后的各patch的顺序,依次将patch的占用图插入该点云占用图的可用区域中,得到该点云的占用图。再一方面,各patch在该点云占用图中的具体位置信息和各patch的深度图等可以被发送到深度图生成模块103。
打包模块102获得该点云的占用图后,一方面可以将该点云的占用图经第二填充模块111进行填充后发送到占用图编码模块107以进行编码。另一方面可以利用该点云的占用图指导深度图生成模块103生成该点云的深度图和指导纹理图生成模块104生成该点云的纹理图。
如图3所示,为可适用于本申请实施例的一种点云、点云的patch以及点云的占用图的示意图。其中,图3中的(a)为一帧点云的示意图,图3中的(b)为基于图3中的(a)获得的点云的patch的示意图,图3中的(c)为图3中的(b)所示的各patch映射到二维平面上所得到的各patch的占用图经打包得到的该点云的占用图的示意图。
深度图生成模块103,用于根据点云的占用图、该点云的各patch的占用图和深度信息,生成该点云的深度图,并将所生成的深度图发送到第一填充模块105,以对深度图中的空白像素进行填充,得到经填充的深度图。
纹理图生成模块104,用于根据点云的占用图、该点云的各patch的占用图和纹理信息,生成该点云的纹理图,并将所生成的纹理图发送到第一填充模块105,以对纹理图中的空白像素进行填充,得到经填充的纹理图。
经填充的深度图和经填充的纹理图被第一填充模块105发送到基于图像或视频的编码模块106,以进行基于图像或视频的编码。后续:
一方面,基于图像或视频的编码模块106、占用图编码模块107、辅助信息编码模块108,将所得到的编码结果(即码流)发送到复用模块109,以合并成一个码流,该码流可以被发送到输出接口140。
另一方面,基于图像或视频的编码模块106所得到的编码结果(即码流)发送到点云重构模块112进行点云重构,以得到经重构的点云(即得到重构的点云几何信息)。具体的,对基于图像或视频的编码模块106所得到的经编码的深度图进行视频解码,获得点云的解码深度图,利用解码深度图、该点云的占用图和各patch的辅助信息,获得重构的点云几何信息。其中,点云的几何信息是指点云中的点(例如点云中的每个点)在三维空间中的坐标值。
可选的,点云重构模块112还可以将点云的纹理信息和重构的点云几何信息发送到着色模块,着色模块用于对重构点云进行着色,以获得重构点云的纹理信息。
可选的,纹理图生成模块104还可以基于经点云滤波模块110对重构的点云几何信息进行滤波得到的信息,生成该点云的纹理图。
如图4所示,为可用于本申请实施例的一种实例的解码器200的示意性框图。其中,图4中是以MPEG PCC解码框架为例进行说明的。在图4的实例中,解码器200可以包括解复用模块201、基于图像或视频的解码模块202、占用图解码模块203、辅助信息解码模块204、点云的几何信息重构模块205、点云滤波模块206和点云的纹理信息重构模块207。其中:
解复用模块201用于将输入的码流(即合并的码流)发送到相应解码模块。具体的,将包含经编码的纹理图的码流和经编码的深度图的码流发送给基于图像或视频的解码模块202;将包含经编码的占用图的码流发送给占用图解码模块203,将包含经编码的辅助信息的码流发送给辅助信息解码模块204。
基于图像或视频的解码模块202,用于对接收到的经编码的纹理图和经编码的深度图进行解码;然后,将解码得到的纹理图信息发送给点云的纹理信息重构模块207,将解码得到的深度图信息发送给点云的几何信息重构模块205。占用图解码模块203,用于对接收到的包含经编码的占用图的码流进行解码,并将解码得到的占用图信息发送给点云的几何信息重构模块205。辅助信息解码模块204,用于对接收到的经编码的辅助信息进行解码,并将解码得到的指示辅助信息的信息发送给点云的几何信息重构模块205。
点云的几何信息重构模块205,用于根据接收到的占用图信息和辅助信息对点云的几何信息进行重构。经重构的点云的几何信息经点云滤波模块206滤波之后,被发送到点云的纹理信息重构模块207。
点云的纹理信息重构模块207,用于对点云的纹理信息进行重构,得到经重构的点云。
可以理解的,图4所示的解码器200仅为示例,具体实现时,解码器200可以包括比图4中所示的更多或更少的模块。本申请实施例对此不进行限定。
为了便于理解本申请实施例提供的技术方案,以下对本申请实施例涉及的技术及术语进行说明。
在MPEG点云编码方法中,编码器首先将待编码点云(即当前帧或者当前帧点云)按照一定准则分割成若干个patch,这些patch相互之间没有交叠区域。然后,将每个patch从三维空间投影到二维平面,得到一个二维图像(即patch的占用图)。接着,将所有patch的占用图(或者降低分辨率后的patch的占用图)按照某种规则紧密的排列在一张二维图像上,得到当前帧占用图。这种排列patch的占用图的方法称为打包(packing)。后续,按照打包顺序生成当前帧深度图和当前帧纹理图。其中,当前帧深度图是各patch经过投影得到的深度按照打包顺序产生的一张二维图像。当前帧纹理图是各patch经过投影得到的纹理图按照打包顺序产生的一张二维图像。当前帧占用图是一张2值二维图像,用于表示该二维图像的各像素位置是否被点云中某点占据。通常,为了节约编码比特,当前帧占用图的分辨率低于当前帧深度图和当前帧纹理图的分辨率。
以下给出patch的信息(或者称为patch的边信息)的示例及描述,具体如表1所示。
表1
(表1的具体内容在原文中以附图图像形式给出;正文涉及的字段包括patchIndex、u0、v0、sizeU0、sizeV0、occupancyResolution等。)
其中,patch(或patch的占用图)在当前帧占用图中的坐标可以表示为(x,y),x是该patch的占用图的各点在X轴上的最小坐标值,y是该patch的占用图的各点在Y轴上的最小坐标值,当然本申请实施例不限于此。其中,当前帧占用图的坐标系是X-Y坐标系,X轴是水平方向上的坐标轴,Y轴是竖直方向上的坐标轴。
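结合上文及表1对patch边信息字段(如patchIndex、u0、v0、sizeU0、sizeV0)的描述,以下给出一个示意性的数据结构草图(字段名沿用正文符号,结构本身为便于理解而设的假设,并非规范定义):

```python
from dataclasses import dataclass

@dataclass
class PatchInfo:
    """patch边信息的示意性定义,字段含义依据正文描述。"""
    patch_index: int   # patch索引(patchIndex)
    u0: int = 0        # patch占用图各点在当前帧占用图X轴上的最小坐标值
    v0: int = 0        # patch占用图各点在当前帧占用图Y轴上的最小坐标值
    size_u0: int = 0   # patch占用图的宽度(sizeU0)
    size_v0: int = 0   # patch占用图的高度(sizeV0)

    def right(self):
        """占用区域在X方向的右边界。"""
        return self.u0 + self.size_u0

    def bottom(self):
        """占用区域在Y方向的下边界。"""
        return self.v0 + self.size_v0
```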
在MPEG点云编码技术中,打包方法的流程示意图如图5所示。其中,图5所示的方法的执行主体可以是编码器中的打包模块。图5所示的方法中假设编码器将当前帧分割成patchCount个patch,这些patch通常以patch数组形式保存。图5所示的方法包括:
步骤1:将当前帧的patch数组按照patch的占用图的宽度(sizeU)的大小、高度(sizeV)的大小或者patch索引(patchIndex)的大小进行降序排列,得到一个序列。下文中将该序列中的第i个patch数组标记为patch[i],i=0、1、…、patchCount-1。
步骤2:计算当前帧占用图的宽度和高度的初始值。
例如,根据max{minimumImageWidth/occupancyResolution,patch[0].sizeU0},得到当前帧占用图的宽度的初始值。其中,minimumImageWidth表示当前帧占用图的宽度的最小值,patch[0].sizeU0表示patch[0]的占用图的宽度。
根据公式max{minimumImageHeight/occupancyResolution,patch[0].sizeV0},得到当前帧占用图的高度的初始值。其中,minimumImageHeight表示当前帧占用图的高度的最小值,patch[0].sizeV0表示patch[0]的占用图的高度。
步骤3:在当前帧占用图中从左向右,从上向下寻找可以放入patch[i]的占用图的位置。其中当前帧占用图中被已放入patch的占用图占据的位置,不可再放入新的patch的占用图。可以理解的,步骤3中的patch[i]的占用图可以替换为降低分辨率后的patch[i]的占用图。
如果找到patch[i]的占用图的可用位置,则执行步骤4。
如果当前帧占用图中没有找到patch[i]的占用图的可用位置,则执行步骤5。
步骤4:记录下patch[i]的占用图在当前帧占用图中的存放位置u0和v0。
执行步骤4之后,则结束。
步骤5:将当前帧占用图的高度occupancySizeV加倍,基于更新后(即加倍后)的当前帧占用图继续执行上述步骤3。
可以理解的,遍历当前帧的各patch数组,执行上述步骤3,得到最终的当前帧占用图。
可以理解的,执行打包的过程可以认为是对当前帧占用图的更新过程。如果不加说明,下文中所描述的当前帧占用图均是指最终的当前帧占用图。执行打包操作之后可以确定当前帧中所有patch的占用图在当前帧占用图中的位置信息。后续,基于该位置信息,对当前帧深度图和当前帧纹理图进行视频/图像编码。
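上述步骤1~步骤5的打包流程可以用如下Python草图概括(以矩形包围盒近似patch占用图,逐像素扫描寻找可用位置;接口与数据表示均为说明性假设,并非MPEG参考软件的实现):

```python
def pack(patches, width, height):
    """贪心打包草图:patch按占用图高度降序排列(步骤1),逐个自左向右、
    自上向下寻找不与已放入patch重叠的位置(步骤3~4);放不下时将占用图
    高度加倍(步骤5)。patches为[(宽, 高), ...],返回每个patch的(u0, v0)
    以及最终的占用图高度。"""
    order = sorted(range(len(patches)), key=lambda i: patches[i][1], reverse=True)
    placed = {}  # patch索引 -> (u0, v0)
    for i in order:
        w, h = patches[i]
        while i not in placed:
            for v0 in range(height - h + 1):
                for u0 in range(width - w + 1):
                    # 与所有已放入patch均不重叠时,记录存放位置(步骤4)
                    if all(not (u0 < placed[j][0] + patches[j][0] and
                                placed[j][0] < u0 + w and
                                v0 < placed[j][1] + patches[j][1] and
                                placed[j][1] < v0 + h)
                           for j in placed):
                        placed[i] = (u0, v0)
                        break
                if i in placed:
                    break
            else:
                height *= 2  # 步骤5:找不到可用位置时高度加倍
    return placed, height
```

实际实现通常以occupancyResolution为单位放置patch的占用图,此处为简化按单位像素处理。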
图5所示的打包方法通过降序排列后直接打包所有patch的占用图,实现简单,但是,可能导致同一个patch的占用图在前后帧占用图上位置不同。这会导致视频/图像编码模块的编码性能损失较大,且在编码patch边信息时需要更多比特。
为此,本申请实施例提供了点云编码方法和编码器。可选的,本申请实施例中的点云为动态点云,前后帧点云在时间和空间上具有相关性,其中,该相关性具体可以体现为点云组中存在全局匹配patch。
以下结合附图对本申请实施例提供的点云编码方法和编码器进行说明。
如图6所示,为本申请实施例提供的点云编码方法的流程示意图。结合图1所示的点云译码系统,图6所示的方法的执行主体可以是图1中的编码器100。图6所示的方法包括:
S101:获取点云组的N帧点云的全局匹配patch。其中,该点云组包括该N帧点云;N≥2,N是整数。可选的,该N帧点云是时间上连续的N帧点云,当然本申请实施例不限于此。
一帧点云中的全局匹配patch,是指该帧点云中的且在点云组中除该帧点云之外的各帧点云中均能找到具有匹配关系的patch的patch。同时,与该patch具有匹配关系的patch是全局匹配patch。例如,假设点云组包括4帧点云(分别标记为点云1~4),对于点云1中的任意一个patch来说,如果在点云2~4中均能找到与该patch具有匹配关系的patch,如点云1中的patch11具有匹配关系的patch分别是:点云2中的patch21、点云3中的patch31和点云4中的patch41,那么,patch11、21、31、41均是全局匹配patch,且patch1w是点云w中的全局匹配patch,1≤w≤4,w是整数。
与一个patch具有匹配关系的patch是目标patch的匹配patch,其中,目标patch是该patch或者该patch的匹配patch。可选的,一帧点云中的一个patch在另一帧点云中的匹配patch,可以是该另一帧点云中的与该patch的交并比(intersection over union,IoU)最大、且该IoU大于或等于预设阈值的patch。如何确定当前patch的匹配patch,可以参考其他专利文献,例如申请号为201810045808.X的中国专利申请,这里不再赘述。
由上述描述可知,每帧点云中的全局匹配patch的个数相等,该个数(即下文中的M)是大于或等于1的正整数。如果点云组包括N帧点云,那么,对于任意一个全局匹配patch来说,与该全局匹配patch具有匹配关系的patch的个数是N-1。
在一种实现方式中,这N-1个patch均是该patch的匹配patch。
在另一种实现方式中,这N-1个patch中的至少一个patch是该patch的匹配patch,其他patch呈链式匹配关系。链式匹配关系的具体示例可以参考下文。
S102:确定与M个集合对应的M个并集patch占用图。该M个集合中的每个集合包括N个全局匹配patch,该N个全局匹配patch是跨上述N帧点云中具有匹配关系的patch;该M个集合中的第m个集合对应的并集patch占用图为第m个集合中的各全局匹配patch的占用图的并集;1≤m≤M,m和M均是整数。
一个集合包括的N个全局匹配patch是跨N帧点云中具有匹配关系的patch,可以理解为:该N个全局匹配patch分别来自于N帧点云,且该N个全局匹配patch具有匹配关系。该N个全局匹配patch分别来自于N帧点云,可以理解为,该N个全局匹配patch中的每个全局匹配patch来自于(或所属)一帧点云,且不同全局匹配patch来自于不同点云。例如,该N个全局匹配patch中的第n个全局匹配patch来自于该N帧点云中的第n帧点云。
可以理解的,一个集合包括的N个全局匹配patch是跨N帧点云中具有匹配关系的patch,这一特征不仅描述了一个集合中包括的N个全局匹配patch的特征,还描述了全局匹配patch的概念,即一帧点云中的全局匹配patch是指该帧点云中的且在点云组中除该帧点云之外的各帧点云中均能找到具有匹配关系的patch的patch。
在一种实现方式中,N个全局匹配patch具有匹配关系,可以包括:N个全局匹配patch中的任意两个patch均匹配,换言之,N个全局匹配patch相互匹配。
在另一种实现方式中,N个全局匹配patch具有匹配关系,可以包括:N个全局匹配patch呈链式匹配,例如N个全局匹配patch中的第1个全局匹配patch与第2个全局匹配patch匹配,第2个全局匹配patch与第3个全局匹配patch匹配,第3个全局匹配patch与第4个全局匹配patch匹配,……第N-1个全局匹配patch与第N个全局匹配patch匹配。
当然本申请实施例不限于此。例如,N个全局匹配patch具有匹配关系,可以包括:N个全局匹配patch的部分全局匹配patch中的任意两个patch均匹配,其他全局匹配patch呈链式匹配。
根据上文中的描述可知,每帧点云包括M个全局匹配patch。作为一个示例,该M个全局匹配patch中的第m个全局匹配patch归属于上述M个集合中的第m个集合。
具体实现的过程中,可以结合现有技术中求二维图形的并集的方法,获得至少两个全局匹配patch的占用图的并集,从而得到每个集合对应的并集patch占用图。如图7所示,为可适用于本申请一实施例的两个patch的占用图的并集的示意图。其中,矩形ABCD表示点云1中的全局匹配patch1的占用图,矩形AEFG表示点云2中的全局匹配patch2的占用图。这两个全局匹配patch的占用图的并集即并集patch占用图,如图7中的矩形AEFG所示。
一个全局匹配patch的占用图所占的区域(或范围或区域范围)小于或等于该全局匹配patch对应的并集patch占用图(即该全局匹配patch所属的集合对应的并集patch占用图)所占的区域,或者说,一个全局匹配patch所属的集合对应的并集patch占用图覆盖该全局匹配patch的占用图。例如,图7中的patch1的占用图所占的区域小于并集patch占用图所占的区域,图7中的全局匹配patch2的占用图所占的区域等于并集patch占用图所占的区域。
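并集patch占用图即集合内各全局匹配patch占用图的逐像素逻辑"或"。以下为一个示意性草图(假设各占用图已按左上角对齐,占用图以被占像素坐标集合表示;函数命名为说明性假设):

```python
def union_patch_occupancy(occupancy_maps):
    """对同一集合内N个全局匹配patch的占用图求并集(逐像素逻辑或)。
    每个占用图以被占像素坐标(x, y)的集合表示,默认各图按左上角对齐。"""
    union = set()
    for occ in occupancy_maps:
        union |= occ
    return union

def bounding_size(occ):
    """并集patch占用图的宽和高(即其包围盒尺寸,对应sizeU0/sizeV0)。"""
    if not occ:
        return 0, 0
    w = max(x for x, _ in occ) + 1
    h = max(y for _, y in occ) + 1
    return w, h
```

由定义可知,并集patch占用图所占的区域必然覆盖集合内每个全局匹配patch的占用图,与正文结论一致。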
S103:对上述M个并集patch占用图进行打包,得到全局占用图,该全局占用图用于确定上述M个并集patch占用图在全局占用图中的位置(或者降低分辨率后的M个并集占用图在全局占用图中的位置)。
本申请实施例对S103的具体实现方式不进行限定,例如可以采用如图5所示的打包方法对上述M个并集patch占用图进行打包。上述M个并集patch占用图在该全局占用图中的位置,可以通过表1中所示的u0和v0的取值来表征。
S104:对上述N帧点云中的每帧点云进行打包,得到该N帧点云的占用图。第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置(即第一位置),对应于M个并集patch占用图中的第m个并集patch占用图在全局占用图中的位置(即第二位置)。
第一位置对应于第二位置,可以理解为:第一位置的坐标值与第二位置的坐标值相同,或者,第一位置的坐标值在其坐标系下与第二位置的坐标值在其坐标系下实质性相同;或者,第一位置的坐标值与第二位置的坐标值不相同,但第二位置所在的位置范围覆盖第一位置所在的位置范围。其中,第一位置的坐标值可以通过第n帧点云的占用图中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置u0、v0的取值来表征。第二位置的坐标值可以通过第m个并集patch占用图在全局占用图中的位置u0、v0的取值来表征。第一位置所在的位置范围是第m个全局匹配patch的占用图所占的区域,第二位置所在的位置范围是第m个并集patch占用图所占的区域。
例如,参见图8,为本申请实施例提供的第一位置与第二位置之间的对应关系的示意图。其中,图8是基于图7进行绘制的。图8中的(a)表示第m个并集patch占用图AEFG在全局占用图中的位置(即第二位置),且位置坐标是点E的坐标(1,2),即u0=1,v0=2。图8中的(b1)和(b2)均表示第n帧点云中的第m个全局匹配patch的占用图(即矩形ABCD所示的占用图)在第n帧点云中的位置(即第一位置),其中,图8中的(b1)中第一位置的坐标值与第二位置的坐标值相同。图8中的(b2)中的第一位置的坐标值与第二位置的坐标值不相同,但第二位置所在的位置范围覆盖第一位置所在的位置范围。
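第一位置"对应于"第二位置的两种情形(坐标值相同,或第二位置所在的位置范围覆盖第一位置所在的位置范围)可以示意性地检查如下(函数与参数命名均为说明性假设):

```python
def positions_correspond(patch_pos, patch_size, union_pos, union_size):
    """判断全局匹配patch占用图的位置(第一位置)是否对应并集patch占用图
    的位置(第二位置):二者坐标值相同,或并集patch占用图所占区域覆盖
    该patch占用图所占区域。位置为(u0, v0),尺寸为(宽, 高)。"""
    (pu, pv), (pw, ph) = patch_pos, patch_size
    (uu, uv), (uw, uh) = union_pos, union_size
    same = (pu, pv) == (uu, uv)
    covered = (uu <= pu and uv <= pv and
               pu + pw <= uu + uw and pv + ph <= uv + uh)
    return same or covered
```

以图8的例子来说,u0=1、v0=2的并集patch占用图既对应坐标相同的第一位置,也对应被其区域覆盖的第一位置。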
可选的,第n帧点云中的非全局匹配patch的占用图占用预设位置范围;其中,预设位置范围是指属于第n帧点云中的全局匹配patch所属的集合对应的并集patch占用图,且不属于第n帧点云中的全局匹配patch的占用图的区域。这样,能够使点云的占用图中的各patch的占用图排列更紧实(或密实),从而使得点云的占用图的尺寸较小。
该可选的实现方式中的“预设位置范围”是针对一个点云来说的,也就是说,预设位置范围包括针对该点云中的部分或全部全局匹配patch中的每个全局匹配patch的预设位置范围。例如,结合图7,当对点云1中的patch的占用图进行打包时,针对patch1(如矩形ABCD表示的patch)来说,预设位置范围是指图7中的黑色阴影部分所示的区域。也就是说,图7中黑色阴影部分所示的区域中可以映射(或放置)非全局匹配patch的占用图。
可替换的,第n帧点云中的非全局匹配patch的占用图不占用该预设位置范围。例如,图7中的黑色阴影部分所示的区域中不能映射(或放置)非全局匹配patch的占用图。
S105:根据上述N帧点云的占用图对上述N帧点云进行编码。
具体的,根据上述N帧点云的占用图中的每帧点云的占用图指导该点云的深度图和纹理图的生成,对该点云的深度图和纹理图进行视频/图像编码等。具体可以参考上文对编码器100的原理的描述。
本技术方案中,通过统计一个点云组内的全局匹配patch(如所有全局匹配patch),并在打包的过程中为该点云组的每一帧点云中的具有匹配关系的全局匹配patch分配相同(或相似)的位置,使得所产生的每一帧点云的占用图中具有匹配关系的全局匹配patch在空间上具有一致性。也就是说,本技术方案在考虑了不同点云间的时间和空间上的相关性的基础上,使得不同点云中具有匹配关系的patch在空间上具有一致性。由于点云的占用图可以用于指导该点云的深度图和纹理图的生成,并且对点云的深度图和纹理图的编码技术是视频/图像编码技术,而视频/图像编码技术中经码流传输的通常是帧间的差异数据,因此,不同点云中具有匹配关系的patch在空间上具有一致性,有助于提高编码或者压缩效率,以及节省码流传输开销。
如图9所示,为本申请实施例提供的基于图5与图6所示的方法得到点云的占用图的对比示意图。其中,图9的上面两个图为基于图5所示的方法得到的两帧点云的占用图,中间一个图为基于图6所示的方法得到的全局占用图,下面两个图为在全局占用图约束下得到的该两帧点云的占用图。由图9可以看出基于如图5所示的方法得到的两帧占用图中4个面积比较大的patch的占用图的空间位置对应比较相似,但是其余patch的占用图的位置相对凌乱,不便于视频编码利用帧间相关性进行高效编码。而基于图6所示的方法通过全局占用图约束每一帧点云中的全局匹配patch在该帧点云的占用图中的位置,不仅使得4个大的patch对应具有相似(或相同)空间位置,也使得其他面积比较小的全局匹配patch也对应具有相似(或相同)空间位置,从而可以充分利用视频编码帧间预测的优势提高编码或者压缩效率。
以下通过一个简单的示例,对图6中的打包方法进行说明。
参见图10和图11,为基于图6提供的打包方法的过程示意图。其中,图10和图11中,假设点云组包括4帧点云,每帧点云包括10个patch,且S101获得的该点云组的各帧点云中的全局匹配patch分别是:
点云1中的patch11、patch12和patch13;
点云2中的patch21、patch22和patch23;
点云3中的patch31、patch32和patch33;
点云4中的patch41、patch42和patch43。
并且,patch11、patch21、patch31和patch41具有匹配关系,这些patch组成集合1。patch12、patch22、patch32和patch42具有匹配关系,这些patch组成集合2。patch13、patch23、patch33和patch43具有匹配关系,这些patch组成集合3。也就是说,基于本示例,S102中的M个集合具体是集合1~3,且每个集合包含4个具有匹配关系的patch。
基于该示例,执行S103得到的全局占用图可以如图10所示。图10中的较大矩形表示全局占用图,图10中的椭圆、三角形和较小矩形分别表示集合1~3对应的并集patch占用图。
基于该示例,执行S104之后,点云1的占用图可以如图11中的(a)所示,点云2的占用图可以如图11中的(b)所示。图11仅示意出了点云1的占用图和点云2的占用图中的部分patch的占用图,且没有示意出点云3和点云4的占用图。对比图10和图11可知,patch11的占用图在点云1的占用图中的位置和patch21的占用图在点云2的占用图中的位置均对应并集patch占用图1在全局占用图中的位置;patch12的占用图在点云1的占用图中的位置和patch22的占用图在点云2的占用图中的位置均对应并集patch占用图2在全局占用图中的位置;patch13的占用图在点云1的占用图中的位置和patch23的占用图在点云2的占用图中的位置均对应并集patch占用图3在全局占用图中的位置。
可选的,点云组可以是GOF,如一个或多个GOF,通常是指一个GOF。GOF中点云的帧数是可配置的。本申请实施例对一个GOF所包含的点云的帧数不进行限定,例如,一个GOF可以包含32帧点云。关于GOF的确定方式可以参考现有技术。
可选的,点云组可以是一个子GOF,该子GOF可以由一个GOF中的时间上连续的多帧点云构成。一个GOF可以包括至少两个子GOF,该至少两个子GOF中的任意两个子GOF所包含的点云的帧数可以相等,也可以不相等。
可选的,S101可以包括以下步骤S101A~S101B:
S101A:获取参考点云;该参考点云是GOF中的任意一帧点云。
S101B:对于参考点云中的第i个patch,如果该GOF中的各非参考点云中均存在与目标patch相匹配的patch,则确定第i个patch和与第i个patch相匹配的patch为全局匹配patch。其中,目标patch是第i个patch或目标patch是第i个patch的匹配patch,第i个patch是所述参考点云中的任意一个patch。具体的,按照GOF中的各点云的索引依次在各非参考点云中查找与目标patch相匹配的patch。
可以理解的,点云组中的参考点云中patch的顺序决定了该点云组中全局匹配patch的顺序,因此,参考帧中patch的顺序尤为重要。其中,这里的参考帧可以是经过打包操作的patch数组,打包操作有可能会改变patch数组中patch的存放顺序,如获得全局匹配patch时会更新patch的存放顺序。这里的打包操作可以是按照本申请实施例提供的打包方法(如S101~S104所示的方法)。这样有助于保证相邻两个点云组的全局匹配patch在时间上的连续性,从而提高编码或者压缩效率。可替换的,参考帧中patch的顺序也可以是未经过打包操作的顺序。
依据该可选的实现方式,遍历i=1、2……M,可以获得点云组的各帧点云中的全局匹配patch。该可选的实现方式可以应用于点云组是GOF的实施例中,也可以应用于点云组是子GOF的场景中。也就是说,当点云组是子GOF时,可以使用该子GOF所属的GOF中的任意一帧点云作为参考点云。
可选的,当点云组是GOF时,参考点云可以是该GOF中的第一帧点云。这样,可以保证一个GOF中全局匹配patch都与第一帧中全局匹配patch的顺序一致,可以为后续视频编码带来一定增益。
可选的,当点云组是子GOF时,参考点云可以是该子GOF中的第一帧点云,或者是该子GOF的前一个子GOF中的任意一帧点云。该子GOF与该前一个子GOF属于同一个GOF。
可选的,当点云组是子GOF时,参考点云是子GOF的前一个子GOF中的最后一帧点云,该方案中考虑了相邻子GOF在时间上的连续性,因此,有助于提高编码或者压缩效率。
进一步可选的,该最后一帧点云是执行打包操作后得到的点云。其中,这里的打包操作可以是按照本申请实施例提供的打包方法(如S101~S104所示的方法)。这样有助于保证相邻两个子GOF的全局匹配patch在时间上的连续性,从而提高编码或者压缩效率。如图12所示,为基于该进一步可选的实现方式获得的相邻两个子GOF(标记为子GOF1和子GOF2)中的各点云的占用图。其中,图12中的数字1051~1058表示点云的索引。
可替换的,该最后一帧也可以是执行打包操作之前的点云。
可选的,S104可以包括以下步骤S104A~S104B:
S104A:基于第m个并集patch占用图在全局占用图中的位置,确定第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置。这两个位置相对应,关于这两个位置相对应的相关描述可以参考上文。
依据S104A,当m=1、2、3……M中的部分或全部时,可获得第n帧点云中的编号为该部分或全部的全局匹配patch的占用图在第n帧点云的占用图中的位置。
S104B:基于第n帧点云中的部分或全部(通常是指全部)全局匹配patch的占用图在第n帧点云的占用图中的位置,对第n帧点云中的全局匹配patch和非全局匹配patch的占用图进行打包,得到第n帧点云的占用图。该部分或全部全局匹配patch包括第m个全局匹配patch。
具体的,S104B可以包括以下步骤S104B-1~S104B-2:
S104B-1:基于第n帧点云中的该部分或全部全局匹配patch的占用图在第n帧点云的占用图中的位置,将该部分或全部全局匹配patch的占用图映射到第n帧点云的初始占用图中。
其中,第n帧点云的初始占用图的宽度取值是第n帧点云的占用图的宽度的初始值,第n帧点云的初始占用图的高度取值是第n帧点云的占用图的高度的初始值。关于点云的占用图的宽度/高度的初始值的获取方式可以参考上文,也可以参考现有技术。
S104B-2:基于映射有该部分或全部匹配patch的占用图的第n帧点云的初始占用图,对第n帧点云中的除该部分或全部全局匹配patch之外的其他patch的占用图进行打包,得到第n帧点云的占用图。其中,对该其他patch的占用图进行打包的过程可以参考现有技术,如图5所示的打包方法。
可以理解的,对一帧点云中的各patch进行打包的过程,可以认为是对点云的占用图进行更新的过程。例如,结合图5所示的打包方法,每将一个patch的占用图映射到点云占用图的空白区域,可以认为对点云的占用图进行了一次更新,直至将最后一个patch的占用图映射到点云占用图的空白区域之后,认为得到了点云的最终占用图。因此,执行S104B-1的过程可以认为是由点云的初始占用图更新至点云的中间占用图的过程(该过程与现有技术不同),执行S104B-2的过程可以认为是将该中间占用图更新为了最终的占用图的过程。
该可选的实现方式与如图5所示的打包方法的区别在于:图5所示的打包方法是对一帧点云中的各patch统一进行排序,并依照排序得到的序列依次将各patch的占用图映射到点云的占用图中,从而获得各patch的占用图在点云的占用图中的位置。而该可选的实现方式是先确定一帧点云中的全局匹配patch的占用图在该点云的占用图中的位置,再将各全局匹配patch映射到该点云的占用图中;接着,依次将各非全局匹配patch的占用图打包到该点云的占用图中,从而获得各非全局匹配patch的占用图在该点云的占用图中的位置。
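先映射全局匹配patch、再打包非全局匹配patch的过程可以用如下草图示意(复用前文贪心放置的思路,接口与数据表示均为说明性假设,并非规范实现):

```python
def pack_with_global(global_patches, other_patches, width, height):
    """global_patches为[(u0, v0, 宽, 高)],其位置由全局占用图给定,直接映射
    到初始占用图(对应S104B-1);other_patches为[(宽, 高)],在其余空白区域中
    自左向右、自上向下逐个打包(对应S104B-2),空间不足时高度加倍。"""
    rects = list(global_patches)  # 已占用的矩形区域
    placed = []                   # 非全局匹配patch的(u0, v0)
    for w, h in other_patches:
        pos = None
        while pos is None:
            for v0 in range(height - h + 1):
                for u0 in range(width - w + 1):
                    if all(not (u0 < ru + rw and ru < u0 + w and
                                v0 < rv + rh and rv < v0 + h)
                           for (ru, rv, rw, rh) in rects):
                        pos = (u0, v0)
                        break
                if pos:
                    break
            else:
                height *= 2  # 找不到可用位置时,占用图高度加倍
        rects.append((pos[0], pos[1], w, h))
        placed.append(pos)
    return placed, height
```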
可以理解的,如果全局占用图的高度过大,则可能导致点云组中的N帧点云的占用图过大,从而导致后续对每一帧点云的深度图和纹理图采用视频/图像编码时性能下降。此时,采用现有技术(如图5所示的打包方法)进行打包反而不会导致编码性能下降。因此,作为一种可选的实施例,上述方法还包括:判断是否采用全局占用图(即是否采用本申请实施例提供的上述方法)进行打包。对此,如果点云组中的所有帧高度的最大值与全局占用图的高度相当,则采用本申请实施例提供的方法进行打包,很有可能获得更大的编码增益;否则,可以采用其他打包方法如图5所示的打包方法进行打包。
基于这一思想,以下提供本申请实施例提供几种实现方式:
方式1:包括以下步骤A-1~A-2:
步骤A-1:根据预打包算法对上述N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法。
步骤A-2:当第一部分点云的预占用图的最大尺寸(如高度)与全局占用图的尺寸(如高度)之差在预设范围之内时,确定对上述N帧点云中的每帧点云进行打包的方法是:根据全局占用图对上述N帧点云中的每帧点云进行打包。否则,确定对上述N帧点云中的每帧点云进行打包的方法是该预打包算法。
其中,第一部分点云是N帧点云中的任意一帧或多帧点云。预打包算法是指非本申请实施例提供的打包方法。例如可以是如图5所示的打包方法。
如果第一部分点云包括多帧点云,则执行预打包之后,该多帧点云中的每帧点云对应一个预占用图。第一部分点云的预占用图的最大尺寸是指该多帧点云对应的多个预占用图中的尺寸最大(如高最大)的预占用图的尺寸。
预设范围可以依据编码增益来确定,具体可以是一个经验值。第一部分点云的预占用图的最大尺寸与全局占用图的尺寸之差在预设范围之内,可以理解为:第一部分点云的预占用图的最大高度与全局占用图的高度相当。
方式2:包括以下步骤B-1~B-2:
步骤B-1:根据预打包算法对上述N帧点云中的第一部分点云中的patch的占用图进行预打包,得到第一部分点云的预占用图;预打包算法是不采用全局占用图进行打包的算法。
步骤B-2:根据全局占用图,对N帧点云中的第二部分点云中的patch的占用图进行预打包,得到第二部分点云的预占用图。
步骤B-3:当第一部分点云的预占用图的最大尺寸(如高度)与第二部分点云的预占用图的最大尺寸(如高度)之差在预设范围之内时,确定对上述N帧点云中的每帧点云进行打包的方法是:根据全局占用图对上述N帧点云中的每帧点云进行打包。否则,确定对上述N帧点云中的每帧点云进行打包的方法是该预打包算法。
其中,第一部分点云是N帧点云中的任意一帧或多帧点云。预打包算法是指非本申请实施例提供的打包方法。例如可以是如图5所示的打包方法。
第二部分点云可以是N帧点云中的任意一帧或多帧点云。可选的,第一部分点云与第二部分点云相同,这样有助于更好地对比本申请实施例和现有技术提供的技术方案。
方式3:根据率失真代价准则,确定采用全局占用图,对上述N帧点云中的每帧点云进行打包。
具体的:根据预打包算法对上述N帧点云中的一部分或全部点云中的patch的占用图进行预打包,得到该部分或全部点云的第一预占用图;根据全局占用图,对该部分或全部点云中的patch的占用图进行预打包,得到该部分或全部点云的第二预占用图。当第二预占用图的码流传输开销小于或等于第一预占用图的码流比特开销;或者虽然第二预占用图的码流传输开销大于第一预占用图的码流比特开销,但是二者之差在预设范围之内时,则确定对上述N帧点云中的每帧点云进行打包的方法是:根据全局占用图对上述N帧点云中的每帧点云进行打包。否则,确定对上述N帧点云中的每帧点云进行打包的方法是该预打包算法。
如图13所示,为本申请实施例提供的一种打包方法的流程示意图。本实施例可以认为是图6中提供的打包方法的一个具体示例。该方法包括如下步骤:
S201:以数组形式存放点云组的各点云中的patch。其中,一个点云中的全局匹配patch排列在非全局匹配patch之前,也就是说,全局匹配patch所在的数组的编号小于非全局匹配patch所在的数组的编号。当然具体实现时不限于此。
本实施例中,点云组是一个GOF,且一个GOF包括N帧点云,例如N=32。一个GOF内的每一帧全局匹配patch个数均为M,M在下文的程序中被标记为GobalPatchCount。
S202:计算M个集合对应的并集patch占用图(unionPatch)。其中,每个集合中包含的全局匹配patch的相关说明可以参考上文。S202可以包括:
首先,对第i帧点云,遍历其所有全局匹配patch,i=0、1、2、…、31;根据第i帧点云中的第j个全局匹配patch,获取第j个并集patch占用图的宽和高的程序可以如下:
(该程序清单在原文中以附图图像形式给出。)
其次,对第i帧点云,遍历其所有全局匹配patch。对于第j个全局匹配patch,计算第j个并集patch占用图。可选的,并集patch的占用图的分辨率可以是16*16。
获取第j个并集patch占用图的程序可以如下:
(该程序清单在原文中以附图图像形式给出。)
S203:对点云组的M个并集patch占用图进行打包,获得全局占用图(occupancymap)。
其中,打包方法可以参考图5所示的方法。执行该步骤之后,可以确定每一个并集patch占用图在全局占用图中的位置坐标,即u0和v0值。
S204:判断点云组中的部分或全部帧高度的最大值与全局占用图的高度之差是否在预设范围之内。若是,说明点云组中的帧高度的最大值与全局占用图的高度相当,则执行S205。若否,则执行S206。
S204具体包括如下步骤:
S204A:计算全局占用图的高度。假设全局占用图的高度是globalOCMPHeight,第j个全局匹配patch为unionPatch[j],j=0、1、2…、globalPatchCount-1,那么:globalOCMPHeight=max{globalOCMPHeight,unionPatch[j].v0+unionPatch[j].sizeV0}。
S204B:对点云组中的部分或全部帧中全局匹配patch按照如图5所示的方法进行打包,并获得点云的占用图高度frame[i].height。其计算方式可以为:
frame[i].height=max(frame[i].height,frame[i].Patch[j].v0+unionPatch[j].sizeV0)
其中,frame[i].Patch[j]表示第i帧中的第j个patch。
该步骤S204B中的全局匹配patch也可以替换为非全局匹配patch,或者全局匹配patch与非全局匹配patch。
S204C:计算该部分或全部点云的占用图高度的最大值maxHeight。该部分或全部帧可以是1帧,例如点云组中的第0,…或者第N-1帧。该部分或全部帧可以是2帧,例如点云组中的第0帧和第1帧或者第N-2帧和第N-1帧等。当然本申请实施例不限于此。
S204D:设置判断条件,并引入标识usingGlobalPatchPacking。若globalOCMPHeight>maxHeight*w,w为权重因子,w可以取大于1.0的值如1.1,则设置usingGlobalPatchPacking=0,采用现有技术提供的方法(如图5所示的方法)进行打包;否则设置usingGlobalPatchPacking=1,采用本申请实施例提供的方法(如图6所示的方法)进行打包。
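S204A~S204D的判断逻辑可以概括为如下草图(其中w=1.1仅为正文给出的示例取值,函数命名为说明性假设):

```python
def decide_packing(global_ocmp_height, frame_heights, w=1.1):
    """若全局占用图高度globalOCMPHeight不超过所选各帧占用图高度最大值
    maxHeight的w倍,返回1(usingGlobalPatchPacking=1,采用全局占用图打包);
    否则返回0(采用图5所示的打包方法)。"""
    max_height = max(frame_heights)
    return 0 if global_ocmp_height > max_height * w else 1
```

例如maxHeight=95、w=1.1时,全局占用图高度为100可采用全局占用图打包,为120则退回图5所示的方法。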
S205:将全局占用图作为点云组中的每帧点云的初始占用图,并对每帧点云的所有patch的占用图进行打包。遍历点云组的每帧点云,对于第i帧点云执行如下步骤S205A~S205B:
S205A:对第i帧点云patch数组的前globalPatchCount个全局匹配patch的占用图,采用如下方式确定其在第i帧占用图中的位置:
frame[i].patch[j].u0=unionPatch[j].u0;
frame[i].patch[j].v0=unionPatch[j].v0。
其中,frame[i].patch[j]表示第i帧中的第j个patch,j=0、…、globalPatchCount-1。
S205B:对第i帧点云patch数组的剩余patch采用如图5所示的打包方法依次打包到第i帧点云的占用图中。
更新第i帧点云的占用图的程序可以如下:
(该程序清单在原文中以附图图像形式给出。)
执行S205之后,则结束。
S206:按照如图5所示的方法对点云组的各帧点云中的patch的占用图进行打包。
需要说明的是,如果GOF中的点云的动态变化不是很大,那么,该GOF的各帧分割得到的patch非常相似,该情况下,适合以GOF为单位执行本申请实施例提供的技术方案,也就是说,该情况下,点云组具体可以是一个GOF。
如果GOF中的点云的动态变化比较大,那么,具有匹配关系的patch在相邻两帧中的形状可能发生比较大的变化。这样,如果以一个GOF为单位执行本申请实施例提供的点云编码方法,会导致获得的并集patch占用图相对于获得该并集patch占用图所采用的全局匹配patch有较大的空白空间,这些空白空间会导致对多个并集patch占用图打包获得的全局占用图的高度变得很大,不利于后续视频编码。因此,该情况下,可以将一个GOF分为K个子GOF,也就是说,点云组具体可以是一个子GOF。其中,K是大于或等于2的整数,可选的,K为2、4、8、16、……、N/2。可选的,当N帧点云中剩余的帧不足一个子GOF时,将这些剩余的帧单独作为一个子GOF。
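将一个GOF划分为若干子GOF、且剩余不足一个子GOF的帧单独作为一个子GOF的做法,可以用如下草图示意(函数命名为说明性假设):

```python
def split_into_sub_gofs(num_frames, sub_gof_size):
    """将一个GOF的num_frames帧按sub_gof_size划分为若干子GOF;
    剩余不足一个子GOF的帧单独作为一个子GOF。返回各子GOF的帧索引列表。"""
    sub_gofs = []
    for start in range(0, num_frames, sub_gof_size):
        sub_gofs.append(list(range(start, min(start + sub_gof_size, num_frames))))
    return sub_gofs
```

例如32帧的GOF按每8帧一个子GOF划分得到4个子GOF;10帧按每4帧划分时,最后剩余的2帧单独作为一个子GOF。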
上述主要从方法的角度对本申请实施例提供的方案进行了介绍。为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对编码器进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
图14为本申请实施例提供的一种编码器140的示意性框图。编码器140可以包括打包模块1401和编码模块1402。作为一个示例,打包模块1401可以对应图2中的打包模块102,打包模块1401生成的点云的占用图被发送到深度图生成模块103、纹理图生成模块104和第二填充模块111,接收点云的占用图的模块与其相连接或相通信的模块共同配合完成编码功能,例如根据所述N帧点云的占用图对所述N帧点云进行编码。关于具体的编码功能可以参考现有技术或上文中对图2所示的编码器的原理的解释,此处不再赘述。
在一种可行的实施方式中,打包模块1401用于执行以下步骤:获取点云组的各帧点云中的全局匹配patch;该点云组包括N帧点云;N≥2,N是整数;确定与M个集合对应的M个并集patch占用图;该M个集合中的每个集合包括N个全局匹配patch,该N个全局匹配patch是跨该N帧点云中具有匹配关系的patch;该M个集合中的第m个集合对应的并集patch占用图为第m个集合中的各全局匹配patch的占用图的并集;1≤m≤M,m和M均是整数;对该M个并集patch占用图进行打包,得到全局占用图;全局占用图用于确定该M个并集patch占用图在全局占用图中的位置;对该N帧点云中的每帧点云进行打包,得到该N帧点云的占用图;其中,第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置,对应于该M个并集patch占用图中的第m个并集patch占用图在全局占用图中的位置。编码模块1402用于:根据该N帧点云的占用图对该N帧点云进行编码。例如,结合图6,打包模块1401可以用于执行S101~S104,编码模块1402可以用于执行S105。
在一种可行的实施方式中,打包模块1401具体用于:基于第m个并集patch占用图在全局占用图中的位置,确定第n帧点云中的第m个全局匹配patch的占用图在第n帧点云的占用图中的位置;基于第n帧点云中的部分或全部全局匹配patch的占用图在第n帧点云的占用图中的位置,对第n帧点云中的全局匹配patch和非全局匹配patch 的占用图进行打包,得到第n帧点云的占用图;其中,该部分或全部全局匹配patch包括第m个全局匹配patch。
在一种可行的实施方式中,打包模块1401具体用于:基于该部分或全部全局匹配patch的占用图在第n帧点云的占用图中的位置,将该部分或全部全局匹配patch的占用图映射到第n帧点云的初始占用图中;基于映射有该部分或全部patch的占用图的第n帧点云的初始占用图,对第n帧点云中的除该部分或全部全局匹配patch之外的其他patch的占用图进行打包,得到第n帧点云的占用图。
在一种可行的实施方式中,第n帧点云中的非全局匹配patch的占用图占用预设位置范围,或者不占用预设位置范围;其中,预设位置范围是指属于第n帧点云中的全局匹配patch所属的集合对应的并集patch占用图,且不属于第n帧点云中的全局匹配patch的占用图的位置范围。
在一种可行的实施方式中,打包模块1401具体用于:当该N帧点云中的第一部分点云的预占用图的最大尺寸与全局占用图的尺寸之差在预设范围之内时,根据全局占用图对该N帧点云中的每帧点云进行打包。关于第一部分点云的预占用图的相关说明可以参考上文。
在一种可行的实施方式中,打包模块1401具体用于:当该N帧点云中的第一部分点云的预占用图的最大尺寸与所述N帧点云中的第二部分点云的预占用图的最大尺寸之差在预设范围之内时,根据所述全局占用图对所述N帧点云中的每帧点云进行打包。关于第一部分点云的预占用图和第二部分点云的预占用图的相关说明可以参考上文。
在一种可行的实施方式中,点云组是一个帧组GOF;或者,点云组是一个子GOF,该子GOF包括一个GOF中的时间上连续的多帧点云;打包模块1401具体用于:获取参考点云;参考点云是该GOF中的任意一帧点云;对于参考点云中的第i个patch,如果该GOF中的各非参考点云中均存在与目标patch相匹配的patch,则确定第i个patch和与第i个patch相匹配的patch为全局匹配patch;其中,目标patch是第i个patch或目标patch是第i个patch的匹配patch,第i个patch是参考点云中的任意一个patch。
在一种可行的实施方式中,当点云组是GOF时,参考点云是该GOF中的第一帧点云。
在一种可行的实施方式中,当点云组是子GOF时,参考点云是该子GOF中的第一帧点云或者该子GOF的前一个子GOF中的任意一帧点云。
在一种可行的实施方式中,参考点云是该子GOF的前一个子GOF中的最后一帧点云。
在一种可行的实施方式中,该最后一帧点云是执行打包操作后得到的点云。
可以理解的,本申请实施例的编码器中的各个模块为实现本申请对应的点云编码方法中所包含的各种执行步骤的功能主体,即具备实现完整实现本申请对应的点云编码方法中的各个步骤以及这些步骤的扩展及变形的功能主体,为简洁起见,本文将不再赘述。
FIG. 15 is a schematic block diagram of an implementation of an encoding device 150 used in an embodiment of this application. The encoding device 150 may include a processor 1510, a memory 1530, and a bus system 1550. The processor 1510 and the memory 1530 are connected through the bus system 1550. The memory 1530 is configured to store instructions, and the processor 1510 is configured to execute the instructions stored in the memory 1530 to perform the various point cloud encoding or decoding methods described in this application. To avoid repetition, details are not described here.
In this embodiment of this application, the processor 1510 may be a central processing unit (CPU), or the processor 1510 may be another general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 1530 may include a ROM device or a RAM device. Any other suitable type of storage device may also be used as the memory 1530. The memory 1530 may include code and data 1531 accessed by the processor 1510 through the bus 1550. The memory 1530 may further include an operating system 1533 and application programs 1535, where the application programs 1535 include at least one program that allows the processor 1510 to perform the point cloud encoding or decoding methods described in this application (in particular, the method, described in this application, of filtering a current image block based on the block size of the current image block). For example, the application programs 1535 may include applications 1 to N, which further include a point cloud encoding or decoding application (referred to as a point cloud coding application) that performs the point cloud encoding or decoding methods described in this application.
In addition to a data bus, the bus system 1550 may further include a power bus, a control bus, a status signal bus, and the like. However, for clarity of description, the various buses are all labeled as the bus system 1550 in the figure.
Optionally, the encoding device 150 may further include one or more output devices, such as a display 1570. In one example, the display 1570 may be a touch-sensitive display that merges the display with a touch-sensing unit operable to sense touch input. The display 1570 may be connected to the processor 1510 through the bus 1550.
Those skilled in the art will appreciate that the functions described in connection with the various illustrative logical blocks, modules, and algorithm steps disclosed herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions described by the various illustrative logical blocks, modules, and steps may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. The computer-readable medium may include a computer-readable storage medium, which corresponds to a tangible medium such as a data storage medium, or a communication medium that includes any medium facilitating transfer of a computer program from one place to another (for example, according to a communication protocol). In this manner, the computer-readable medium may generally correspond to (1) a non-transitory tangible computer-readable storage medium, or (2) a communication medium such as a signal or carrier wave. The data storage medium may be any available medium that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in this application. A computer program product may include a computer-readable medium.
By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, DVD, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The instructions may be executed by one or more processors, such as one or more digital signal processors (DSP), general-purpose microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), or other equivalent integrated or discrete logic circuits. Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functions described by the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Moreover, the techniques may be fully implemented in one or more circuits or logic elements. In one example, the various illustrative logical blocks, units, and modules in the encoder 100 and the decoder 200 may be understood as corresponding circuit devices or logic elements.
The techniques of this application may be implemented in a wide variety of apparatuses or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset). Various components, modules, or units are described in this application to emphasize functional aspects of apparatuses configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit in conjunction with suitable software and/or firmware, or provided by interoperating hardware units (including one or more processors as described above).
The foregoing descriptions are merely exemplary specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (24)

  1. A point cloud encoding method, wherein a point cloud group comprises N point cloud frames, N≥2, N is an integer, and the method comprises:
    obtaining globally matched patches of the N point cloud frames;
    determining M union patch occupancy maps corresponding to M sets, wherein each of the M sets comprises N globally matched patches, the N globally matched patches are patches having a matching relationship across the N point cloud frames, the union patch occupancy map corresponding to the m-th set of the M sets is a union of occupancy maps of the globally matched patches in the m-th set, 1≤m≤M, and both m and M are integers;
    packing the M union patch occupancy maps to obtain a global occupancy map, wherein the global occupancy map is used to determine positions of the M union patch occupancy maps in the global occupancy map;
    packing, by using the global occupancy map, occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames to obtain occupancy maps of the N point cloud frames; and
    encoding the N point cloud frames based on the occupancy maps of the N point cloud frames.
  2. The point cloud encoding method according to claim 1, wherein the position of the occupancy map of the m-th globally matched patch in the n-th point cloud frame of the N point cloud frames in the occupancy map of the n-th point cloud frame corresponds to the position of the m-th union patch occupancy map of the M union patch occupancy maps in the global occupancy map, 1≤n≤N, and n is an integer.
  3. The point cloud encoding method according to claim 2, wherein the packing, by using the global occupancy map, occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames to obtain occupancy maps of the N point cloud frames comprises:
    determining, based on the position of the m-th union patch occupancy map in the global occupancy map, the position of the occupancy map of the m-th globally matched patch in the n-th point cloud frame in the occupancy map of the n-th point cloud frame; and
    packing, based on positions of occupancy maps of some or all globally matched patches in the n-th point cloud frame in the occupancy map of the n-th point cloud frame, the occupancy maps of the globally matched patches and the non-globally matched patches in the n-th point cloud frame to obtain the occupancy map of the n-th point cloud frame, wherein the some or all globally matched patches comprise the m-th globally matched patch.
  4. The point cloud encoding method according to claim 3, wherein the packing, based on the positions of the occupancy maps of the some or all globally matched patches in the n-th point cloud frame in the occupancy map of the n-th point cloud frame, the occupancy maps of the globally matched patches and the non-globally matched patches in the n-th point cloud frame to obtain the occupancy map of the n-th point cloud frame comprises:
    mapping, based on the positions of the occupancy maps of the some or all globally matched patches in the occupancy map of the n-th point cloud frame, the occupancy maps of the some or all globally matched patches into an initial occupancy map of the n-th point cloud frame; and
    packing, based on the initial occupancy map of the n-th point cloud frame into which the occupancy maps of the some or all patches have been mapped, the occupancy maps of the patches in the n-th point cloud frame other than the some or all globally matched patches to obtain the occupancy map of the n-th point cloud frame.
  5. The point cloud encoding method according to claim 3 or 4, wherein the occupancy map of a non-globally matched patch in the n-th point cloud frame occupies a preset position range, or does not occupy the preset position range, wherein the preset position range is a position range that belongs to the union patch occupancy map corresponding to the set to which a globally matched patch in the n-th point cloud frame belongs, but does not belong to the occupancy map of the globally matched patch in the n-th point cloud frame.
  6. The point cloud encoding method according to any one of claims 1 to 5, wherein the packing, by using the global occupancy map, occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames to obtain occupancy maps of the N point cloud frames comprises:
    when the difference between the maximum size of pre-occupancy maps of a first part of the N point cloud frames and the size of the global occupancy map is within a preset range, packing, based on the global occupancy map, the occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames.
  7. The point cloud encoding method according to any one of claims 1 to 5, wherein the packing, by using the global occupancy map, occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames to obtain occupancy maps of the N point cloud frames comprises:
    when the difference between the maximum size of pre-occupancy maps of a first part of the N point cloud frames and the maximum size of pre-occupancy maps of a second part of the N point cloud frames is within a preset range, packing, based on the global occupancy map, the occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames.
  8. The point cloud encoding method according to any one of claims 1 to 7, wherein the point cloud group is a group of frames (GOF), or the point cloud group is a sub-GOF, the sub-GOF comprising temporally consecutive point cloud frames in one GOF; and the obtaining globally matched patches of the N point cloud frames comprises:
    obtaining a reference point cloud, wherein the reference point cloud is any point cloud frame in the GOF; and
    for the i-th patch in the reference point cloud, if each non-reference point cloud in the GOF contains a patch matching a target patch, determining that the i-th patch and the patches matching the target patch are globally matched patches, wherein the target patch is the i-th patch or the target patch is a matched patch of the i-th patch, and the i-th patch is any patch in the reference point cloud.
  9. The point cloud encoding method according to claim 8, wherein
    when the point cloud group is the GOF, the reference point cloud is the first point cloud frame in the GOF;
    or, when the point cloud group is the sub-GOF, the reference point cloud is the first point cloud frame in the sub-GOF or any point cloud frame in the preceding sub-GOF of the sub-GOF.
  10. The point cloud encoding method according to claim 9, wherein the reference point cloud is the last point cloud frame in the preceding sub-GOF of the sub-GOF.
  11. The point cloud encoding method according to claim 10, wherein the last point cloud frame is a point cloud obtained after a packing operation has been performed.
  12. An encoder, wherein a point cloud group comprises N point cloud frames, N≥2, N is an integer, and the encoder comprises a packing module and an encoding module, wherein:
    the packing module is configured to: obtain globally matched patches of the N point cloud frames;
    determine M union patch occupancy maps corresponding to M sets, wherein each of the M sets comprises N globally matched patches, the N globally matched patches are patches having a matching relationship across the N point cloud frames, the union patch occupancy map corresponding to the m-th set of the M sets is a union of occupancy maps of the globally matched patches in the m-th set, 1≤m≤M, and both m and M are integers;
    pack the M union patch occupancy maps to obtain a global occupancy map, wherein the global occupancy map is used to determine positions of the M union patch occupancy maps in the global occupancy map; and
    pack, by using the global occupancy map, occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames to obtain occupancy maps of the N point cloud frames; and
    the encoding module is configured to encode the N point cloud frames based on the occupancy maps of the N point cloud frames.
  13. The encoder according to claim 12, wherein the position of the occupancy map of the m-th globally matched patch in the n-th point cloud frame of the N point cloud frames in the occupancy map of the n-th point cloud frame corresponds to the position of the m-th union patch occupancy map of the M union patch occupancy maps in the global occupancy map, 1≤n≤N, and n is an integer.
  14. The encoder according to claim 13, wherein the packing module is specifically configured to:
    determine, based on the position of the m-th union patch occupancy map in the global occupancy map, the position of the occupancy map of the m-th globally matched patch in the n-th point cloud frame in the occupancy map of the n-th point cloud frame; and
    pack, based on positions of occupancy maps of some or all globally matched patches in the n-th point cloud frame in the occupancy map of the n-th point cloud frame, the occupancy maps of the globally matched patches and the non-globally matched patches in the n-th point cloud frame to obtain the occupancy map of the n-th point cloud frame, wherein the some or all globally matched patches comprise the m-th globally matched patch.
  15. The encoder according to claim 14, wherein the packing module is specifically configured to:
    map, based on the positions of the occupancy maps of the some or all globally matched patches in the occupancy map of the n-th point cloud frame, the occupancy maps of the some or all globally matched patches into an initial occupancy map of the n-th point cloud frame; and
    pack, based on the initial occupancy map of the n-th point cloud frame into which the occupancy maps of the some or all patches have been mapped, the occupancy maps of the patches in the n-th point cloud frame other than the some or all globally matched patches to obtain the occupancy map of the n-th point cloud frame.
  16. The encoder according to claim 14 or 15, wherein the occupancy map of a non-globally matched patch in the n-th point cloud frame occupies a preset position range, or does not occupy the preset position range, wherein the preset position range is a position range that belongs to the union patch occupancy map corresponding to the set to which a globally matched patch in the n-th point cloud frame belongs, but does not belong to the occupancy map of the globally matched patch in the n-th point cloud frame.
  17. The encoder according to any one of claims 12 to 16, wherein
    the packing module is specifically configured to: when the difference between the maximum size of pre-occupancy maps of a first part of the N point cloud frames and the size of the global occupancy map is within a preset range, pack, based on the global occupancy map, the occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames.
  18. The encoder according to any one of claims 12 to 16, wherein
    the packing module is specifically configured to: when the difference between the maximum size of pre-occupancy maps of a first part of the N point cloud frames and the maximum size of pre-occupancy maps of a second part of the N point cloud frames is within a preset range, pack, based on the global occupancy map, the occupancy maps of the globally matched patches and non-globally matched patches in each of the N point cloud frames.
  19. The encoder according to any one of claims 12 to 18, wherein the point cloud group is a group of frames (GOF), or the point cloud group is a sub-GOF, the sub-GOF comprising temporally consecutive point cloud frames in one GOF; and the packing module is specifically configured to:
    obtain a reference point cloud, wherein the reference point cloud is any point cloud frame in the GOF; and
    for the i-th patch in the reference point cloud, if each non-reference point cloud in the GOF contains a patch matching a target patch, determine that the i-th patch and the patches matching the target patch are globally matched patches, wherein the target patch is the i-th patch or the target patch is a matched patch of the i-th patch, and the i-th patch is any patch in the reference point cloud.
  20. The encoder according to claim 19, wherein
    when the point cloud group is the GOF, the reference point cloud is the first point cloud frame in the GOF;
    or, when the point cloud group is the sub-GOF, the reference point cloud is the first point cloud frame in the sub-GOF or any point cloud frame in the preceding sub-GOF of the sub-GOF.
  21. The encoder according to claim 20, wherein the reference point cloud is the last point cloud frame in the preceding sub-GOF of the sub-GOF.
  22. The encoder according to claim 21, wherein the last point cloud frame is a point cloud obtained after a packing operation has been performed.
  23. An encoding apparatus, comprising a memory and a processor, wherein the memory is configured to store program code, and the processor is configured to invoke the program code to perform the point cloud encoding method according to any one of claims 1 to 11.
  24. A computer-readable storage medium, comprising program code, wherein when the program code is run on a computer, the point cloud encoding method according to any one of claims 1 to 11 is performed.
PCT/CN2019/103124 2018-09-19 2019-08-28 Point cloud encoding method and encoder WO2020057338A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19861619.5A EP3849188A4 (en) 2018-09-19 2019-08-28 POINT CLOUD CODING PROCESS AND ENCODER
US17/205,100 US11875538B2 (en) 2018-09-19 2021-03-18 Point cloud encoding method and encoder

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201811120300.8 2018-09-19
CN201811120300 2018-09-19
CN201811121017.7A CN110944187B (zh) 2018-09-19 2018-09-25 Point cloud encoding method and encoder
CN201811121017.7 2018-09-25

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/205,100 Continuation US11875538B2 (en) 2018-09-19 2021-03-18 Point cloud encoding method and encoder

Publications (1)

Publication Number Publication Date
WO2020057338A1 true WO2020057338A1 (zh) 2020-03-26

Family

ID=69888262

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103124 WO2020057338A1 (zh) 2019-08-28 Point cloud encoding method and encoder

Country Status (1)

Country Link
WO (1) WO2020057338A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110205338A1 (en) * 2010-02-24 2011-08-25 Samsung Electronics Co., Ltd. Apparatus for estimating position of mobile robot and method thereof
CN104298998A (zh) * 2014-09-28 2015-01-21 北京理工大学 A data processing method for 3D point clouds
CN107240129A (zh) * 2017-05-10 2017-10-10 同济大学 Object and indoor small-scene recovery and modeling method based on RGB-D camera data
US20170347120A1 (en) * 2016-05-28 2017-11-30 Microsoft Technology Licensing, Llc Motion-compensated compression of dynamic voxelized point clouds


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3849188A4 *

Similar Documents

Publication Publication Date Title
US20210029381A1 (en) Method and apparatus for obtaining global matched patch
US11704837B2 (en) Point cloud encoding method, point cloud decoding method, encoder, and decoder
CN110944187B (zh) Point cloud encoding method and encoder
JP2022524785A (ja) ポイントクラウドジオメトリパディング
US11388442B2 (en) Point cloud encoding method, point cloud decoding method, encoder, and decoder
US11961265B2 (en) Point cloud encoding and decoding method and apparatus
WO2020011265A1 (zh) Point cloud encoding and decoding method and codec
WO2020151496A1 (zh) Point cloud encoding and decoding method and apparatus
WO2020063294A1 (zh) Point cloud encoding and decoding method and codec
WO2020147379A1 (zh) Point cloud filtering method, apparatus, and storage medium
US11935269B2 (en) Point cloud encoding method and encoder
US20220007037A1 (en) Point cloud encoding method and apparatus, point cloud decoding method and apparatus, and storage medium
WO2020063718A1 (zh) Point cloud encoding and decoding method and codec
WO2020119509A1 (zh) Point cloud encoding and decoding method and codec
WO2020057338A1 (zh) Point cloud encoding method and encoder
WO2022257150A1 (zh) Point cloud encoding and decoding method and apparatus, point cloud codec, and storage medium
CN112017292A (zh) Mesh coding method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19861619

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019861619

Country of ref document: EP

Effective date: 20210406