WO2023202095A1 - Point cloud media encoding method and apparatus, point cloud media decoding method and apparatus, and electronic device and storage medium - Google Patents

Point cloud media encoding method and apparatus, point cloud media decoding method and apparatus, and electronic device and storage medium

Info

Publication number
WO2023202095A1
WO2023202095A1 PCT/CN2022/137764 CN2022137764W WO2023202095A1 WO 2023202095 A1 WO2023202095 A1 WO 2023202095A1 CN 2022137764 W CN2022137764 W CN 2022137764W WO 2023202095 A1 WO2023202095 A1 WO 2023202095A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
sample
sub
subframe
data
Prior art date
Application number
PCT/CN2022/137764
Other languages
French (fr)
Chinese (zh)
Inventor
胡颖
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Publication of WO2023202095A1 publication Critical patent/WO2023202095A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation

Definitions

  • This application belongs to the field of audio and video technology, and specifically relates to a point cloud media encoding method, a point cloud media decoding method, a point cloud media encoding device, a point cloud media decoding device, a computer-readable medium, an electronic device and a computer program product.
  • A point cloud is a set of discrete points randomly distributed in space that expresses the spatial structure and surface properties of a three-dimensional object or scene. After large-scale point cloud data is obtained through point cloud acquisition equipment, the point cloud data can be encoded and encapsulated for transmission, decoding and presentation to users. Some point cloud frames in point cloud media contain little content, and some point cloud frames have overlapping point cloud content. Encoding and decoding each point cloud frame separately therefore wastes computing resources and reduces point cloud media processing efficiency.
  • A point cloud media encoding method, a point cloud media decoding method, a point cloud media encoding device, a point cloud media decoding device, a computer-readable medium, an electronic device, and a computer program product are provided.
  • a method for decoding point cloud media is provided, which is executed by an electronic device, including:
  • the point cloud media file including point cloud samples encapsulated in one or more tracks;
  • the point cloud media file is decapsulated and decoded according to the index information of the one or more point cloud subframes to obtain point cloud data.
  • a method for encoding point cloud media is provided, which is executed by an electronic device, including:
  • the point cloud source data includes a point cloud frame having one or more point cloud subframes
  • the at least one data unit is encapsulated to obtain a point cloud media file.
  • the point cloud media file includes point cloud samples encapsulated in one or more tracks; the media file data box of each subsample in the point cloud sample includes a subframe index field;
  • the subframe index field is used to indicate index information of one or more point cloud subframes corresponding to each data unit in the subsample; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
  • a point cloud media decoding device including:
  • An acquisition module configured to acquire point cloud media files, where the point cloud media files include point cloud samples encapsulated in one or more tracks;
  • a parsing module configured to parse the media file data box of each sub-sample in the point cloud sample and obtain the value of the sub-sample flag field
  • An index module configured to obtain the index information of one or more point cloud subframes corresponding to each data unit in the subsample according to the value of the subsample flag field; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data;
  • the decoding module is configured to decapsulate and decode the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
  • a point cloud media encoding device including:
  • An acquisition module configured to acquire point cloud source data, where the point cloud source data includes a point cloud frame having one or more point cloud subframes;
  • An encoding module configured to encode the point cloud frame to obtain at least one data unit
  • An encapsulation module configured to encapsulate the at least one data unit to obtain a point cloud media file, where the point cloud media file includes point cloud samples encapsulated in one or more tracks;
  • the media file data box of each subsample includes a subframe index field; the subframe index field is used to indicate the index information of one or more point cloud subframes corresponding to each data unit in the subsample; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
  • a computer-readable medium on which computer-readable instructions are stored.
  • When the computer-readable instructions are executed by a processor, the encoding and decoding method of point cloud media in the above technical solution is implemented.
  • an electronic device comprising: a processor; and a memory for storing computer-readable instructions of the processor; wherein the processor is configured to execute the computer-readable instructions to perform the point cloud media encoding and decoding method in the above technical solution.
  • a computer program product includes computer-readable instructions, and the computer-readable instructions are stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer-readable instructions from the computer-readable storage medium, and the processor executes the computer-readable instructions, so that the computer device performs the encoding and decoding method of point cloud media as in the above technical solution.
  • Figure 1 schematically shows an exemplary system architecture block diagram applying the technical solution of the present application.
  • Figure 2 shows a schematic diagram of the point cloud media encoding and decoding process in an application scenario according to the embodiment of the present application.
  • Figure 3 shows the syntax structure of encapsulating point cloud samples based on TLV code stream format in one embodiment of the present application.
  • Figure 4 shows the syntax structure of a data unit encapsulated based on the TLV code stream format in one embodiment of the present application.
  • Figure 5 shows a schematic diagram of the principle of multi-frame combination of point cloud data in one embodiment of the present application.
  • Figure 6 shows a flow chart of the steps of the point cloud media decoding method in one embodiment of the present application.
  • Figure 7 shows an exemplary structure of encapsulating point cloud samples in a single track according to one embodiment of the present application.
  • Figure 8 shows an exemplary structure of encapsulating geometry code streams and attribute code streams in multiple tracks according to one embodiment of the present application.
  • Figure 9 shows the syntax structure of the coding-related parameter field codec_specific_parameters of the SubSampleInformationBox data box in an application scenario according to the embodiment of the present application.
  • Figure 10 shows the syntax structure of the coding-related parameter field codec_specific_parameters of the SubSampleInformationBox data box in the application scenario of uniform identification of subframe overlap/non-overlap according to the embodiment of the present application.
  • Figure 11 shows the syntax structure of the extended sample group tool in an application scenario according to the embodiment of the present application.
  • Figure 12 shows the syntax structure of identifying point cloud subframe related information based on the point cloud sample level media file data box in an application scenario according to the embodiment of the present application.
  • Figure 13 shows the syntax structure of identifying subframe related information through the subsample subframe information data box SubsampleSubframeInfoBox in an application scenario according to the embodiment of the present application.
  • Figure 14 shows the syntax structure of identifying subframe presentation time information through the subsample subframe information data box SubsampleSubframeInfoBox in an application scenario according to the embodiment of the present application.
  • Figure 15 shows the syntax structure of identifying subframe presentation time information through an extended media file data box in an application scenario according to an embodiment of the present application.
  • Figure 16 shows a structural block diagram of determining the spatial block correspondence based on three sub-sample division methods of data unit, spatial block and point cloud subframe in one embodiment of the present application.
  • Figure 17 shows a structural block diagram for determining the spatial block correspondence based on two sub-sample division methods of data unit and spatial block in an embodiment of the present application.
  • Figure 18 shows the syntax structure of the embodiment of the present application indicating the correspondence between point cloud subframes and spatial blocks through media file data boxes in an application scenario.
  • Figure 19 shows a step flow chart of a point cloud media encoding method in one embodiment of the present application.
  • Figure 20 shows a flow chart of encoding and decoding point cloud data in a streaming media transmission application scenario according to an embodiment of the present application.
  • Figure 21 shows the encapsulation result of single-track encapsulation of point cloud samples in an application scenario where point cloud subframes do not overlap with each other according to the embodiment of the present application.
  • Figure 22 shows the encapsulation result of not dividing attribute tracks into sub-samples when multi-track encapsulation of point cloud samples is performed in an application scenario where point cloud sub-frames do not overlap with each other according to the embodiment of the present application.
  • Figure 23 shows the encapsulation result of attribute track division sub-samples when multi-track encapsulation of point cloud samples in an application scenario where point cloud sub-frames do not overlap with each other according to the embodiment of the present application.
  • Figure 24 shows the encapsulation result of single-track encapsulation of point cloud samples in an application scenario where point cloud subframes overlap in an embodiment of the present application.
  • Figure 25 schematically shows a structural block diagram of a point cloud media decoding device provided by an embodiment of the present application.
  • Figure 26 schematically shows a structural block diagram of a point cloud media encoding device provided by an embodiment of the present application.
  • FIG. 27 schematically shows a structural block diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments may, however, be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concepts of the example embodiments to those skilled in the art.
  • "Plurality" as mentioned herein means two or more.
  • "And/or" describes the relationship between related objects and indicates that three relationships are possible.
  • For example, "A and/or B" can mean: A exists alone, A and B exist simultaneously, or B exists alone.
  • The character "/" generally indicates that the related objects are in an "or" relationship.
  • This application involves user-related data such as point cloud media transmission content, decoding content, and consumption content.
  • User permission or consent is required for such data, and the collection, use and processing of relevant data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
  • Immersive media: media content that can bring an immersive experience to consumers.
  • Immersive media can be divided into 3DoF media, 3DoF+ media and 6DoF media according to the user's degree of freedom when consuming media content.
  • Point cloud media is a typical 6DoF media.
  • DoF: Degree of Freedom. In this application, it refers to the degrees of freedom a user has to move and interact with content while viewing immersive media.
  • 3DoF: three degrees of freedom, referring to the rotation of the user's head around the x, y, and z axes.
  • 3DoF+: in addition to the three rotational degrees of freedom, the user also has limited freedom of movement along the x, y, and z axes.
  • 6DoF: in addition to the three rotational degrees of freedom, the user can also move freely along the x, y, and z axes.
  • Point cloud is a set of discrete points randomly distributed in space that expresses the spatial structure and surface properties of a three-dimensional object or scene. Each point in the point cloud has at least three-dimensional position information. Depending on the application scenario, it may also have color, material or other information. Typically, each point in a point cloud has the same number of additional attributes.
  • PCC: Point Cloud Compression.
  • G-PCC: Geometry-based Point Cloud Compression, i.e. point cloud compression based on a geometric model.
  • Sample: the encapsulation unit in the media file encapsulation process.
  • A media file consists of many samples. Taking video media as an example, a sample of video media is usually a video frame.
  • Point cloud slice: a set of syntax elements (such as geometry slices and attribute slices) representing part or all of the data of an encoded point cloud frame.
  • Tile: a point cloud spatial tile.
  • DASH: Dynamic Adaptive Streaming over HTTP, an adaptive bitrate streaming technology that enables high-quality streaming media to be delivered over the Internet through traditional HTTP web servers.
  • Point cloud media can be divided into point cloud media compressed with traditional video coding methods (Video-based Point Cloud Compression, VPCC) and point cloud media compressed based on geometric features (Geometry-based Point Cloud Compression, GPCC).
  • In a point cloud media file, the three-dimensional position information is usually called the geometry component, and the attribute information is called the attribute component.
  • A point cloud media file has only one geometry component, but can have one or more attribute components.
  • Point clouds can flexibly and conveniently express the spatial structure and surface properties of three-dimensional objects or scenes, so they are widely used. Their main application scenarios fall into two categories. 1) Machine-perceived point clouds, such as Computer Aided Design (CAD), Autonomous Navigation System (ANS), real-time inspection systems, Geography Information System (GIS), visual sorting robots, and rescue and disaster relief robots. 2) Human-eye-perceived point clouds, such as virtual reality (VR) games, digital cultural heritage, free-viewpoint broadcasting, three-dimensional immersive communication, and three-dimensional immersive interaction.
  • The main ways to obtain point clouds are computer generation, 3D laser scanning, 3D photogrammetry, etc.
  • Computers can generate point clouds of virtual three-dimensional objects and scenes.
  • 3D scanning can obtain point clouds of static real-world three-dimensional objects or scenes, and millions of points can be obtained per second.
  • 3D photography can obtain point clouds of dynamic real-world three-dimensional objects or scenes, and tens of millions of points can be obtained per second.
  • point clouds of biological tissues and organs can be obtained from MRI, CT, and electromagnetic positioning information.
  • Figure 1 shows a schematic diagram of an exemplary system architecture to which the technical solution of the embodiment of the present application can be applied.
  • The system architecture 100 includes a plurality of terminal devices that can communicate with each other through, for example, the network 150.
  • The system architecture 100 may include a first terminal device 110 and a second terminal device 120 interconnected through the network 150.
  • the first terminal device 110 and the second terminal device 120 perform one-way data transmission.
  • the first terminal device 110 may encode point cloud data (such as point cloud data collected by the terminal device 110) for transmission to the second terminal device 120 through the network 150, and the encoded point cloud data is transmitted in the form of one or more code streams.
  • the second terminal device 120 can receive the encoded point cloud data from the network 150, decode the encoded point cloud data to restore the point cloud data, and display the point cloud content according to the restored point cloud data.
  • the system architecture 100 may include a third terminal device 130 and a fourth terminal device 140 that perform bidirectional transmission of encoded point cloud data, which bidirectional transmission may occur during a video conference, for example.
  • each of the third terminal device 130 and the fourth terminal device 140 may encode point cloud data (e.g., point cloud data collected by the terminal device) for transmission through the network 150 to the other of the third terminal device 130 and the fourth terminal device 140.
  • Each of the third terminal device 130 and the fourth terminal device 140 may also receive the encoded point cloud data transmitted by the other terminal device, decode the encoded point cloud data to recover the point cloud data, and display the point cloud content on an accessible display device based on the recovered point cloud data.
  • the first terminal device 110, the second terminal device 120, the third terminal device 130 and the fourth terminal device 140 may be servers, personal computers or smartphones, but the principles disclosed in this application are not limited thereto. Embodiments disclosed herein are also suitable for use with laptops, tablets, media players, and/or dedicated video conferencing devices.
  • the network 150 represents any number of networks that transmit encoded point cloud data between the first terminal device 110, the second terminal device 120, the third terminal device 130 and the fourth terminal device 140, including, for example, wired and/or wireless communication networks.
  • Communication network 150 may exchange data in circuit-switched and/or packet-switched channels.
  • the network may include telecommunications networks, local area networks, wide area networks, and/or the Internet. For purposes of this application, unless explained below, the architecture and topology of network 150 may be immaterial to the operations disclosed herein.
  • the server in the embodiment of this application may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services.
  • the terminal can be a smartphone, tablet, laptop, desktop computer, smart speaker, smart watch, vehicle terminal, smart TV, etc., but is not limited to this.
  • the terminal and the server can be connected directly or indirectly through wired or wireless communication methods, which is not limited in this application.
  • the encoded data stream needs to be encapsulated and transmitted to the user.
  • the point cloud file needs to be decapsulated first, then decoded, and finally the decoded data stream is presented.
  • Figure 2 shows a schematic diagram of the point cloud media encoding and decoding process in an application scenario according to the embodiment of the present application.
  • the real-world visual scene A can be captured by collecting point cloud data through the collection device 210.
  • the collection device 210 may be, for example, a set of cameras or a camera device with multiple lenses and sensors.
  • the collection result is point cloud source data B, which is a frame sequence composed of a large number of point cloud frames.
  • One or more point cloud frames may be encoded by the encoder 220 to obtain an encoded G-PCC bit stream, which may specifically include an encoded geometry bit stream and an attribute bit stream E.
  • the file encapsulator 230 can encapsulate one or more encoded bit streams according to a specific media container file format to obtain a media file F for file playback or a series of initialization segments and media segments Fs for streaming transmission.
  • the media container file format may be, for example, the ISO basic media file format specified in ISO/IEC 14496-12 [ISOBMFF].
  • File encapsulator 230 may also encapsulate metadata in media files F or media segments Fs.
  • the media file F output by the file encapsulator 230 is the same as the media file F′ input to the file decapsulator 240.
  • the file decapsulator can extract the encoded bit stream E' and parse the metadata by processing the media file F' or processing the received media fragments F's.
  • the decoder 250 may decode the G-PCC bit stream into a decoded signal D' and generate point cloud data according to the decoded signal D'.
  • point cloud data may be rendered by the renderer 260 and displayed on the screen of a head-mounted display or any other display device, based on the current viewing position, viewing direction, or viewport determined by various types of sensors (e.g., head).
  • the current viewing position or viewing direction can also be used for decoding optimization.
  • the current viewing position and viewing direction are also passed to the policy module, which can be used to determine which track to receive.
  • streaming transmission technology is usually used to handle the transmission of media resources between the server and the client.
  • Common media streaming transmission technologies include DASH (Dynamic Adaptive Streaming over HTTP), HLS (HTTP Live Streaming), SMT (Smart Media Transport) and other technologies.
  • DASH is an adaptive bitrate streaming technology that enables high-quality streaming media to be delivered over the Internet through traditional HTTP web servers.
  • DASH breaks the content into a series of small HTTP-based file fragments, each fragment contains a short length of playable content, and the total length of the content may be several hours (such as a movie or live sports event).
  • Content will be cut into multiple bitrate alternatives to provide multiple bitrate versions for selection.
  • When media content is played by a DASH client, the client automatically selects which alternative to download and play based on current network conditions. The client selects for playback the highest-bitrate segment that can be downloaded in time, thus avoiding playback stutters or rebuffering events. Because of this, the DASH client can seamlessly adapt to changing network conditions and provide a high-quality playback experience with less lag and rebuffering.
  • DASH uses existing HTTP web server infrastructure. It allows devices such as Internet TVs, TV set-top boxes, desktop computers, smartphones, tablets and other devices to consume multimedia content (such as videos, TV, radio, etc.) transmitted through the Internet, and can cope with changing Internet reception conditions.
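  • The selection rule described above (play the highest-bitrate alternative that can still be downloaded in time) can be illustrated with a minimal sketch. The names `Representation`, `pick_representation` and the `safety_factor` margin are hypothetical and are not part of DASH or of this application; real clients use more elaborate throughput and buffer estimation.

```python
from dataclasses import dataclass


@dataclass
class Representation:
    """One bitrate alternative of the same content (hypothetical model)."""
    name: str
    bitrate_bps: int  # average encoded bitrate of this alternative


def pick_representation(reps, measured_throughput_bps, safety_factor=0.8):
    """Return the highest-bitrate alternative that fits the throughput budget.

    Falls back to the lowest-bitrate alternative when none fits.
    safety_factor is an assumed margin, not a value from any standard.
    """
    budget = measured_throughput_bps * safety_factor
    affordable = [r for r in reps if r.bitrate_bps <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.bitrate_bps)
    return min(reps, key=lambda r: r.bitrate_bps)


reps = [
    Representation("low", 1_000_000),
    Representation("mid", 3_000_000),
    Representation("high", 8_000_000),
]
choice = pick_representation(reps, measured_throughput_bps=5_000_000)
print(choice.name)
```

  • Re-running the selection for every segment with a fresh throughput estimate is what lets a client of this kind follow changing network conditions.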
  • each G-PCC point cloud sample corresponds to a point cloud frame and consists of one or more G-PCC data units belonging to the same presentation time.
  • FIG. 3 shows the syntax structure of encapsulating point cloud samples based on TLV code stream format in one embodiment of the present application.
  • each point cloud sample consists of one or more data units G-PCC unit.
  • gpcc_unit contains a single G-PCC data unit.
  • G-PCC data units in the same point cloud sample correspond to the same point cloud frame and belong to the same presentation time.
  • TLV code stream format namely Type-length-value bytestream format, refers to a structure composed of data type Type, data length Length and data value Value.
  • Figure 4 shows the syntax structure of a data unit encapsulated based on the TLV code stream format in one embodiment of the present application.
  • tlv_type is a type field used to indicate the type of data unit.
  • Table 1 shows the semantic description of different values of the data unit type field in an embodiment of the present application.
  • the type field with different values can be used to indicate different data unit types.
  • the type of the data unit is the geometry parameter set GPS (Geometry Parameter Set).
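  • To make the type-length-value structure concrete, the following is a minimal sketch of walking a byte stream of TLV data units. It assumes a 1-byte type field followed by a 4-byte big-endian length field; those widths, the helper names `iter_tlv_units` and `make_tlv`, and the type codes used in the example are assumptions for illustration and are not taken from Figure 4 or Table 1.

```python
import struct

HEADER = ">BI"  # assumed layout: 1-byte type, 4-byte big-endian payload length
HEADER_SIZE = struct.calcsize(HEADER)  # 5 bytes


def make_tlv(tlv_type: int, payload: bytes) -> bytes:
    """Serialize one data unit as type + length + value."""
    return struct.pack(HEADER, tlv_type, len(payload)) + payload


def iter_tlv_units(buf: bytes):
    """Yield (tlv_type, payload) for each data unit found in buf."""
    offset = 0
    while offset < len(buf):
        if offset + HEADER_SIZE > len(buf):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from(HEADER, buf, offset)
        offset += HEADER_SIZE
        if offset + length > len(buf):
            raise ValueError("truncated TLV payload")
        yield tlv_type, buf[offset:offset + length]
        offset += length


# Illustrative type codes only; the real values are defined by Table 1.
sample = make_tlv(1, b"abc") + make_tlv(2, b"")
units = list(iter_tlv_units(sample))
print(units)
```

  • Because every unit carries its own length, a reader can skip data units whose type it does not need without parsing their payloads.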
  • each point cloud frame of the point cloud media is encoded separately, which will lead to the following two problems in the encoding and decoding process of the point cloud media.
  • One issue is that in frame-based point cloud content, the file size of each frame may be relatively small, which is less efficient for the I/O interface.
  • Another problem is that the decoder needs to run from the initial bounding box and divide each single frame. There is a lot of overhead in initializing the decoder in edge devices.
  • the first problem can be easily solved by concatenating the encoded bitstream of consecutive frames.
  • the second problem is difficult to avoid unless the point clouds are combined before encoding.
  • the embodiment of this application proposes a combined frame coding solution to solve these two problems by introducing frame index coding into the combined point cloud.
  • the embodiments of the present application can greatly improve the coding efficiency, and are therefore also beneficial to the storage and use of frame-based point cloud content.
  • Figure 5 shows a schematic diagram of the principle of multi-frame combination of point cloud data in one embodiment of the present application.
  • a combined frame can be formed by combining the first point cloud frame Frame1 and the second point cloud frame Frame2.
  • each frame of the newly obtained sequence will contain multiple point cloud subframes.
  • A point cloud subframe is composed of points with the same frame number or frame index attribute value and is a partial representation of a point cloud frame.
  • The octrees of the individual point cloud frames have similar structures at the higher levels; in the leaf nodes of the combined frame, there is also some duplicate content from different frames, that is, overlapping point cloud data.
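  • The idea of a combined frame can be sketched as follows: each point is tagged with the index of the frame it came from, subframes are recovered by grouping on that index, and positions shared by several frames are the overlapping data. This is a simplified illustration on plain coordinate tuples, not the octree-based coding of the application; the function names are hypothetical.

```python
from collections import defaultdict


def combine_frames(frames):
    """Merge frames into one list of (x, y, z, frame_index) points."""
    combined = []
    for frame_index, points in enumerate(frames):
        for x, y, z in points:
            combined.append((x, y, z, frame_index))
    return combined


def split_subframes(combined):
    """Group the points of a combined frame by their frame index attribute."""
    subframes = defaultdict(list)
    for x, y, z, frame_index in combined:
        subframes[frame_index].append((x, y, z))
    return dict(subframes)


def overlapping_positions(combined):
    """Return positions that occur in more than one subframe."""
    seen = defaultdict(set)
    for x, y, z, frame_index in combined:
        seen[(x, y, z)].add(frame_index)
    return {pos: idx for pos, idx in seen.items() if len(idx) > 1}


frame1 = [(0, 0, 0), (1, 0, 0)]
frame2 = [(1, 0, 0), (2, 2, 2)]
combined = combine_frames([frame1, frame2])
print(split_subframes(combined))
print(overlapping_positions(combined))
```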
  • Figure 6 shows a step flow chart of the point cloud media decoding method in one embodiment of the present application. This method can be applied to various electronic devices in the server, client or intermediate node of the point cloud media system.
  • the present application takes a point cloud media decoding method executed by a client device installed with a point cloud decoding device as an example. As shown in Figure 6, the point cloud media decoding method includes the following steps S610 to S640.
  • Step S610: Obtain a point cloud media file, where the point cloud media file includes point cloud samples encapsulated in one or more tracks;
  • Step S620: Parse the media file data box of each subsample in the point cloud sample to obtain the value of the subsample flag field;
  • Step S630: Obtain the index information of one or more point cloud subframes corresponding to each data unit in the subsample according to the value of the subsample flag field; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
  • Step S640: Decapsulate and decode the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
  • Subsamples are data encapsulation units in point cloud samples.
  • the subsample flag field can also be used to indicate the division method of subsamples. Point cloud samples can be divided into different types of subsamples based on different division methods. For example, point cloud samples can be divided along different dimensions, such as data units, spatial blocks or point cloud subframes, to obtain subsamples with different data capacities.
  • a point cloud subframe is a partial representation of a point cloud frame consisting of points with the same index information (such as frame number or frame index attribute value).
  • the index information of the one or more point cloud subframes corresponding to each data unit in the subsample can be indicated.
  • one or more point cloud subframes corresponding to each data unit in the point cloud sample can be jointly decoded as a combined frame.
  • on the one hand, this can reduce the waste of computing resources caused by separately decoding point cloud frames with little content; on the other hand, it can identify point cloud subframes with overlapping point cloud data and improve the decoding efficiency of point cloud media.
  • step S610 a point cloud media file is obtained.
  • the point cloud media file includes point cloud samples encapsulated in one or more tracks.
  • the point cloud media file may be a media file or media segment obtained after encoding and encapsulation processing as shown in Figure 2.
  • the media file or media segment carries a point cloud code stream to be transmitted.
  • the data source can encapsulate the point cloud code stream into a single track based on the geometric parameter information, attribute parameter information and point cloud slice parameter information contained in the point cloud code stream, or it can repackage a single-track point cloud media file into a point cloud media file containing multiple tracks.
  • a track refers to a volumetric visual track used to carry a coded geometry bitstream or a coded attribute bitstream, or a volumetric visual track that carries both a coded geometry bitstream and a coded attribute bitstream.
  • each point cloud sample can correspond to a complete point cloud frame.
  • step S620 the media file data box of each sub-sample in the point cloud sample is analyzed to obtain the value of the sub-sample flag field.
  • the media file data box may be a data box based on the ISO basic media file format ISOBMFF (ISO Base Media File Format).
  • Figure 7 shows an exemplary structure of encapsulating point cloud samples in a single track according to one embodiment of the present application.
  • moov represents the metadata information of the point cloud sample; the metadata information includes various fields such as "trak", "stbl", "stsd", "gpe1", "gpcC", "xPS", "stsz" and "subs"; mdat represents the specific media data carried in the point cloud sample, including each point cloud sample.
  • Each point cloud subframe includes different data units, such as the geometry data unit, attribute data unit and frame index attribute data unit shown in the figure.
  • Figure 8 shows an exemplary structure of encapsulating geometry code streams and attribute code streams in multiple tracks according to one embodiment of the present application.
  • ftyp represents the file type and describes the version of the specification that the point cloud sample complies with
  • moov represents the metadata information of the point cloud sample
  • mdat represents the specific media data carried in the point cloud sample.
  • G-PCC geometry track contains at least one data unit G-PCC unit, which carries a single G-PCC component data unit instead of a geometry and attribute data unit or a multiplexing of different attribute data units.
  • G-PCC attribute tracks should not multiplex different attribute substreams, such as color and reflectivity.
  • the relevant information of the point cloud subframe can be identified by extending the encoding-related parameter field of the subsample information data box SubSampleInformationBox.
  • the subsample flag field is also used to indicate the division method of the subsamples, and the division method of each subsample in the point cloud sample may include:
  • the sub-samples are divided based on the spatial block, so that one sub-sample contains one or more continuous data units corresponding to a first division object, and the first division object includes at least one of a spatial block, a parameter set, spatial block set information or a frame boundary identification; and
  • the subsamples are divided based on the point cloud subframe, so that one subsample contains one or more continuous data units corresponding to a second division object, and the second division object consists of a complete point cloud subframe.
  • various subsample division methods, such as the data unit division method, the spatial block division method and the point cloud subframe division method, are distinguished through different values of the subsample flag field, so that subsamples can be divided according to different division methods. This facilitates decoding of data unit combinations in subsamples with different data capacities, which can reduce waste of computing resources and improve decoding processing efficiency.
  • the media file data box of the sub-sample includes:
  • the subframe index field is used to indicate the index information of the point cloud subframe contained in the current subsample. Therefore, the point cloud subframe can be determined through the index information of the point cloud subframe in the subframe index field, and one or more point cloud subframes corresponding to each data unit in the point cloud sample can be jointly decoded as a combined frame. On the one hand, this can reduce the waste of computing resources caused by separately decoding point cloud frames with less content. On the other hand, it can identify point cloud subframes with overlapping point cloud data and improve the decoding efficiency of point cloud media.
  • the subsample definition should be based on the value of the flag field in the SubSampleInformationBox data box.
  • Figure 9 shows the syntax structure of the coding-related parameter field codec_specific_parameters of the SubSampleInformationBox data box in an application scenario according to the embodiment of the present application.
  • subsample information data box SubSampleInformationBox can include the following fields:
  • payloadType indicating the tlv_type type of the G-PCC unit contained in the subsample
  • AttrIdx indicates the value of the ash_attr_sps_attr_idx field corresponding to the attribute data contained in the subsample.
  • a subsample contains one or more continuous data units corresponding to a spatial block tile, or one or more continuous data units corresponding to a parameter set, spatial block set information, or frame boundary identification.
  • subsample information data box SubSampleInformationBox can include the following fields:
  • tile_data when the value is 1, it means that the sub-sample contains the geometric data or attribute data of the corresponding tile; when the value is 0, it means that the sub-sample contains parameter set data, tile geometry information or frame boundary identification.
  • tile_id indicating the tile index number associated with the data in the subsample.
  • When the value of the subsample flag field flags is 2, it indicates that the subframe-based subsample division method is used.
  • a subsample contains continuous data units corresponding to a complete point cloud subframe.
  • subsample information data box SubSampleInformationBox can include the following fields:
  • the subframe index field subframe_idx indicates the value of the frame number attribute corresponding to the point cloud subframe contained in the current subsample.
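  • The field lists above can be read as three interpretations of one codec_specific_parameters word, selected by the value of flags. The sketch below is illustrative only: the text does not give bit widths, so the 32-bit layout, the bit positions, and the constant `ATTRIBUTE_DATA_UNIT = 4` are assumptions, and the mapping of flags value 1 to the spatial block division is inferred from the order of the division methods.

```python
ATTRIBUTE_DATA_UNIT = 4  # assumed tlv_type value for attribute data units


def parse_codec_specific_parameters(flags, value):
    """Decode a 32-bit codec_specific_parameters word for a given flags value.

    All bit positions used here are assumptions made for illustration.
    """
    if not 0 <= value < 1 << 32:
        raise ValueError("codec_specific_parameters must fit in 32 bits")
    if flags == 0:  # subsample division based on data units
        payload_type = value >> 24
        fields = {"payloadType": payload_type}
        if payload_type == ATTRIBUTE_DATA_UNIT:
            fields["attrIdx"] = (value >> 16) & 0xFF
        return fields
    if flags == 1:  # subsample division based on spatial blocks (tiles)
        tile_data = value >> 31
        fields = {"tile_data": tile_data}
        if tile_data == 1:
            fields["tile_id"] = value & 0xFFFFFF
        return fields
    if flags == 2:  # subsample division based on point cloud subframes
        return {"subframe_idx": value >> 16}
    raise ValueError(f"unsupported flags value: {flags}")


# An attribute data unit with attrIdx 1, a tile subsample for tile 7,
# and a subframe subsample for subframe 3.
attribute_unit = parse_codec_specific_parameters(0, (4 << 24) | (1 << 16))
tile_subsample = parse_codec_specific_parameters(1, (1 << 31) | 7)
subframe_subsample = parse_codec_specific_parameters(2, 3 << 16)
```

  • A reader of the file would first read flags from the SubSampleInformationBox and only then interpret codec_specific_parameters, since the same bits carry different fields under each division method.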
  • the division method of each subsample in the point cloud sample may include:
  • the sub-samples are divided based on the spatial block, so that one sub-sample contains one or more continuous data units corresponding to a first division object, and the first division object includes at least one of a spatial block, a parameter set, spatial block set information or a frame boundary identification; and
  • the subsamples are divided based on the point cloud subframe, so that one subsample contains one or more continuous data units corresponding to a second division object, and the second division object includes one or more point cloud subframes.
  • various subsample division methods, such as the data unit division method, the spatial block division method and the point cloud subframe division method, are distinguished, and the subframe-based division can be overlapping or non-overlapping. This facilitates decoding of combinations of data units in subsamples with different data capacities, which can reduce waste of computing resources and improve decoding processing efficiency.
  • the media file data box of the sub-sample includes:
  • the subframe complete flag field is used to indicate whether the current subsample contains all data that constitutes the point cloud subframe
  • the number of subframes field is used to indicate the number of point cloud subframes corresponding to the current subsample.
  • the subframe index field is used to indicate the index information of the point cloud subframe corresponding to the current subsample.
  • whether the current subsample contains all the data constituting the point cloud subframe is indicated through the subframe complete flag field, the number of point cloud subframes is indicated through the subframe number field, and the index information of the point cloud subframe is indicated through the subframe index field, so that the corresponding point cloud subframe can be described in detail through the media file data box of the subsample, and one or more point cloud subframes corresponding to each data unit in the point cloud sample can be jointly decoded as a combined frame. On the one hand, this can reduce the waste of computing resources caused by separately decoding point cloud frames with less content. On the other hand, point cloud subframes with overlapping point cloud data can be identified to improve the decoding efficiency of point cloud media.
  • when the point cloud sample is encapsulated in a single track, all data constituting the point cloud subframe includes all geometric data and all attribute data; when the point cloud sample is encapsulated in multiple tracks, all data constituting the point cloud subframe includes all geometric data or all attribute data.
  • Figure 10 shows the syntax structure of the coding-related parameter field codec_specific_parameters of the SubSampleInformationBox data box in the application scenario of uniform identification of subframe overlap/non-overlap according to the embodiment of the present application.
  • subsample information data box SubSampleInformationBox can include the following fields:
  • payloadType indicating the tlv_type type of the data unit contained in the subsample
  • AttrIdx indicates the value of the ash_attr_sps_attr_idx field corresponding to the attribute data contained in the subsample.
  • a subsample contains one or more continuous data units corresponding to a spatial block tile, or one or more continuous data units corresponding to a parameter set, spatial block set information, or frame boundary identification.
  • subsample information data box SubSampleInformationBox can include the following fields:
  • tile_data when the value is 1, it means that the sub-sample contains the geometric data or attribute data of the corresponding tile; when the value is 0, it means that the sub-sample contains parameter set data, tile geometry information or frame boundary identification.
  • tile_id indicating the tile index number associated with the data in the subsample.
  • a subsample contains one or more continuous data units, corresponding to one or more point cloud subframes.
  • subsample information data box SubSampleInformationBox can include the following fields:
  • Subframe complete flag field complete_subframe_flag: when the value is 1, it means that the data unit corresponding to the current subsample contains all the data that constitutes the corresponding subframe; when the value is 0, it means that the data unit corresponding to the current subsample contains only partial data of the corresponding subframe. (In single-track packaging mode, all data refers to all geometry and attribute data; in multi-track packaging mode, all data refers to all geometric data or attribute data of all feature types.)
  • the subframe number field num_subframes indicates the number of subframes corresponding to the data unit in the subsample.
  • the subframe corresponding to the current subsample is indicated by subframe_idx; when the value of this field is greater than 1, the subframes corresponding to the current subsample are the subframe indicated by subframe_idx and the subsequent num_subframes-1 subframes with consecutive frame numbers.
  • the subframe index field subframe_idx indicates the value of the frame number attribute corresponding to the point cloud subframe contained in the current subsample.
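  • The relationship between num_subframes and subframe_idx described above can be sketched as follows. The class `SubframeSubsample` and its methods are hypothetical names introduced for illustration; only the three field names come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubframeSubsample:
    """Subframe-related fields of one subsample (illustrative container)."""

    complete_subframe_flag: int
    num_subframes: int
    subframe_idx: int

    def subframe_indices(self):
        """subframe_idx plus the following num_subframes - 1 frame numbers."""
        return list(range(self.subframe_idx,
                          self.subframe_idx + self.num_subframes))

    def has_overlap(self):
        """Data shared by two or more subframes is overlapping data."""
        return self.num_subframes >= 2

    def is_complete(self):
        """True when the subsample holds all data of its subframe(s)."""
        return self.complete_subframe_flag == 1


single = SubframeSubsample(complete_subframe_flag=1, num_subframes=1, subframe_idx=5)
shared = SubframeSubsample(complete_subframe_flag=0, num_subframes=3, subframe_idx=2)
```

  • Here `single` describes a subsample that carries subframe 5 completely, while `shared` describes data common to the consecutive subframes 2, 3 and 4.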
  • the encapsulation tracks of geometric data and attribute data can be flexibly set, which can be applied to various application scenarios and ensure decoding processing efficiency in different scenarios.
  • when the value of the subsample flag field is the first value, the media file data box of the point cloud subframe includes: a related subframe number field, used to indicate the number of point cloud subframes corresponding to the current subsample; and a subframe index field, used to indicate the index information of the point cloud subframe corresponding to the current subsample.
  • the sample group tool in the extended media file data box can be used to identify the relevant information of the point cloud subframe, including the related subframe number field and the subframe index field, so that the number and index information of the point cloud subframes can be described and one or more point cloud subframes corresponding to each data unit in the point cloud sample can be jointly decoded as a combined frame. On the one hand, this can reduce the waste of computing resources caused by individually decoding point cloud frames with less content. On the other hand, it can identify point cloud subframes with overlapping point cloud data to improve the decoding efficiency of point cloud media.
  • the relevant information of the point cloud subframe can be identified through the sample group tool in the extended media file data box.
  • the media file data box of the point cloud subframe includes:
  • Subsample number field used to indicate the number of subsamples included in the current sample
  • the related subframe number field is used to indicate the number of point cloud subframes corresponding to the current subsample.
  • the subframe index field is used to indicate the index information of the point cloud subframe corresponding to the current subsample.
  • the media file data box of the point cloud subframe includes a subsample number field, a related subframe number field, and a subframe index field, which can describe the number of subsamples, as well as the number and index information of point cloud subframes.
  • Figure 11 shows the syntax structure of the extended sample group tool in an application scenario according to the embodiment of the present application.
  • sample group tool in the media file data box can include the following fields:
  • the subsample number field subsample_count indicates the number of subsamples contained in the current sample.
  • the related subframe number field related_subframe_num indicates the number of point cloud subframes corresponding to the current subsample. When the value of this field is 0, it means that the information contained in the current subsample has nothing to do with the point cloud subframe division (such as tile set information or frame end identifier).
  • the subframe index field subframe_index indicates the point cloud subframe sequence number corresponding to the current subsample.
  • the value of this sequence number is the same as the value in the frame number attribute.
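  • One possible use of these fields is to select the subsamples needed to reconstruct a single point cloud subframe. The sketch below is illustrative: the function name and the list-of-lists representation are hypothetical, and keeping subsamples whose related_subframe_num is 0 is an assumption based on the statement that such subsamples are unrelated to the subframe division.

```python
def subsamples_for_subframe(subframe_indices_per_subsample, target_subframe):
    """Return the positions of the subsamples needed for one subframe.

    `subframe_indices_per_subsample` holds, for each of the subsample_count
    subsamples, the list of its subframe_index values; the length of that
    list plays the role of related_subframe_num.
    """
    selected = []
    for position, subframe_indices in enumerate(subframe_indices_per_subsample):
        related_subframe_num = len(subframe_indices)
        # Assumption: subsamples unrelated to the subframe division
        # (related_subframe_num == 0) are kept for every subframe.
        if related_subframe_num == 0 or target_subframe in subframe_indices:
            selected.append(position)
    return selected


# Four subsamples: one unrelated, one for subframe 0, one shared, one for subframe 1.
sample_entry = [[], [0], [0, 1], [1]]
needed_for_subframe_0 = subsamples_for_subframe(sample_entry, 0)
needed_for_subframe_1 = subsamples_for_subframe(sample_entry, 1)
```

  • The shared subsample at position 2 is selected for both subframes, which reflects the overlapping point cloud data carried once for two subframes.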
  • the information related to the point cloud subframe may not be indicated at the subsample level; instead, it may be indicated only at the point cloud sample level. That is, in the case of overlapping point cloud subframes, only the samples containing point cloud subframes and the corresponding point cloud subframe index numbers are identified at the system layer.
  • Figure 12 shows the syntax structure of identifying point cloud subframe related information based on the point cloud sample level media file data box in an application scenario according to the embodiment of the present application.
  • the media file data box in this embodiment of the present application may include the following fields:
  • the related subframe number field related_subframe_num indicates the number of point cloud subframes corresponding to the current sample.
  • the subframe index field subframe_index indicates the subframe sequence number corresponding to the current sample.
  • the value of this sequence number should be the same as the value in the frame number attribute.
  • the information related to the point cloud subframe can be identified by defining a subsample subframe information data box SubsampleSubframeInfoBox.
  • the media file data box of the point cloud subframe includes:
  • Subframe related sample number field used to indicate the number of point cloud samples containing multiple point cloud subframes
  • the sample serial number difference field is used to indicate the serial number difference between the current point cloud sample containing multiple point cloud subframes and the previous point cloud sample containing multiple point cloud subframes in the decoding order;
  • Subsample number field used to indicate the number of subsamples contained in the current point cloud sample
  • the related subframe number field is used to indicate the number of point cloud subframes corresponding to the current subsample
  • the subframe index field is used to indicate the index information of the point cloud subframe corresponding to the current subsample.
  • the media file data box of the point cloud subframe also includes a subframe-related sample number field and a sample serial number difference field, which describe the number of point cloud samples containing multiple point cloud subframes and the serial number difference between such point cloud samples. This facilitates combined decoding of the point cloud samples in order, which can reduce the waste of computing resources and improve the efficiency of decoding processing.
  • Figure 13 shows the syntax structure of identifying subframe related information through the subsample subframe information data box SubsampleSubframeInfoBox in an application scenario according to the embodiment of the present application.
  • the data box type of the subsample subframe information data box SubsampleSubframeInfoBox may be 'sbfi', for example, and is included in SampleEntry or TrackFragmentBox.
  • the subsample subframe information data box is used to indicate the subframe information corresponding to each subsample divided based on the G-PCC data unit in a point cloud sample containing multiple subframes.
  • the sub-sample flag field in the sub-sample information data box must have a value of 0, that is, the sub-sample division method based on data units is adopted.
  • subsample subframe information data box SubsampleSubframeInfoBox can include the following fields:
  • the subframe related sample number field subframe_related_sample_num indicates the number of point cloud samples containing multiple point cloud subframes.
  • sample serial number difference field sample_delta indicates the difference between the current sample serial number containing multiple subframes and the previous sample serial number containing multiple subframes in the decoding order.
  • the value of this field is the serial number of the point cloud sample.
  • the subsample number field subsample_count indicates the number of subsamples contained in the current sample.
  • the related subframe number field related_subframe_num indicates the number of point cloud subframes corresponding to the current subsample. When the value of this field is 0, it means that the information contained in the current subsample has nothing to do with the point cloud subframe division (such as tile set information or frame end identifier).
  • the subframe index field subframe_index indicates the point cloud subframe sequence number corresponding to the current subsample.
  • the value of this sequence number is the same as the value in the frame number attribute.
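  • The sample_delta values above can be turned into absolute sample serial numbers with a running sum. This is a minimal sketch; the function name `resolve_sample_numbers` is hypothetical, and it assumes that the first difference is counted from zero, so that the first value equals the serial number of the first point cloud sample containing multiple subframes.

```python
from itertools import accumulate


def resolve_sample_numbers(sample_deltas):
    """Convert successive sample_delta values into sample serial numbers.

    Assumption: the first delta is measured from zero, so the first result
    is the serial number of the first sample that contains multiple subframes.
    """
    return list(accumulate(sample_deltas))


# subframe_related_sample_num is 3 in this illustrative case.
sample_numbers = resolve_sample_numbers([3, 2, 5])
```

  • With these illustrative deltas, the third, fifth and tenth samples in decoding order are the ones that contain multiple point cloud subframes.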
  • the presentation time of the point cloud subframe can be indicated through the media file data box.
  • the media file data box of the sub-sample includes:
  • the presentation time flag field is used to indicate whether each point cloud subframe included in the point cloud sample has the same presentation duration
  • the subsample duration field is used to indicate the presentation duration of the current subsample when each point cloud subframe contained in the point cloud sample has a different presentation duration.
  • the presentation time flag field in the media file data box can be used to indicate whether the presentation durations of the point cloud subframes are the same, and the subsample duration field can be used to indicate the presentation duration of the subsamples, so that the presentation duration of each point cloud subframe can be indicated and display can be performed based on that duration, ensuring the media presentation effect.
  • Figure 14 shows the syntax structure of identifying subframe presentation time information through the subsample subframe information data box SubsampleSubframeInfoBox in an application scenario according to the embodiment of the present application.
  • the coding-related parameter field codec_specific_parameters in the SubsampleSubframeInfoBox includes the following fields:
  • a value of 0 indicates that multiple subframes included in the sample have the same presentation time.
  • the presentation duration of each subframe can be calculated based on the presentation time of the sample itself and the number of subframes in the sample.
  • the value of the sample_delta field corresponding to the sample should be an integer multiple of the number of subframes.
  • a value of 1 indicates that multiple subframes included in the sample have different presentation durations.
  • the subsample duration field sub_sample_duration indicates the presentation duration of the subsample.
  • the sum of the values of this field for multiple subsamples in the sample should be equal to the value of the sample_delta field corresponding to the sample.
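  • The two cases above can be sketched as one function. This is illustrative only: the parameter name `presentation_time_flag` is hypothetical because the field name is not given in the text, and raising an error when a constraint is violated is a design choice of this sketch.

```python
def subframe_durations(presentation_time_flag, sample_delta,
                       num_subframes=0, sub_sample_durations=()):
    """Return the presentation duration of every subframe in one sample."""
    if presentation_time_flag == 0:
        # Equal durations: sample_delta should divide evenly among subframes.
        if num_subframes <= 0 or sample_delta % num_subframes != 0:
            raise ValueError(
                "sample_delta should be an integer multiple of the number of subframes")
        return [sample_delta // num_subframes] * num_subframes
    # Different durations: each subsample carries its own sub_sample_duration.
    durations = list(sub_sample_durations)
    if sum(durations) != sample_delta:
        raise ValueError("sub_sample_duration values should sum to sample_delta")
    return durations


equal_durations = subframe_durations(0, 40, num_subframes=4)
explicit_durations = subframe_durations(1, 40, sub_sample_durations=[10, 30])
```

  • In the first call each of the four subframes is presented for a quarter of the sample duration; in the second call the durations are taken from the file and only checked against sample_delta.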
  • the presentation duration of each subframe of two subframe scenarios can be uniformly indicated by extending the media file data box.
  • the media file data box of the point cloud sample includes:
  • the number of subframes field is used to indicate the number of point cloud subframes contained in the current point cloud sample
  • the presentation time flag field is used to indicate whether each point cloud subframe contained in the point cloud sample has the same presentation duration
  • the subframe index field is used to indicate the index information of the point cloud subframe corresponding to the current subsample when each point cloud subframe included in the point cloud sample has different presentation duration
  • the subsample duration field is used to indicate the presentation duration of the current subsample when each point cloud subframe contained in the point cloud sample has a different presentation duration.
  • the number of point cloud subframes can be indicated through the subframe number field, and the index information of the point cloud subframe can be indicated through the subframe index field, so that one or more point cloud subframes corresponding to each data unit in the point cloud sample can be jointly decoded as combined frames. On the one hand, this can reduce the waste of computing resources caused by separately decoding point cloud frames with less content. On the other hand, it can identify point cloud subframes with overlapping point cloud data, improving the decoding efficiency of point cloud media.
  • the presentation time flag field in the media file data box indicates whether the presentation durations of the point cloud subframes are the same, and the subsample duration field indicates the presentation duration of the subsamples, so that the presentation duration of each point cloud subframe can be indicated and display can be performed based on that duration, ensuring the media presentation effect.
  • Figure 15 shows the syntax structure of the extended media file data box identifying subframe presentation time information in an application scenario according to the embodiment of the present application.
  • the media file data box in the embodiment of this application includes the following fields:
  • the subframe number field nb_subframes indicates the number of subframes corresponding to the current sample.
  • a value of 0 indicates that multiple corresponding subframes in the sample have the same presentation time.
  • the presentation duration of each subframe can be calculated based on the presentation time of the sample itself and the number of subframes in the sample.
  • the value of the sample_delta field corresponding to the sample should be an integer multiple of the number of subframes.
  • a value of 1 indicates that multiple subframes included in the sample have different presentation durations.
  • the subframe index field subframe_index indicates the sequence number of the point cloud subframe.
  • the subsample duration field sub_sample_duration indicates the presentation duration of the corresponding subframe.
  • the sum of the values of this field for all corresponding subframes in the sample should be equal to the value of the sample_delta field corresponding to the sample.
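  • When durations are signalled per subframe, a player can derive when each subframe starts. The sketch below is illustrative: the function name is hypothetical, and it assumes that the subframes of a sample are presented back to back in the order in which they are listed, starting at the presentation time of the sample.

```python
def subframe_timeline(sample_presentation_time, entries):
    """Map each subframe_index to its (start_time, duration).

    `entries` is a list of (subframe_index, sub_sample_duration) pairs.
    Assumption: subframes are presented back to back in the listed order.
    """
    timeline = {}
    start = sample_presentation_time
    for subframe_index, duration in entries:
        timeline[subframe_index] = (start, duration)
        start += duration
    return timeline


# A sample presented at time 1000 with two subframes of different duration.
timeline = subframe_timeline(1000, [(0, 10), (1, 30)])
```

  • The last start time plus the last duration equals the sample presentation time plus sample_delta, which matches the constraint that the durations sum to sample_delta.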
  • the spatial information of the point cloud subframe can be obtained implicitly.
  • the corresponding relationship between the point cloud sub-frame and the spatial block tile can be found based on the information carried in each sub-sample division method.
  • Figure 16 shows a structural block diagram of determining the spatial block correspondence based on three sub-sample division methods of data unit, spatial block and point cloud subframe in one embodiment of the present application.
  • a first spatial block tile0 and a second spatial block tile1 corresponding to the first subframe, and a third spatial block tile2 corresponding to the second subframe may be determined.
  • multiple data units corresponding to the first spatial block tile0 can be determined, such as the geometric slices Geo slice0 and Geo slice1, the color attribute slices Attr color slice0 and Attr color slice1, and the frame index attribute slices Attr frameIdx slice0 and Attr frameIdx slice1, as shown in the figure.
  • multiple data units corresponding to the second spatial block tile1 can be determined, such as the geometric slice Geo slice2, the geometric slice Geo slice3, the color attribute slice Attr color slice2, the color attribute slice Attr color slice3, and the frame index attribute slices, as shown in the figure.
  • multiple data units corresponding to the third spatial block tile2 can also be determined, such as the geometric slice Geo slice4, the geometric slice Geo slice5, the color attribute slice Attr color slice4, the color attribute slice Attr color slice5, and the frame index attribute slices, as shown in the figure.
  • Figure 17 shows a structural block diagram for determining the spatial block correspondence based on two sub-sample division methods of data unit and spatial block in an embodiment of the present application.
  • the first spatial tile tile0 and the second spatial tile tile1 can be determined.
  • multiple data units corresponding to the first spatial block tile0 can be determined, such as the geometric slice Geo slice0, the color attribute slice Attr color slice1 and the corresponding sub-frame sub-frame0 as shown in the figure.
  • multiple data units corresponding to the second spatial block tile1 can be determined, such as the geometric slice Geo slice2, the color attribute slice Attr color slice2, and the frame index corresponding to the sub-frames sub-frame0 and sub-frame1 as shown in the figure.
  • the correspondence between the point cloud subframes and the spatial blocks can also be explicitly indicated in the media file data box.
  • the media file data box of the point cloud sample includes:
  • the spatial block flag field is used to indicate whether the point cloud subframe in the current sample corresponds to one or more different spatial blocks
  • Subframe index field used to indicate the index information of the current point cloud subframe
  • the spatial block number field is used to indicate the number of spatial blocks corresponding to the current point cloud subframe
  • Space block identification field used to indicate the identifier of the current space block.
  • whether the point cloud subframes in the current sample correspond to one or more different spatial blocks can be indicated through the spatial block flag field, the index information of the point cloud subframe can be indicated through the subframe index field, the number of spatial blocks corresponding to the point cloud subframe can be indicated through the spatial block number field, and the identifier of the current spatial block can be indicated through the spatial block identification field, so that, based on the information in each field in the media file data box of the point cloud sample, one or more point cloud subframes corresponding to each data unit in the point cloud sample can be jointly decoded as combined frames. This can reduce the waste of computing resources caused by individually decoding point cloud frames with less content.
  • Figure 18 shows the syntax structure of the embodiment of the present application for indicating the correspondence between point cloud subframes and spatial blocks through media file data boxes in an application scenario.
  • the media file data box in this embodiment of the present application may include the following fields:
  • Spatial tile flag field with_tile_info_flag: when the value is 1, it means that the subframes in the current sample correspond to one or more different point cloud spatial tiles; when the value is 0, it means that the subframes in the current sample cannot be divided according to point cloud spatial tiles.
  • the subframe index field subframe_index indicates the sequence number of the point cloud subframe.
  • the spatial tile number field num_tiles indicates the number of point cloud spatial tiles corresponding to the corresponding point cloud subframe.
  • the spatial tile identification field tile_id indicates the identifier of the corresponding point cloud spatial tile.
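As an illustration only, the four fields listed above can be modelled as a small in-memory structure. This is a minimal sketch and not the syntax of Figure 18: the class names `SubframeTileEntry` and `SubframeTileInfo` and the helper `tiles_for_subframe` are hypothetical, and no bit widths or box layout are assumed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubframeTileEntry:
    """One point cloud subframe and its spatial tiles (hypothetical container)."""
    subframe_index: int                                 # sequence number of the subframe
    tile_ids: List[int] = field(default_factory=list)   # one tile_id per corresponding tile

    @property
    def num_tiles(self) -> int:
        # Derived from tile_ids so the count can never disagree with the list.
        return len(self.tile_ids)


@dataclass
class SubframeTileInfo:
    """Toy stand-in for the box content: a flag plus per-subframe entries."""
    with_tile_info_flag: int
    entries: List[SubframeTileEntry] = field(default_factory=list)

    def tiles_for_subframe(self, subframe_index: int) -> List[int]:
        # With the flag at 0 the subframes are not tied to spatial tiles.
        if self.with_tile_info_flag != 1:
            return []
        for entry in self.entries:
            if entry.subframe_index == subframe_index:
                return list(entry.tile_ids)
        return []
```

A reader could use such a lookup to fetch only the subframes that cover a region of interest; whether a real player does so is outside what the text states.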
  • Figure 19 shows a flow chart of the steps of the point cloud media encoding method in one embodiment of the present application. This method can be applied to electronic devices in the server, client, intermediate node and other links of the point cloud media system. The embodiment of the present application takes a client device installed with a point cloud encoding device executing the point cloud media encoding method as an example.
  • the point cloud media encoding method includes the following steps S1910 to S1930.
  • in step S1910, point cloud source data is obtained, and the point cloud source data includes a point cloud frame having one or more point cloud subframes.
  • in step S1920, the point cloud frame is encoded to obtain at least one data unit.
  • in step S1930, the at least one data unit is encapsulated to obtain a point cloud media file.
  • the point cloud media file includes point cloud samples encapsulated in one or more tracks; the media file data box of each subsample in the point cloud sample includes a subframe index field; the subframe index field is used to indicate the index information of one or more point cloud subframes corresponding to each data unit in the subsample; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
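The relationship between data units and subframe indexes described above can be sketched as follows. This is a toy model under stated assumptions: `DataUnit`, `encapsulate` and `overlapping_subframes` are hypothetical names, one subsample entry is written per data unit for simplicity, and the actual G-PCC encoding of step S1920 is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class DataUnit:
    """Stand-in for an encoded data unit (hypothetical)."""
    payload: bytes
    subframe_indices: List[int]   # subframes whose points this unit carries


def overlapping_subframes(units: List[DataUnit]) -> Set[int]:
    """Subframes that share at least one data unit with another subframe."""
    shared: Set[int] = set()
    for unit in units:
        if len(set(unit.subframe_indices)) >= 2:
            shared.update(unit.subframe_indices)
    return shared


def encapsulate(units: List[DataUnit]) -> Dict[str, list]:
    """Describe one sample: a subsample entry per data unit (an assumption)."""
    return {
        "subsamples": [
            {"size": len(u.payload),
             "subframe_index": sorted(set(u.subframe_indices))}
            for u in units
        ]
    }
```

In this sketch a unit listing two indexes marks those two subframes as having overlapping point cloud data, which mirrors the condition stated in the text.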
  • Point cloud source data includes point cloud videos (images and/or videos) representing objects and/or environments located in various 3D spaces (eg, 3D spaces representing real environments, 3D spaces representing virtual environments, etc.).
  • the data source may use one or more cameras (for example, an infrared camera capable of acquiring depth information, an RGB camera capable of extracting color information corresponding to the depth information, etc.), a projector (for example, an infrared pattern projector used to acquire depth information), LiDAR and other acquisition devices to capture point cloud source data.
  • the shape of the geometric structure composed of points in the 3D space can be extracted from the depth information of the point cloud source data, and the attributes of each point can be extracted from the color information of the point cloud source data, so as to obtain the point cloud source data.
  • a point cloud video can include one or more point cloud frames, and one point cloud frame can represent one frame of point cloud image.
  • point cloud video data may be captured based on at least one of inward-facing technology and outward-facing technology.
  • Inward-facing technology refers to a technology that captures images of a central object with one or more cameras (or camera sensors) arranged around the central object. Inward-facing techniques can be used to generate point cloud content that provides the user with 360-degree images of key objects (e.g., VR/AR content that provides the user with 360-degree images of key objects such as characters, players, objects, or actors).
  • Outward-facing technology refers to a technology that uses one or more cameras (or camera sensors) arranged around the central object to capture the environment of the central object rather than the image of the central object.
  • Point cloud content that provides the surrounding environment as it appears from the user's perspective may be generated using outward-facing techniques (eg, content representing the external environment that may be provided to a user of a self-driving vehicle).
  • the data source can calibrate one or more cameras to set the global coordinate system prior to the capture operation.
  • the data source may generate point cloud content by compositing arbitrary images and/or videos with images and/or videos captured via the capture techniques described above.
  • the data source may perform post-processing on the captured images and/or videos, for example, removing unwanted areas (such as the background), identifying the space to which the captured images and/or videos are connected, and filling spatial holes when they are present.
  • the data source can generate a piece of point cloud content by performing coordinate transformations on the points of the point cloud video secured from each camera.
  • the data source can perform coordinate transformations on points based on the coordinates of each camera location. Therefore, the data source can generate a point cloud content that represents a broad spatial extent, or it can generate point cloud content with a high density of points.
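As a worked example of the coordinate transformation mentioned above, the sketch below maps points from each camera's local coordinates into a shared global coordinate system and merges them. It assumes a rigid transform (a rotation plus a translation per camera), which the text does not specify; the function names `rotation_z`, `camera_to_global` and `merge_cameras` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Matrix = Sequence[Sequence[float]]


def rotation_z(angle_rad: float) -> List[List[float]]:
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def camera_to_global(points: Sequence[Point], rotation: Matrix,
                     translation: Point) -> List[Point]:
    """Apply p_global = R * p_camera + t to every point."""
    out: List[Point] = []
    for x, y, z in points:
        out.append(tuple(
            rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z
            + translation[r]
            for r in range(3)
        ))
    return out


def merge_cameras(captures) -> List[Point]:
    """Merge (points, rotation, translation) captures into one point list."""
    merged: List[Point] = []
    for points, rotation, translation in captures:
        merged.extend(camera_to_global(points, rotation, translation))
    return merged
```

Merging captures from cameras with different poses is one way to obtain either a wider spatial extent or a denser set of points, depending on how much the camera views overlap.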
  • the correspondence between each data unit in the sub-sample and the index information of one or more point cloud sub-frames is indicated through the media file data box of each sub-sample in the point cloud sample, so that the one or more point cloud subframes corresponding to each data unit in the point cloud sample can be jointly encoded as a combined frame. On the one hand, this can reduce the waste of computing resources caused by separately encoding point cloud frames with less content; on the other hand, it can identify point cloud subframes with overlapping point cloud data, improving the coding efficiency of point cloud media.
  • Figure 20 shows a flow chart of encoding and decoding point cloud data in a streaming media transmission application scenario according to an embodiment of the present application.
  • the server as the data source for producing point cloud media files, can encode and send the point cloud data to the user's client. After decoding the point cloud media files through the client, the point cloud data can be obtained for use. User consumption.
  • the specific point cloud data encoding and decoding process may include the following steps.
  • Step S2010: The server determines one or more subframes corresponding to each geometric slice according to the subframe index number corresponding to each geometric slice in the point cloud code stream.
  • Step S2020: The server encapsulates the point cloud code stream into a point cloud file, in which the point cloud subframes are divided and indicated in the form of subsamples.
  • the server's encapsulation of point cloud code streams can be single-track encapsulation or component-based multi-track encapsulation.
  • Step S2030: For the samples in the file that contain point cloud subframes, indicate the presentation time information of the point cloud subframes contained in these samples.
  • Step S2040: For the spatial information of the point cloud subframe, indicate the correspondence between the point cloud subframe and the tile.
  • Step S2050: The server transmits the point cloud file to the client.
  • Step S2060: When the client decapsulates and decodes the point cloud file, it extracts each point cloud subframe based on the information related to the point cloud subframe.
  • Step S2070: After the client reorders the point cloud sequence, it presents the point cloud subframes according to their presentation time information.
  • the embodiment of this application proposes a file encapsulation method for point cloud subframes for GPCC point cloud media.
  • this file encapsulation method defines the way point cloud subframes are encapsulated in samples under different scenarios, indicates the identity and duration of point cloud subframes, and indicates the correspondence between point cloud subframes and point cloud spatial blocks.
  • This application can more flexibly support the encapsulation of point cloud subframes in files, thereby supporting more application scenarios and maximizing the coding efficiency improvement brought by point cloud subframes.
  • the server determines the subframe corresponding to each geometric slice based on the subframe index number corresponding to each geometric slice in the point cloud code stream.
  • Server S1 encapsulates the point cloud code stream into point cloud file F1 in a single-track manner, and the file encapsulation result is shown in Figure 21.
  • the point cloud subframes are divided and indicated in the form of subsamples.
  • the flags field in the SubSampleInformationBox data box has a value of 2, indicating that the sub-frame indexes are 1 and 2 in sub-sample0 and sub-sample1 respectively.
  • Each point cloud sample corresponds to its own point cloud frame, including point cloud sample Sample 0, point cloud sample Sample1 and point cloud sample SampleN.
  • point cloud sample Sample2 corresponds to point cloud sub-frame sub-frame1 and point cloud sub-frame sub-frame2.
  • the server S2 encapsulates the point cloud code stream into a point cloud file F2 in a component-based multi-track manner.
  • the geometric track is divided into sub-samples.
  • the data information belonging to the attribute track can be found through the index relationship between the geometry track and the attribute track.
  • the encapsulation result is shown in Figure 22.
  • each point cloud sample corresponds to its own point cloud frame.
  • point cloud sample Sample 0, point cloud sample Sample 1 and point cloud sample Sample N all correspond to the corresponding point cloud frame;
  • point cloud sample Sample2 corresponds to the point cloud sub-frame sub-frame1 and the point cloud sub-frame sub-frame2; while in the attribute track, the point cloud sample Sample2 also corresponds to the point cloud frame frame.
  • each point cloud sample corresponds to its own point cloud frame.
  • point cloud sample Sample 0, point cloud sample Sample 1 and point cloud sample Sample N all correspond to the corresponding point cloud frame;
  • the respective point cloud sample Sample2 corresponds to the point cloud sub-frame sub-frame1 and the point cloud sub-frame sub-frame2.
  • the point cloud subframes are divided and indicated in the form of subsamples.
  • the flags field in the SubSampleInformationBox data box has a value of 2, indicating that the sub-frame indexes are 1 and 2 in sub-sample0 and sub-sample1 respectively.
  • sample2 is a sample with sub-frames, and the presentation duration of each sub-frame can be known: the durations are respectively 10 units of timescale (defined by).
  • sample2 is a sample with a sub-frame, and then by providing fields in the embodiment of this application, the tile id information corresponding to each sub-frame can be known.
  • the spatial information corresponding to each subframe can be known.
  • the server transmits the point cloud file to the client.
  • when the client decapsulates and decodes the point cloud file, it extracts each point cloud subframe based on the information related to the point cloud subframe, reorders the point cloud sequence, and presents the point cloud subframes according to their presentation time information and spatial information.
  • the point cloud sequence can be reordered during the decapsulation stage and then decoded. It can also be decapsulated and decoded first, and then reordered according to the subframe information.
  • the server determines the subframe corresponding to each geometric slice based on the subframe index number corresponding to each geometric slice in the point cloud code stream.
  • Each point cloud sample corresponds to its own point cloud frame, including point cloud sample Sample 0, point cloud sample Sample 1 and point cloud sample Sample N. Among them, point cloud sample Sample2 corresponds to G-PCC units.
  • the subframe information to which each G-PCC data unit in each sub-sample in sample2 belongs can be indicated.
  • the sub-sample can be associated with the corresponding subframe information.
  • in the multi-track mode, encapsulation with sub-sample division is handled in the same way as in the previous embodiment.
  • Sub-samples can be divided only in the geometry track, or sub-samples can be divided in both the geometry and attribute tracks.
  • sample2 is a sample with a sub-frame, and then by providing fields in the embodiment of this application, it can be known that the presentation duration of each sub-frame is the same. Assuming that the duration of sample2 is 20 units of timescale (defined by), then each sub-frame is 10 units of timescale.
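The arithmetic in the example above (a sample of 20 timescale units split into sub-frames of equal duration) can be sketched as follows, assuming sample2 carries two sub-frames. The function names are hypothetical, and the handling of a duration that does not divide evenly is an assumption added for completeness; the text only covers the even case.

```python
from typing import List


def subframe_durations(sample_duration: int, num_subframes: int) -> List[int]:
    """Split a sample duration (in timescale units) equally among subframes."""
    if num_subframes <= 0:
        raise ValueError("num_subframes must be positive")
    base, remainder = divmod(sample_duration, num_subframes)
    # Assumption: any leftover units are given to the earliest subframes.
    return [base + (1 if i < remainder else 0) for i in range(num_subframes)]


def subframe_presentation_times(sample_start: int,
                                durations: List[int]) -> List[int]:
    """Start time of each subframe, in timescale units."""
    times: List[int] = []
    current = sample_start
    for duration in durations:
        times.append(current)
        current += duration
    return times
```

For instance, `subframe_durations(20, 2)` yields two durations of 10 units each, matching the example in the text.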
  • sample2 is a sample with a sub-frame, and then by providing fields in the embodiment of this application, the tile id information corresponding to each sub-frame can be known.
  • the spatial information corresponding to each subframe can be known.
  • the server transmits the point cloud file to the client.
  • when the client decapsulates and decodes the point cloud file, it extracts each point cloud subframe based on the information related to the point cloud subframe, reorders the point cloud sequence, and presents the point cloud subframes according to their presentation time information and spatial information.
  • the point cloud sequence can be reordered during the decapsulation stage and then decoded. It can also be decapsulated and decoded first, and then reordered according to the subframe information.
  • Figure 25 schematically shows a structural block diagram of a point cloud media decoding device provided by an embodiment of the present application.
  • the point cloud media decoding device 2500 includes:
  • the acquisition module 2510 is configured to acquire point cloud media files, where the point cloud media files include point cloud samples encapsulated in one or more tracks;
  • the parsing module 2520 is configured to parse the media file data box of each sub-sample in the point cloud sample to obtain the value of the sub-sample flag field;
  • the index module 2530 is configured to obtain the index information of one or more point cloud subframes corresponding to each data unit in the subsample according to the value of the subsample flag field; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
  • the decoding module 2540 is configured to decapsulate and decode the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
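The parse-then-index flow of the modules above can be sketched as follows. This is an illustrative toy, not the device's implementation: subsamples are represented as plain dictionaries, the function names are hypothetical, and the flags value 2 is taken from the single-track example in the text and treated here as an assumption.

```python
from collections import defaultdict
from typing import Dict, List

# Flags value used for subframe-based subsamples in the text's example
# (treated as an assumption in this sketch).
SUBFRAME_FLAGS_VALUE = 2


def index_subframes(subsamples: List[dict]) -> Dict[int, List[int]]:
    """Map each subframe index to the positions of subsamples carrying it."""
    mapping: Dict[int, List[int]] = defaultdict(list)
    for position, sub in enumerate(subsamples):
        if sub["flags"] != SUBFRAME_FLAGS_VALUE:
            continue  # not a subframe-based subsample; ignored here
        for idx in sub["subframe_index"]:
            mapping[idx].append(position)
    return dict(mapping)


def shared_subsamples(subsamples: List[dict]) -> List[int]:
    """Positions of subsamples whose data belongs to two or more subframes."""
    return [
        position
        for position, sub in enumerate(subsamples)
        if sub["flags"] == SUBFRAME_FLAGS_VALUE
        and len(set(sub["subframe_index"])) >= 2
    ]
```

A decoder built on such a mapping could decode a shared subsample once and reuse the result for every subframe that references it, which is one way to read the efficiency claim in the text.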
  • Figure 26 schematically shows a structural block diagram of a point cloud media encoding device provided by an embodiment of the present application.
  • the point cloud media encoding device 2600 includes:
  • the acquisition module 2610 is configured to acquire point cloud source data, where the point cloud source data includes a point cloud frame having one or more point cloud subframes;
  • the encoding module 2620 is configured to encode the point cloud frame to obtain at least one data unit; and
  • the encapsulating module 2630 is configured to encapsulate the at least one data unit to obtain a point cloud media file.
  • the point cloud media file includes a point cloud sample encapsulated in one or more tracks; the media file data box of each subsample in the point cloud sample includes a subframe index field; the subframe index field is used to indicate the index information of one or more point cloud subframes corresponding to each data unit in the subsample; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
  • Figure 27 schematically shows a block diagram of a computer system used to implement an electronic device according to an embodiment of the present application.
  • the computer system 2700 includes a central processing unit 2701 (Central Processing Unit, CPU), which can perform various appropriate actions and processes according to computer-readable instructions stored in a read-only memory 2702 (Read-Only Memory, ROM) or computer-readable instructions loaded from a storage portion 2708 into a random access memory 2703 (Random Access Memory, RAM).
  • in the random access memory 2703, various computer-readable instructions and data required for system operation are also stored.
  • the central processing unit 2701, the read-only memory 2702 and the random access memory 2703 are connected to each other through a bus 2704.
  • an input/output interface 2705 (Input/Output interface, i.e., I/O interface) is also connected to the bus 2704.
  • the following components are connected to the input/output interface 2705: an input part 2706 including a keyboard, a mouse, etc.; an output part 2707 including a cathode ray tube (Cathode Ray Tube, CRT), a liquid crystal display (Liquid Crystal Display, LCD), a speaker, etc.; a storage section 2708 including a hard disk, etc.; and a communication section 2709 including a network interface card such as a LAN card, a modem, etc. The communication section 2709 performs communication processing via a network such as the Internet.
  • a drive 2710 is also connected to the input/output interface 2705 as needed.
  • Removable media 2711 such as magnetic disks, optical disks, magneto-optical disks, semiconductor memories, etc., are installed on the drive 2710 as needed so that computer readable instructions read therefrom are installed into the storage portion 2708 as needed.
  • each method flowchart may be implemented as computer-readable instructions.
  • embodiments of the present application include a computer program product including computer-readable instructions carried on a computer-readable medium, the computer-readable instructions including computer-readable instruction code for performing the method illustrated in the flowchart.
  • the computer readable instructions may be downloaded and installed from the network via communications portion 2709 and/or installed from removable media 2711.
  • when the computer-readable instructions are executed by the central processing unit 2701, various functions defined in the system of the present application are performed.
  • the computer-readable medium shown in the embodiments of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard drive, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), flash memory, optical fiber, portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores computer-readable instructions that may be used by or in connection with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit computer-readable instructions for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wired, etc., or any suitable combination of the above.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block in the block diagram or flowchart illustration, and combinations of blocks in the block diagram or flowchart illustration, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by a combination of special-purpose hardware and computer instructions.
  • the example embodiments described here can be implemented by software, or can be implemented by software combined with necessary hardware. Therefore, the technical solution according to the embodiment of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, USB flash drive, mobile hard disk, etc.) or on the network, and which includes several instructions to cause a computing device (which can be a personal computer, server, touch terminal, or network device, etc.) to execute the method according to the embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A point cloud media decoding method, which is executed by an electronic device. The method comprises: acquiring a point cloud media file, wherein the point cloud media file comprises point cloud samples, which are encapsulated in one or more tracks (S610); parsing a media file data box of each sub-sample in the point cloud samples, so as to obtain a value of a sub-sample flag bit field (S620); according to the value of the sub-sample flag bit field, acquiring index information of one or more point cloud subframes corresponding to data units in the sub-sample, wherein when one data unit in the sub-sample corresponds to the index information of at least two point cloud sub-frames, the at least two point cloud subframes have overlapped point cloud data (S630); and decapsulating and decoding the point cloud media file according to the index information of the one or more point cloud subframes, so as to obtain point cloud data (S640).

Description

Point cloud media encoding and decoding methods, apparatuses, electronic device, and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on April 22, 2022, with application number 2022104281526 and entitled "Encoding and decoding methods of point cloud media and related products", the entire content of which is incorporated by reference in this application.
Technical field
This application belongs to the field of audio and video technology, and specifically relates to a point cloud media encoding method, a point cloud media decoding method, a point cloud media encoding device, a point cloud media decoding device, a computer-readable medium, an electronic device, and a computer program product.
Background
A point cloud is a set of irregularly distributed discrete points in space that expresses the spatial structure and surface attributes of a three-dimensional object or scene. After large-scale point cloud data is obtained through point cloud acquisition equipment, the point cloud data can be encoded and encapsulated for transmission to users, where it is decoded and presented. Point cloud media contains some point cloud frames with little content, and some point cloud frames have overlapping point cloud content. Encoding and decoding each point cloud frame separately therefore wastes computing resources and reduces the processing efficiency of point cloud media.
Summary of the invention
According to various embodiments of the present application, a point cloud media encoding method, a point cloud media decoding method, a point cloud media encoding device, a point cloud media decoding device, a computer-readable medium, an electronic device, and a computer program product are provided.
Other features and advantages of the present application will become apparent from the detailed description below, or may be learned in part through practice of the present application.
According to one aspect of the embodiments of the present application, a method for decoding point cloud media is provided, which is executed by an electronic device and includes:
acquiring a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks;
parsing the media file data box of each sub-sample in the point cloud samples to obtain the value of the sub-sample flag field;
acquiring, according to the value of the sub-sample flag field, the index information of one or more point cloud subframes corresponding to each data unit in the sub-sample, wherein when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
decapsulating and decoding the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
According to one aspect of the embodiments of the present application, a method for encoding point cloud media is provided, which is executed by an electronic device and includes:
acquiring point cloud source data, the point cloud source data including a point cloud frame having one or more point cloud subframes;
encoding the point cloud frame to obtain at least one data unit; and
encapsulating the at least one data unit to obtain a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks; the media file data box of each sub-sample in the point cloud samples includes a subframe index field; the subframe index field is used to indicate the index information of one or more point cloud subframes corresponding to each data unit in the sub-sample; when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
According to one aspect of the embodiments of the present application, a point cloud media decoding device is provided, including:
an acquisition module configured to acquire a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks;
a parsing module configured to parse the media file data box of each sub-sample in the point cloud samples to obtain the value of the sub-sample flag field;
an index module configured to acquire, according to the value of the sub-sample flag field, the index information of one or more point cloud subframes corresponding to each data unit in the sub-sample, wherein when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
a decoding module configured to decapsulate and decode the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
According to one aspect of the embodiments of the present application, a point cloud media encoding device is provided, including:
an acquisition module configured to acquire point cloud source data, the point cloud source data including a point cloud frame having one or more point cloud subframes;
an encoding module configured to encode the point cloud frame to obtain at least one data unit; and
an encapsulation module configured to encapsulate the at least one data unit to obtain a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks; the media file data box of each sub-sample in the point cloud samples includes a subframe index field; the subframe index field is used to indicate the index information of one or more point cloud subframes corresponding to each data unit in the sub-sample; when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
According to one aspect of the embodiments of the present application, a computer-readable medium is provided, storing computer-readable instructions that, when executed by a processor, implement the point cloud media encoding and decoding methods in the above technical solutions.
According to one aspect of the embodiments of the present application, an electronic device is provided, the electronic device including: a processor; and a memory for storing computer-readable instructions of the processor; wherein the processor is configured to perform the point cloud media encoding and decoding methods in the above technical solutions by executing the computer-readable instructions.
According to one aspect of the embodiments of the present application, a computer program product is provided, the computer program product including computer-readable instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer-readable instructions from the computer-readable storage medium and executes them, so that the computer device performs the point cloud media encoding and decoding methods in the above technical solutions.
The details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the present application will become apparent from the description, the drawings, and the claims. It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present application.
Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the present application. Obviously, the drawings in the following description show only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Figure 1 schematically shows a block diagram of an exemplary system architecture to which the technical solution of the present application is applied.
Figure 2 shows a schematic diagram of the point cloud media encoding and decoding process in an application scenario according to an embodiment of the present application.
Figure 3 shows the syntax structure for encapsulating point cloud samples based on the TLV bytestream format in one embodiment of the present application.
Figure 4 shows the syntax structure of a data unit encapsulated based on the TLV bytestream format in one embodiment of the present application.
Figure 5 shows a schematic diagram of the principle of multi-frame combination of point cloud data in one embodiment of the present application.
Figure 6 shows a flow chart of the steps of the point cloud media decoding method in one embodiment of the present application.
Figure 7 shows an exemplary structure for encapsulating point cloud samples in a single track in one embodiment of the present application.
Figure 8 shows an exemplary structure for encapsulating the geometry bitstream and the attribute bitstream in multiple tracks in one embodiment of the present application.
Figure 9 shows the syntax structure of the codec-specific parameter field codec_specific_parameters of the SubSampleInformationBox data box in an application scenario according to an embodiment of the present application.
Figure 10 shows the syntax structure of the codec-specific parameter field codec_specific_parameters of the SubSampleInformationBox data box in an application scenario where overlapping and non-overlapping subframes are identified in a unified manner, according to an embodiment of the present application.
Figure 11 shows the syntax structure of the extended sample group tool in an application scenario according to an embodiment of the present application.
Figure 12 shows the syntax structure for identifying point cloud subframe related information through a media file data box at the point cloud sample level in an application scenario according to an embodiment of the present application.
Figure 13 shows the syntax structure for identifying subframe related information through the subsample subframe information data box SubsampleSubframeInfoBox in an application scenario according to an embodiment of the present application.
Figure 14 shows the syntax structure for identifying subframe presentation time information through the subsample subframe information data box SubsampleSubframeInfoBox in an application scenario according to an embodiment of the present application.
Figure 15 shows the syntax structure for identifying subframe presentation time information through an extended media file data box in an application scenario according to an embodiment of the present application.
Figure 16 shows a structural block diagram for determining the spatial tile correspondence based on three subsample division modes (data unit, spatial tile, and point cloud subframe) in one embodiment of the present application.
Figure 17 shows a structural block diagram for determining the spatial tile correspondence based on two subsample division modes (data unit and spatial tile) in one embodiment of the present application.
Figure 18 shows the syntax structure for indicating the correspondence between point cloud subframes and spatial tiles through a media file data box in an application scenario according to an embodiment of the present application.
Figure 19 shows a flow chart of the steps of the point cloud media encoding method in one embodiment of the present application.
Figure 20 shows a flow chart of encoding and decoding point cloud data in a streaming media transmission application scenario according to an embodiment of the present application.
Figure 21 shows the result of single-track encapsulation of point cloud samples in an application scenario where point cloud subframes do not overlap, according to an embodiment of the present application.
Figure 22 shows the encapsulation result when the attribute track is not divided into subsamples during multi-track encapsulation of point cloud samples, in an application scenario where point cloud subframes do not overlap, according to an embodiment of the present application.
Figure 23 shows the encapsulation result when the attribute track is divided into subsamples during multi-track encapsulation of point cloud samples, in an application scenario where point cloud subframes do not overlap, according to an embodiment of the present application.
Figure 24 shows the result of single-track encapsulation of point cloud samples in an application scenario where point cloud subframes overlap, according to an embodiment of the present application.
Figure 25 schematically shows a structural block diagram of the point cloud media decoding apparatus provided by an embodiment of the present application.
Figure 26 schematically shows a structural block diagram of the point cloud media encoding apparatus provided by an embodiment of the present application.
Figure 27 schematically shows a structural block diagram of a computer system suitable for implementing the electronic device of an embodiment of the present application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be implemented in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present application will be thorough and complete, and will fully convey the concepts of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present application. However, those skilled in the art will appreciate that the technical solutions of the present application may be practiced without one or more of the specific details, or with other methods, components, apparatuses, steps, and so on. In other instances, well-known methods, apparatuses, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the present application.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are only illustrative; they need not include all contents and operations/steps, nor must the steps be performed in the order described. For example, some operations/steps may be decomposed, and some may be merged or partially merged, so the actual order of execution may change according to the actual situation.
"A plurality of" as used herein means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A exists alone, both A and B exist, or B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
The specific implementations of the present application involve user-related data such as the transmitted content, decoded content, and consumed content of point cloud media. When the embodiments of the present application are applied to specific products or technologies, user permission or consent must be obtained, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
Relevant terms and abbreviations used in the embodiments of the present application are explained as follows.
Immersive media: media content that can provide consumers with an immersive experience. According to the degree of freedom the user has when consuming the media content, immersive media can be divided into 3DoF media, 3DoF+ media, and 6DoF media. Point cloud media is a typical kind of 6DoF media.
DoF: Degree of Freedom. In the present application, it refers to the degrees of freedom of movement and content interaction that are supported while a user views immersive media.
3DoF: three degrees of freedom, namely the three degrees of freedom with which the user's head rotates around the x, y, and z axes.
3DoF+: in addition to the three degrees of freedom, the user also has the freedom of limited movement along the x, y, and z axes.
6DoF: in addition to the three degrees of freedom, the user also has the freedom of unrestricted movement along the x, y, and z axes.
Point cloud: a set of irregularly distributed discrete points in space that expresses the spatial structure and surface attributes of a three-dimensional object or scene. Each point in a point cloud has at least three-dimensional position information and, depending on the application scenario, may also have color, material, or other information. Typically, every point in a point cloud has the same number of additional attributes.
PCC: Point Cloud Compression.
G-PCC: Geometry-based Point Cloud Compression, that is, point cloud compression based on a geometric model.
Sample: the encapsulation unit in the media file encapsulation process. A media file consists of many samples. Taking video media as an example, one sample of video media is usually one video frame.
Slice: a point cloud slice, that is, a set of syntax elements (such as a geometry slice and an attribute slice) representing part or all of the data of an encoded point cloud frame.
Tile: a spatial tile of a point cloud.
DASH: Dynamic Adaptive Streaming over HTTP, an adaptive bitrate streaming technology that enables high-quality streaming media to be delivered over the Internet through conventional HTTP web servers.
In terms of encoding method, point cloud media can be divided into point cloud media compressed based on conventional video coding (Video-based Point Cloud Compression, VPCC) and point cloud media compressed based on geometric features (Geometry-based Point Cloud Compression, GPCC). In the file encapsulation of point cloud media, the three-dimensional position information is usually called the geometry component of the point cloud media file, and the attribute information is called the attribute component of the point cloud media file. A point cloud media file has only one geometry component, but may have one or more attribute components.
Point clouds can flexibly and conveniently express the spatial structure and surface attributes of three-dimensional objects or scenes, and are therefore widely used. Their main application scenarios fall into two categories: 1) machine-perceived point clouds, for example computer aided design (CAD), autonomous navigation systems (ANS), real-time inspection systems, geographic information systems (GIS), visual sorting robots, and rescue and disaster relief robots; and 2) human-perceived point clouds, for example virtual reality (VR) games, digital cultural heritage, free-viewpoint broadcasting, three-dimensional immersive communication, and three-dimensional immersive interaction.
Point clouds are mainly acquired in the following ways: computer generation, 3D laser scanning, 3D photogrammetry, and so on. Computers can generate point clouds of virtual three-dimensional objects and scenes. 3D scanning can obtain point clouds of static real-world three-dimensional objects or scenes, at a rate on the order of millions of points per second. 3D photography can obtain point clouds of dynamic real-world three-dimensional objects or scenes, at a rate on the order of tens of millions of points per second. In addition, in the medical field, point clouds of biological tissues and organs can be obtained from MRI, CT, and electromagnetic positioning information. These technologies reduce the cost and time of point cloud data acquisition and improve the accuracy of the data. Changes in the way point cloud data is acquired have made it possible to acquire large amounts of point cloud data. As large-scale point cloud data continues to accumulate, the efficient storage, transmission, publication, sharing, and standardization of point cloud data have become the key to point cloud applications.
Figure 1 shows a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present application can be applied.
As shown in Figure 1, the system architecture 100 includes a plurality of terminal devices that can communicate with each other through, for example, a network 150. For example, the system architecture 100 may include a first terminal device 110 and a second terminal device 120 interconnected through the network 150. In the embodiment of Figure 1, the first terminal device 110 and the second terminal device 120 perform one-way data transmission.
For example, the first terminal device 110 may encode point cloud data (such as point cloud data collected by the terminal device 110) for transmission to the second terminal device 120 through the network 150. The encoded point cloud data is transmitted in the form of one or more encoded point cloud bitstreams. The second terminal device 120 may receive the encoded point cloud data from the network 150, decode it to recover the point cloud data, and display point cloud content according to the recovered point cloud data.
In one embodiment of the present application, the system architecture 100 may include a third terminal device 130 and a fourth terminal device 140 that perform bidirectional transmission of encoded point cloud data; the bidirectional transmission may occur, for example, during a video conference. For bidirectional data transmission, each of the third terminal device 130 and the fourth terminal device 140 may encode point cloud data (such as point cloud data collected by that terminal device) for transmission through the network 150 to the other of the two terminal devices. Each of the third terminal device 130 and the fourth terminal device 140 may also receive the encoded point cloud data transmitted by the other terminal device, decode it to recover the point cloud data, and display point cloud content on an accessible display device according to the recovered point cloud data.
In the embodiment of Figure 1, the first terminal device 110, the second terminal device 120, the third terminal device 130, and the fourth terminal device 140 may be servers, personal computers, and smartphones, but the principles disclosed in the present application are not limited thereto. The embodiments disclosed in the present application are also applicable to laptop computers, tablet computers, media players, and/or dedicated video conferencing equipment. The network 150 represents any number of networks that convey encoded point cloud data among the first terminal device 110, the second terminal device 120, the third terminal device 130, and the fourth terminal device 140, including, for example, wired and/or wireless communication networks. The communication network 150 may exchange data in circuit-switched and/or packet-switched channels. The network may include a telecommunications network, a local area network, a wide area network, and/or the Internet. For the purposes of the present application, unless explained below, the architecture and topology of the network 150 may be immaterial to the operations disclosed in the present application.
The server in the embodiments of the present application may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services. The terminal may be a smartphone, tablet computer, laptop computer, desktop computer, smart speaker, smart watch, vehicle-mounted terminal, smart TV, and so on, but is not limited thereto. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the present application.
After the point cloud media is encoded, the encoded data stream needs to be encapsulated and transmitted to the user. Correspondingly, on the point cloud media player side, the point cloud file first needs to be decapsulated and then decoded, and finally the decoded data stream is presented.
Figure 2 shows a schematic diagram of the point cloud media encoding and decoding process in an application scenario according to an embodiment of the present application.
A real-world visual scene A can be captured by collecting point cloud data with a collection device 210. The collection device 210 may be, for example, a set of cameras or a camera device with multiple lenses and sensors. The collection result is point cloud source data B, which is a frame sequence composed of a large number of point cloud frames. An encoder 220 may encode one or more point cloud frames to obtain an encoded G-PCC bitstream, which may specifically include an encoded geometry bitstream and an encoded attribute bitstream E. A file encapsulator 230 may encapsulate one or more encoded bitstreams according to a specific media container file format to obtain a media file F for file playback, or a series of initialization segments and media segments Fs for streaming. In some embodiments of the present application, the media container file format may be, for example, the ISO base media file format specified in ISO/IEC 14496-12 [ISOBMFF]. The file encapsulator 230 may also encapsulate metadata in the media file F or the media segments Fs.
The media file F output by the file encapsulator 230 is the same as the media file F' input to the file decapsulator 240. By processing the media file F' or the received media segments F's, the file decapsulator can extract the encoded bitstream E' and parse the metadata. A decoder 250 may decode the G-PCC bitstream into a decoded signal D' and generate point cloud data from the decoded signal D'. Where applicable, the point cloud data may be rendered by a renderer 260 and displayed on the screen of a head-mounted display or any other display device, based on the current viewing position, viewing direction, or viewport determined by various types of sensors (for example, head tracking). Besides being used by the player to access the appropriate part of the decoded point cloud data, the current viewing position or viewing direction may also be used for decoding optimization. In viewport-dependent content delivery 270, the current viewing position and viewing direction are also passed to a strategy module, which may be used to determine the tracks to be received.
In point cloud media transmission, streaming technology is usually used to handle the transmission of media resources between the server and the client. Common media streaming technologies include DASH (Dynamic Adaptive Streaming over HTTP), HLS (HTTP Live Streaming), and SMT (Smart Media Transport).
Taking DASH as an example, DASH is an adaptive bitrate streaming technology that enables high-quality streaming media to be delivered over the Internet through conventional HTTP web servers. DASH breaks the content into a series of small HTTP-based file segments, each containing a short interval of playable content, while the total length of the content may be several hours (such as a movie or a live sports broadcast). The content is made available as alternative segments at multiple bitrates, so that versions at multiple bitrates are available for selection. When the media content is played by a DASH client, the client automatically selects which alternative to download and play based on current network conditions. The client selects the highest-bitrate segment that can be downloaded in time for playback, thereby avoiding stalls or rebuffering events. As a result, a DASH client can seamlessly adapt to changing network conditions and provide a high-quality playback experience with fewer stalls and less rebuffering.
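The client-side selection described above can be illustrated with a minimal sketch. It is not taken from any DASH client or from the DASH standard: the names Representation, select_representation, and safety_factor are hypothetical, and the rule "highest bitrate not exceeding a fraction of the measured throughput" is only one simple selection policy among many.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Representation:
    """One bitrate alternative of the same content (hypothetical type)."""
    rep_id: str
    bandwidth_bps: int  # bitrate needed to download segments in time


def select_representation(
    representations: Sequence[Representation],
    measured_throughput_bps: float,
    safety_factor: float = 0.8,  # assumed headroom, not a standard value
) -> Representation:
    """Pick the highest-bitrate alternative that fits the throughput budget.

    Falls back to the lowest-bitrate alternative when none fits, so that
    playback can continue under poor network conditions.
    """
    if not representations:
        raise ValueError("at least one representation is required")
    budget = measured_throughput_bps * safety_factor
    affordable = [r for r in representations if r.bandwidth_bps <= budget]
    if affordable:
        return max(affordable, key=lambda r: r.bandwidth_bps)
    return min(representations, key=lambda r: r.bandwidth_bps)
```

A client would typically re-run such a selection before requesting each segment, using a throughput estimate derived from recent downloads.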
DASH uses the existing HTTP web server infrastructure. It allows devices such as Internet TVs, TV set-top boxes, desktop computers, smartphones, and tablet computers to consume multimedia content (such as video, television, and radio) delivered over the Internet, and can cope with changing Internet reception conditions.
Taking geometry-based point cloud compression (G-PCC) as an example, each G-PCC point cloud sample corresponds to one point cloud frame and consists of one or more G-PCC data units belonging to the same presentation time.
Figure 3 shows the syntax structure for encapsulating point cloud samples based on the TLV bytestream format in one embodiment of the present application. Each point cloud sample consists of one or more G-PCC data units, and gpcc_unit contains a single G-PCC data unit. The G-PCC data units in the same point cloud sample correspond to the same point cloud frame and belong to the same presentation time. The TLV bytestream format (Type-length-value bytestream format) refers to a structure composed of the type of the data (Type), the length of the data (Length), and the value of the data (Value). For details of the TLV bytestream format, refer to the standard ISO/IEC 23090-9.
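As an illustration of the type-length-value structure, the following minimal sketch splits the bytes of one point cloud sample into its data units. The field widths used here (an 8-bit type followed by a 32-bit big-endian payload length) are an assumption of this sketch, because the syntax of Figures 3 and 4 is not reproduced in the text; the normative layout should be checked against ISO/IEC 23090-9. The function name iter_tlv_units is hypothetical.

```python
import struct
from typing import Iterator, Tuple

# Assumed header layout: 1-byte type + 4-byte big-endian payload length.
_HEADER = struct.Struct(">BI")


def iter_tlv_units(sample: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (tlv_type, payload) for each data unit in a sample."""
    offset = 0
    while offset < len(sample):
        if offset + _HEADER.size > len(sample):
            raise ValueError("truncated TLV header")
        tlv_type, length = _HEADER.unpack_from(sample, offset)
        offset += _HEADER.size
        if offset + length > len(sample):
            raise ValueError("truncated TLV payload")
        yield tlv_type, sample[offset:offset + length]
        offset += length
```

Because every unit carries its own length, a reader can skip units whose type it does not need without parsing their payloads.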
Figure 4 shows the syntax structure of a data unit encapsulated based on the TLV bytestream format in one embodiment of the present application, where tlv_type is a type field used to indicate the type of the data unit. Table 1 gives the semantics of the different values of the data unit type field in one embodiment of the present application.
Table 1

tlv_type    Semantic description
0           Sequence parameter set
1           Geometry parameter set
2           Geometry data unit
3           Attribute parameter set
4           Attribute data unit
5           Tile inventory
6           Frame boundary marker
As shown in Table 1, different values of the type field indicate different data unit types.
When the type field is 0, the data unit is a sequence parameter set (SPS).
When the type field is 1, the data unit is a geometry parameter set (GPS).
When the type field is 2, the data unit is a geometry data unit.
When the type field is 3, the data unit is an attribute parameter set (APS).
When the type field is 4, the data unit is an attribute data unit.
When the type field is 5, the data unit is a tile inventory.
When the type field is 6, the data unit is a frame boundary marker.
For details of these data unit types, refer to the standard ISO/IEC 23090-9.
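The type values of Table 1 can be illustrated with a minimal parsing sketch. It assumes that each data unit starts with a 1-byte tlv_type followed by a 4-byte big-endian payload length; that header layout is an assumption made here for illustration, since the syntax of Figure 4 is not reproduced in this text. The names TlvType and iter_tlv_units are hypothetical.

```python
import struct
from enum import IntEnum


class TlvType(IntEnum):
    """Data unit types from Table 1."""
    SEQUENCE_PARAMETER_SET = 0
    GEOMETRY_PARAMETER_SET = 1
    GEOMETRY_DATA_UNIT = 2
    ATTRIBUTE_PARAMETER_SET = 3
    ATTRIBUTE_DATA_UNIT = 4
    TILE_INVENTORY = 5
    FRAME_BOUNDARY_MARKER = 6


def iter_tlv_units(buf: bytes):
    """Yield (TlvType, payload) pairs from a TLV byte stream.

    Assumed header: 1-byte type + 4-byte big-endian payload length.
    A type value outside Table 1 raises ValueError.
    """
    pos = 0
    while pos < len(buf):
        if pos + 5 > len(buf):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from(">BI", buf, pos)
        pos += 5
        if pos + length > len(buf):
            raise ValueError("truncated TLV payload")
        yield TlvType(tlv_type), buf[pos:pos + length]
        pos += length
```

A reader that only needs, for example, the geometry data units can filter the yielded pairs on TlvType.GEOMETRY_DATA_UNIT.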
In the related art of the present application, each point cloud frame of the point cloud media is encoded separately, which causes two problems in the encoding and decoding of point cloud media. First, in frame-based point cloud content the file size of each frame may be relatively small, which makes inefficient use of the I/O interface. Second, the decoder has to start from the initial bounding box and partition every single frame, so edge devices incur a large overhead in initializing the decoder.
The first problem can be solved easily by concatenating the encoded bitstreams of consecutive frames. The second problem, however, is hard to avoid unless the point clouds are combined before encoding. The embodiments of the present application propose a combined frame coding scheme that addresses both problems by coding a frame index in the combined point cloud. In addition, the embodiments of the present application can greatly improve coding efficiency, which also benefits the storage and use of frame-based point cloud content.
Combined frame coding combines multiple point cloud frames of the original sequence before encoding them. For example, if the original sequence contains 100 point cloud frames and every 4 frames are combined, the original point cloud sequence is restructured into a 25-frame point cloud sequence, which is then encoded. This coding approach yields a considerable gain in coding efficiency for scenes in which each frame contains few points or in which consecutive frames are strongly correlated.
Figure 5 is a schematic diagram of the principle of combining multiple frames of point cloud data in one embodiment of the present application. As shown in Figure 5, the first point cloud frame Frame1 and the second point cloud frame Frame2 are combined to form one combined frame (Combined Frame). After combined coding of multiple frames is applied, each frame of the resulting sequence contains multiple point cloud subframes. A point cloud subframe is a partial representation of a point cloud frame that consists of the points having the same frame number or frame index attribute value. When two or more point cloud frames are associated with one another, the individual octrees of the frames have similar structures at the higher levels; the leaf nodes of the combined frame also contain some duplicate content from different frames, that is, overlapping point cloud data.
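The frame combination described above can be sketched as follows. This is a simplified illustration, not the encoder of the present application: points are plain (x, y, z) tuples, the frame index is appended to each point as a fourth component, and the function names are hypothetical.

```python
from collections import defaultdict


def combine_frames(frames, group_size):
    """Merge every `group_size` consecutive frames into one combined frame.

    Each input frame is a list of (x, y, z) points. Every point in a
    combined frame is tagged with the index of the subframe it came from.
    """
    combined = []
    for start in range(0, len(frames), group_size):
        group = frames[start:start + group_size]
        combined.append([
            (x, y, z, idx)
            for idx, frame in enumerate(group)
            for (x, y, z) in frame
        ])
    return combined


def split_subframes(combined_frame):
    """Group the points of a combined frame by their frame index value."""
    subframes = defaultdict(list)
    for x, y, z, idx in combined_frame:
        subframes[idx].append((x, y, z))
    return dict(subframes)
```

With 100 single-point frames and group_size=4, combine_frames returns 25 combined frames, and split_subframes recovers the four subframes of each one from the frame index values.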
The technical solutions provided by the present application, including the point cloud media encoding method, the point cloud media decoding method, the point cloud media encoding apparatus, the point cloud media decoding apparatus, the computer-readable medium, the electronic device, and the computer program product, are described in detail below with reference to specific implementations. The technical solutions of the embodiments of the present application can be applied at the server side, the player side, or an intermediate node of an immersive media system.
Figure 6 is a flowchart of the steps of a point cloud media decoding method in one embodiment of the present application. The method can be applied to various electronic devices at the server, the client, or an intermediate node of a point cloud media system; the embodiments of the present application take as an example a point cloud media decoding method performed by a client device on which a point cloud decoding apparatus is installed. As shown in Figure 6, the point cloud media decoding method includes the following steps S610 to S640.
Step S610: obtain a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks;
Step S620: parse the media file data box of each subsample in a point cloud sample to obtain the value of the subsample flag field;
Step S630: according to the value of the subsample flag field, obtain the index information of the one or more point cloud subframes corresponding to each data unit in the subsample; when one data unit in the subsample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
Step S640: decapsulate and decode the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
A subsample is a data encapsulation unit within a point cloud sample. The subsample flag field can also indicate how subsamples are divided; different division modes produce different types of subsamples within a point cloud sample. For example, dividing by data unit, by spatial tile, or by point cloud subframe yields subsamples of different data sizes.
A point cloud subframe is a partial representation of a point cloud frame that consists of the points having the same index information (for example, the same frame number or frame index attribute value). When a point cloud frame is a combined frame formed from multiple point cloud frames, each point cloud frame within the combined frame constitutes a point cloud subframe.
In the point cloud media decoding method provided by the embodiments of the present application, the media file data box of each subsample in a point cloud sample indicates the correspondence between each data unit in the subsample and the index information of one or more point cloud subframes. The one or more point cloud subframes corresponding to each data unit in the point cloud sample can therefore be decoded together as a combined frame. On the one hand, this reduces the computing resources wasted on separately decoding point cloud frames with little content; on the other hand, it identifies the point cloud subframes whose point cloud data overlap, which improves the decoding efficiency of the point cloud media.
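The overlap condition of step S630 can be sketched in a few lines. The sketch assumes that the correspondence between data units and subframe indexes has already been parsed into a dictionary; the dictionary layout and the function name are hypothetical.

```python
def find_overlapping_units(unit_to_subframes):
    """Return the data units that correspond to two or more subframes.

    `unit_to_subframes` maps a data unit identifier to the list of
    point cloud subframe indexes signalled for that data unit. A data
    unit listed with at least two indexes carries point cloud data
    shared by (overlapping between) those subframes.
    """
    return {
        unit: sorted(set(indexes))
        for unit, indexes in unit_to_subframes.items()
        if len(set(indexes)) >= 2
    }
```

For example, a mapping in which one geometry data unit is associated with subframes 1 and 2 reports exactly that data unit as shared.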
The specific implementation of each step of the point cloud media decoding method of the present application is described in detail below with reference to several embodiments.
In step S610, a point cloud media file is obtained, the point cloud media file including point cloud samples encapsulated in one or more tracks.
The point cloud media file may be a media file or media segment obtained after the encoding and encapsulation shown in Figure 2; the media file or media segment carries the point cloud bitstream to be transmitted.
In one embodiment of the present application, the data source may encapsulate the point cloud bitstream into a single track according to the geometry parameter information, the attribute parameter information, and the point cloud slice parameter information contained in the bitstream, or it may re-encapsulate a single-track point cloud media file into a point cloud media file containing multiple tracks.
A track is a volumetric visual track that carries a coded geometry bitstream or a coded attribute bitstream; it may also be a volumetric visual track that carries both a coded geometry bitstream and a coded attribute bitstream.
When the point cloud bitstream is encapsulated in a single track, each point cloud sample can correspond to one complete point cloud frame.
In step S620, the media file data box of each subsample in the point cloud sample is parsed to obtain the value of the subsample flag field.
The media file data box may be a data box based on the ISO Base Media File Format (ISOBMFF). For details of ISOBMFF, refer to the standard ISO/IEC 14496-12.
When the G-PCC bitstream is carried in a single track, simple ISOBMFF encapsulation can be used by storing the G-PCC bitstream in the single track without further processing.
Figure 7 shows an exemplary structure for encapsulating point cloud samples in a single track in one embodiment of the present application. Here, moov denotes the metadata of the point cloud samples, which includes boxes and fields such as "trak", "stbl", "stsd", "gpe1", "gpcC", "xPS", "stsz", and "subs"; mdat denotes the media data itself, including each point cloud sample. As shown in Figure 7, the components (Component) can still be described by a "subs" data box with flags=0, while the subframe index subframe_idx is provided by another "subs" data box with flags=2. This follows the ISOBMFF rule that when multiple SubSampleInformationBox instances are present in the same container box, the value of flags must differ in each of them.
Still referring to Figure 7, in a "subs" data box with flags=2, each point cloud sample Sample 1 ... Sample n contains one or more subsamples divided by subframe. For example, point cloud sample Sample 1 contains X subsamples, which correspond to the point cloud subframes with subframe index values subframe_idx=1 ... X. Each point cloud subframe in turn contains different data units, such as the geometry data unit, the attribute data unit, and the frame index attribute data unit shown in the figure.
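The rule that coexisting "subs" data boxes must carry different flags values can be sketched as a small lookup. The SubsBox class below is a hypothetical in-memory stand-in for a parsed SubSampleInformationBox, not an API of any existing library.

```python
from dataclasses import dataclass, field


@dataclass
class SubsBox:
    """Hypothetical parsed 'subs' box: its flags value and its entries."""
    flags: int
    entries: list = field(default_factory=list)


def select_subs_box(boxes, flags):
    """Return the 'subs' box with the given flags value, or None.

    Raises ValueError if two boxes in the same container share a flags
    value, which the ISOBMFF rule quoted above does not allow.
    """
    seen = [box.flags for box in boxes]
    if len(seen) != len(set(seen)):
        raise ValueError("duplicate flags among 'subs' boxes in one container")
    for box in boxes:
        if box.flags == flags:
            return box
    return None
```

In the layout of Figure 7, select_subs_box(boxes, 0) would return the component description and select_subs_box(boxes, 2) the subframe description.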
Figure 8 shows an exemplary structure for encapsulating the geometry bitstream and the attribute bitstreams in multiple tracks in one embodiment of the present application. Here, ftyp denotes the file type and describes the version of the specification with which the point cloud samples comply; moov denotes the metadata of the point cloud samples; mdat denotes the media data carried in the point cloud samples.
As shown in Figure 8, in multi-track encapsulation mode the bitstream data of each point cloud component is mapped to a separate track. There are two types of G-PCC component tracks: G-PCC geometry tracks and G-PCC attribute tracks. Each point cloud sample in a track contains at least one G-PCC unit, which carries the data units of a single G-PCC component rather than a multiplex of geometry and attribute data units or of different attribute data units. A G-PCC attribute track should not multiplex different attribute substreams, such as color and reflectance.
In one embodiment of the present application, information about point cloud subframes can be signalled by extending the codec-specific parameter field of the subsample information data box SubSampleInformationBox. In the embodiments of the present application, the subsample flag field also indicates how subsamples are divided, and the division modes of the subsamples in a point cloud sample may include:
when the subsample flag field has a first value, subsamples are divided by data unit, so that one subsample contains one data unit;
when the subsample flag field has a second value, subsamples are divided by spatial tile, so that one subsample contains one or more consecutive data units corresponding to one first division object, the first division object including at least one of a spatial tile, a parameter set, tile inventory information, or a frame boundary marker; and
when the subsample flag field has a third value, subsamples are divided by point cloud subframe, so that one subsample contains one or more consecutive data units corresponding to one second division object, the second division object including one complete point cloud subframe.
In this embodiment, the different values of the subsample flag field distinguish the division modes, namely division by data unit, division by spatial tile, and division by point cloud subframe. Subsamples can thus be divided in different ways, which makes it easier to decode the combinations of data units in subsamples of different data sizes, reduces wasted computing resources, and improves decoding efficiency.
In one embodiment of the present application, when the subsample flag field has the third value, the media file data box of the subsample includes:
a subframe index field, which indicates the index information of the point cloud subframe contained in the current subsample. The point cloud subframe can thus be determined from the index information in the subframe index field, so that the one or more point cloud subframes corresponding to each data unit in the point cloud sample are decoded together as a combined frame. On the one hand, this reduces the computing resources wasted on separately decoding point cloud frames with little content; on the other hand, it identifies the point cloud subframes whose point cloud data overlap, which improves the decoding efficiency of the point cloud media.
Taking ISOBMFF data boxes as an example, for the subsample information data box SubSampleInformationBox in a point cloud file, the definition of a subsample is based on the value of the flags field of that SubSampleInformationBox.
As an implementation of the above embodiments in a specific application scenario, Figure 9 shows the syntax structure of the codec-specific parameter field codec_specific_parameters of the SubSampleInformationBox in one application scenario of the embodiments of the present application.
As shown in Figure 9, when the subsample flag field flags is 0, subsamples are divided by G-PCC data unit. One subsample contains exactly one G-PCC unit.
In this case, the subsample information data box SubSampleInformationBox may include the following fields:
payloadType, which indicates the tlv_type of the G-PCC unit contained in the subsample;
attrIdx, which indicates the value of the ash_attr_sps_attr_idx field corresponding to the attribute data contained in the subsample.
When the subsample flag field flags is 1, subsamples are divided by spatial tile. One subsample contains one or more consecutive data units corresponding to one tile, or one or more consecutive data units corresponding to a parameter set, tile inventory information, or a frame boundary marker.
In this case, the subsample information data box SubSampleInformationBox may include the following fields:
tile_data, where a value of 1 indicates that the subsample contains the geometry data or attribute data of the corresponding tile, and a value of 0 indicates that the subsample contains parameter set data, tile inventory information, or a frame boundary marker.
tile_id, which indicates the index of the tile with which the data in the subsample is associated.
When the subsample flag field flags is 2, subsamples are divided by subframe. One subsample contains the consecutive data units corresponding to one complete point cloud subframe.
In this case, the subsample information data box SubSampleInformationBox may include the following field:
the subframe index field subframe_idx, which indicates the value of the frame number attribute corresponding to the point cloud subframe contained in the current subsample.
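The three flags cases described above can be sketched as a decoder for the 32-bit codec_specific_parameters value. The bit positions used here are assumptions chosen for illustration, because the syntax of Figure 9 is not reproduced in this text: payloadType in the top 8 bits followed by an 8-bit attrIdx, tile_data in the top bit with tile_id in the low 24 bits, and subframe_idx in the top 16 bits. Reading attrIdx only for attribute data units (payloadType equal to 4 in Table 1) is likewise an assumption. The function name is hypothetical.

```python
def parse_codec_specific_parameters(flags: int, csp: int) -> dict:
    """Decode a 32-bit codec_specific_parameters value (assumed layout)."""
    if not 0 <= csp <= 0xFFFFFFFF:
        raise ValueError("codec_specific_parameters must fit in 32 bits")
    if flags == 0:
        # Division by G-PCC data unit.
        payload_type = (csp >> 24) & 0xFF
        fields = {"payloadType": payload_type}
        if payload_type == 4:  # attribute data unit (Table 1)
            fields["attrIdx"] = (csp >> 16) & 0xFF
        return fields
    if flags == 1:
        # Division by spatial tile.
        tile_data = (csp >> 31) & 0x1
        fields = {"tile_data": tile_data}
        if tile_data == 1:
            fields["tile_id"] = csp & 0xFFFFFF
        return fields
    if flags == 2:
        # Division by point cloud subframe.
        return {"subframe_idx": (csp >> 16) & 0xFFFF}
    raise ValueError(f"unsupported flags value: {flags}")
```

Under these assumed positions, the value 0x04020000 with flags=0 decodes to an attribute data unit with attrIdx 2.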
In one embodiment of the present application, in order to handle in a unified way both the scenario in which subframes do not overlap at all and the scenario in which subframes overlap, the division modes of the subsamples in a point cloud sample may include:
when the subsample flag field has a first value, subsamples are divided by data unit, so that one subsample contains one data unit;
when the subsample flag field has a second value, subsamples are divided by spatial tile, so that one subsample contains one or more consecutive data units corresponding to one first division object, the first division object including at least one of a spatial tile, a parameter set, tile inventory information, or a frame boundary marker; and
when the subsample flag field has a third value, subsamples are divided by point cloud subframe, so that one subsample contains one or more consecutive data units corresponding to one second division object, the second division object including one or more point cloud subframes.
In this embodiment, the different values of the subsample flag field distinguish the division modes, namely division by data unit, division by spatial tile, and division by point cloud subframe, and subframes can be divided with or without overlap. The combinations of data units in subsamples of different data sizes can thus be decoded, which reduces wasted computing resources and improves decoding efficiency.
In one embodiment of the present application, when the subsample flag field has the third value, the media file data box of the subsample includes:
a subframe complete flag field, which indicates whether the current subsample contains all the data constituting the point cloud subframe;
a subframe number field, which indicates the number of point cloud subframes corresponding to the current subsample; and
a subframe index field, which indicates the index information of the point cloud subframe corresponding to the current subsample.
In this embodiment, the subframe complete flag field indicates whether the subsample contains all the data constituting the point cloud subframe, the subframe number field indicates the number of point cloud subframes, and the subframe index field indicates the index information of the point cloud subframes. The media file data box of the subsample can thus describe the corresponding point cloud subframes in detail, so that the one or more point cloud subframes corresponding to each data unit in the point cloud sample are decoded together as a combined frame. On the one hand, this reduces the computing resources wasted on separately decoding point cloud frames with little content; on the other hand, it identifies the point cloud subframes whose point cloud data overlap, which improves the decoding efficiency of the point cloud media.
In one embodiment of the present application, when the point cloud samples are encapsulated in one track, all the data constituting a point cloud subframe includes all the geometry data and all the attribute data; when the point cloud samples are encapsulated in multiple tracks, all the data constituting a point cloud subframe includes all the geometry data or all the attribute data.
As an implementation of the above embodiments in a specific application scenario, Figure 10 shows the syntax structure of the codec-specific parameter field codec_specific_parameters of the SubSampleInformationBox in an application scenario that signals overlapping and non-overlapping subframes in a unified way.
As shown in Figure 10, when the subsample flag field flags is 0, subsamples are divided by G-PCC data unit. One subsample contains exactly one data unit.
In this case, the subsample information data box SubSampleInformationBox may include the following fields:
payloadType, which indicates the tlv_type of the data unit contained in the subsample;
attrIdx, which indicates the value of the ash_attr_sps_attr_idx field corresponding to the attribute data contained in the subsample.
When the subsample flag field flags is 1, subsamples are divided by spatial tile. One subsample contains one or more consecutive data units corresponding to one tile, or one or more consecutive data units corresponding to a parameter set, tile inventory information, or a frame boundary marker.
In this case, the subsample information data box SubSampleInformationBox may include the following fields:
tile_data, where a value of 1 indicates that the subsample contains the geometry data or attribute data of the corresponding tile, and a value of 0 indicates that the subsample contains parameter set data, tile inventory information, or a frame boundary marker.
tile_id, which indicates the index of the tile with which the data in the subsample is associated.
When the subsample flag field flags is 2, subsamples are divided by subframe. One subsample contains one or more consecutive data units, which correspond to one or more point cloud subframes.
In this case, the subsample information data box SubSampleInformationBox may include the following fields:
the subframe complete flag field complete_subframe_flag, where a value of 1 indicates that the data units of the current subsample contain all the data constituting the corresponding subframe, and a value of 0 indicates that the data units of the current subsample contain only part of the data constituting the corresponding subframe. (In single-track encapsulation mode, "all the data" means all the geometry data and attribute data; in multi-track encapsulation mode, "all the data" means all the geometry data or all the attribute data of a specific type.)
the subframe number field num_subframes, which indicates the number of subframes corresponding to the data units in the subsample. When this field is 1, the subframe corresponding to the current subsample is the one indicated by subframe_idx; when this field is greater than 1, the subframes corresponding to the current subsample are the subframe indicated by subframe_idx and the following num_subframes-1 subframes with consecutive frame numbers.
the subframe index field subframe_idx, which indicates the value of the frame number attribute corresponding to the point cloud subframe contained in the current subsample.
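The semantics of num_subframes and subframe_idx can be sketched as follows. The class is a hypothetical in-memory record of the three fields after parsing; it does not reflect the bit layout of Figure 10, which is not reproduced in this text.

```python
from dataclasses import dataclass


@dataclass
class SubframeSubsample:
    """Hypothetical parsed fields of one subsample when flags is 2."""
    complete_subframe_flag: int
    num_subframes: int
    subframe_idx: int


def covered_subframes(entry):
    """List the frame numbers of the subframes a subsample corresponds to.

    These are subframe_idx itself plus the following num_subframes - 1
    consecutive frame numbers.
    """
    if entry.num_subframes < 1:
        raise ValueError("num_subframes must be at least 1")
    return list(range(entry.subframe_idx,
                      entry.subframe_idx + entry.num_subframes))


def subsamples_for_subframe(entries, target_idx):
    """Return positions of the subsamples needed to decode one subframe."""
    return [
        pos for pos, entry in enumerate(entries)
        if target_idx in covered_subframes(entry)
    ]
```

A player that wants only one subframe can use subsamples_for_subframe to pick the subsamples to extract, including any subsample whose data is shared with neighbouring subframes.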
In this embodiment, for both single-track and multi-track encapsulation, the tracks in which the geometry data and attribute data are encapsulated can be set flexibly, so the scheme suits a variety of application scenarios and maintains decoding efficiency across them.
In one embodiment, when the subsample flag field has the first value, the media file data box of the point cloud subframe includes: a related subframe number field, which indicates the number of point cloud subframes corresponding to the current subsample; and a subframe index field, which indicates the index information of the point cloud subframe corresponding to the current subsample. Specifically, the information about point cloud subframes, including the related subframe number field and the subframe index field, can be signalled by extending the sample group tool of the media file data box. The number and index information of the point cloud subframes can thus be described, so that the one or more point cloud subframes corresponding to each data unit in the point cloud sample are decoded together as a combined frame. On the one hand, this reduces the computing resources wasted on separately decoding point cloud frames with little content; on the other hand, it identifies the point cloud subframes whose point cloud data overlap, which improves the decoding efficiency of the point cloud media.
In one embodiment of the present application, the information about point cloud subframes can be signalled by extending the sample group tool of the media file data box. In the embodiments of the present application, when the subsample flag field has the first value, the media file data box of the point cloud subframe includes:
a subsample number field, which indicates the number of subsamples contained in the current sample;
a related subframe number field, which indicates the number of point cloud subframes corresponding to the current subsample; and
a subframe index field, which indicates the index information of the point cloud subframe corresponding to the current subsample.
In this embodiment, the media file data box of the point cloud subframe includes the subsample number field, the related subframe number field, and the subframe index field, which describe the number of subsamples as well as the number and index information of the point cloud subframes, so that the one or more point cloud subframes corresponding to each data unit in the point cloud sample are decoded together as a combined frame. On the one hand, this reduces the computing resources wasted on separately decoding point cloud frames with little content; on the other hand, it identifies the point cloud subframes whose point cloud data overlap, which improves the decoding efficiency of the point cloud media.
As an implementation of the above embodiment in a specific application scenario, Figure 11 shows the syntax structure of the extended sample group tool in one application scenario of the embodiment of the present application.
As shown in Figure 11, the sample group tool in the media file data box may include the following fields:
The subsample number field subsample_count indicates the number of subsamples contained in the current sample.
The related subframe number field related_subframe_num indicates the number of point cloud subframes corresponding to the current subsample. A value of 0 indicates that the information contained in the current subsample is unrelated to the point cloud subframe division (for example, tile set information or a frame end identifier).
The subframe index field subframe_index indicates the sequence number of the point cloud subframe corresponding to the current subsample. Its value is the same as the value in the frame number attribute.
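The semantics of the Figure 11 fields can be illustrated with a small in-memory model. This is only a sketch: the text here does not give field widths or the exact box layout, so the list-of-lists representation and the function name `classify_subsamples` are assumptions made for illustration, not part of the described syntax.

```python
from typing import List


def classify_subsamples(subframe_indices_per_subsample: List[List[int]]) -> List[str]:
    """Classify each subsample by its related_subframe_num.

    Each inner list holds the subframe_index values of one subsample, so its
    length plays the role of related_subframe_num, and the length of the outer
    list plays the role of subsample_count. (Hypothetical representation.)
    """
    labels = []
    for indices in subframe_indices_per_subsample:
        related_subframe_num = len(indices)
        if related_subframe_num == 0:
            # Information unrelated to the subframe division,
            # e.g. tile set information or a frame end identifier.
            labels.append("unrelated")
        elif related_subframe_num == 1:
            labels.append("single")
        else:
            # One data unit shared by several subframes: overlapping data.
            labels.append("shared")
    return labels
```

For example, `classify_subsamples([[], [0], [0, 1]])` labels the third subsample as shared between subframes 0 and 1, which is the overlapping case the related subframe number field is meant to expose.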
In some optional implementations, information related to point cloud subframes (sub-frames) may be indicated only at the point cloud sample level rather than at the subsample (sub-sample) level. That is, when point cloud subframes overlap, the system layer identifies only the samples that contain point cloud subframes and the corresponding point cloud subframe index numbers. Figure 12 shows the syntax structure for identifying point cloud subframe information in a sample-level media file data box in one application scenario of the embodiment of the present application.
As shown in Figure 12, the media file data box in this embodiment of the present application may include the following fields:
The related subframe number field related_subframe_num indicates the number of point cloud subframes corresponding to the current sample.
The subframe index field subframe_index indicates the sequence number of the subframe corresponding to the current sample. Its value should be the same as the value in the frame number attribute.
In one embodiment of the present application, information related to point cloud subframes can be identified by defining a subsample subframe information data box, SubsampleSubframeInfoBox. In this embodiment, when the subsample flag field takes the first value, the media file data box of the point cloud subframe includes:
a subframe-related sample number field, which indicates the number of point cloud samples that contain multiple point cloud subframes;
a sample sequence number difference field, which indicates, in decoding order, the difference between the sequence number of the current point cloud sample containing multiple point cloud subframes and that of the previous point cloud sample containing multiple point cloud subframes;
a subsample number field, which indicates the number of subsamples contained in the current point cloud sample;
a related subframe number field, which indicates the number of point cloud subframes corresponding to the current subsample; and
a subframe index field, which indicates the index information of the point cloud subframes corresponding to the current subsample.
In this embodiment, the media file data box of the point cloud subframe further includes the subframe-related sample number field and the sample sequence number difference field, which describe the number of point cloud samples containing point cloud subframes and the sequence number differences between those samples. This makes it easier to perform combined decoding of the point cloud samples in order, which reduces wasted computing resources and improves decoding efficiency.
As an implementation of the above embodiments in a specific application scenario, Figure 13 shows the syntax structure for identifying subframe-related information through the subsample subframe information data box SubsampleSubframeInfoBox in one application scenario of the embodiment of the present application. The data box type of SubsampleSubframeInfoBox may be, for example, 'sbfi', and the box is contained in a SampleEntry or TrackFragmentBox. The subsample subframe information data box indicates, for point cloud samples that contain multiple subframes, the subframe information corresponding to each subsample divided on the basis of G-PCC data units. When this data box is present in a track, the subsample flag field in the subsample information data box must take the value 0, that is, the data-unit-based subsample division method is used.
As shown in Figure 13, the subsample subframe information data box SubsampleSubframeInfoBox may include the following fields:
The subframe-related sample number field subframe_related_sample_num indicates the number of point cloud samples that contain multiple point cloud subframes.
The sample sequence number difference field sample_delta indicates, in decoding order, the difference between the sequence number of the current sample containing multiple subframes and that of the previous sample containing multiple subframes. For the first point cloud sample containing multiple point cloud subframes, the value of this field is the sequence number of that point cloud sample.
The subsample number field subsample_count indicates the number of subsamples contained in the current sample.
The related subframe number field related_subframe_num indicates the number of point cloud subframes corresponding to the current subsample. A value of 0 indicates that the information contained in the current subsample is unrelated to the point cloud subframe division (for example, tile set information or a frame end identifier).
The subframe index field subframe_index indicates the sequence number of the point cloud subframe corresponding to the current subsample. Its value is the same as the value in the frame number attribute.
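Because sample_delta is coded as a difference, a reader has to accumulate the values to recover which samples contain multiple subframes. A minimal sketch of that step follows; the function name `sample_numbers_from_deltas` is hypothetical, and the deltas are assumed to have already been read from the box in decoding order.

```python
from itertools import accumulate
from typing import List


def sample_numbers_from_deltas(sample_deltas: List[int]) -> List[int]:
    """Recover absolute sample sequence numbers from sample_delta values.

    The first delta is the sequence number of the first sample that contains
    multiple subframes; every later delta is the distance to the previous
    such sample, so a running sum yields the absolute numbers.
    """
    return list(accumulate(sample_deltas))


# Samples 3, 5 and 10 contain multiple subframes.
print(sample_numbers_from_deltas([3, 2, 5]))
```

The number of entries passed in corresponds to subframe_related_sample_num.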
In one embodiment of the present application, the media file data box can indicate the presentation time of point cloud subframes. In this embodiment, when the subsamples in a point cloud sample are divided on the basis of point cloud subframes, the media file data box of the subsample includes:
a presentation time flag field, which indicates whether the point cloud subframes contained in the point cloud sample have the same presentation duration; and
a subsample duration field, which indicates the presentation duration of the current subsample when the point cloud subframes contained in the point cloud sample have different presentation durations.
In this embodiment, the presentation time flag field in the media file data box indicates whether the point cloud subframes have the same presentation duration, and the subsample duration field indicates the presentation duration of each subsample. The presentation duration of every subsample can therefore be signaled, so that content is displayed according to those durations and the media presentation quality is preserved.
As an implementation of the above embodiment in a specific application scenario, Figure 14 shows the syntax structure for identifying subframe presentation time information through the subsample subframe information data box SubsampleSubframeInfoBox in one application scenario of the embodiment of the present application.
As shown in Figure 14, the codec-specific parameter field codec_specific_parameters in the subsample subframe information data box SubsampleSubframeInfoBox includes the following fields:
The presentation time flag field with_unique_duration_flag: a value of 0 indicates that the multiple subframes contained in the sample have the same presentation duration, so the presentation duration of each subframe can be calculated from the presentation duration of the sample itself and the number of subframes in the sample. In this case, the value of the sample_delta field corresponding to the sample should be an integer multiple of the number of subframes. A value of 1 indicates that the multiple subframes contained in the sample have different presentation durations.
The subsample duration field sub_sample_duration indicates the presentation duration of the subsample. The sum of the values of this field over the subsamples of a sample should equal the value of the sample_delta field corresponding to that sample.
In one embodiment of the present application, the media file data box can be extended to indicate, in a unified way, the presentation duration of each subframe in both subframe scenarios. On this basis, the media file data box of the point cloud sample includes:
a subframe number field, which indicates the number of point cloud subframes contained in the current point cloud sample;
a presentation time flag field, which indicates whether the point cloud subframes contained in the point cloud sample have the same presentation duration;
a subframe index field, which indicates the index information of the point cloud subframe corresponding to the current subsample when the point cloud subframes contained in the point cloud sample have different presentation durations; and
a subsample duration field, which indicates the presentation duration of the current subsample when the point cloud subframes contained in the point cloud sample have different presentation durations.
In this embodiment, the subframe number field indicates the number of point cloud subframes and the subframe index field indicates their index information, so that the one or more point cloud subframes corresponding to each data unit in the point cloud sample can be decoded jointly as a combined frame. On the one hand, this reduces the computing resources wasted by separately decoding point cloud frames with little content; on the other hand, it identifies point cloud subframes whose point cloud data overlap, which improves the decoding efficiency of the point cloud media. In addition, the presentation time flag field in the media file data box indicates whether the point cloud subframes have the same presentation duration, and the subsample duration field indicates the presentation duration of each subsample, so that content is displayed according to those durations and the media presentation quality is preserved.
As an implementation of the above embodiment in a specific application scenario, Figure 15 shows the syntax structure for identifying subframe presentation time information through the extended media file data box in one application scenario of the embodiment of the present application.
As shown in Figure 15, the media file data box in this embodiment of the present application includes the following fields:
The subframe number field nb_subframes indicates the number of subframes corresponding to the current sample.
The presentation time flag field with_unique_duration_flag: a value of 0 indicates that the multiple subframes corresponding to the sample have the same presentation duration, so the presentation duration of each subframe can be calculated from the presentation duration of the sample itself and the number of subframes in the sample. In this case, the value of the sample_delta field corresponding to the sample should be an integer multiple of the number of subframes. A value of 1 indicates that the multiple subframes contained in the sample have different presentation durations.
The subframe index field subframe_index indicates the sequence number of the point cloud subframe.
The subsample duration field sub_sample_duration indicates the presentation duration of the corresponding subframe. The sum of the values of this field over all subframes corresponding to the sample should equal the value of the sample_delta field corresponding to that sample.
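The two duration rules can be sketched as a single helper. This is an illustration under assumptions: the function name `subframe_durations` and the choice to raise `ValueError` on inconsistent input are not from the source, and the field values are assumed to have already been parsed into integers in the track timescale.

```python
from typing import List, Optional


def subframe_durations(
    sample_delta: int,
    nb_subframes: int,
    with_unique_duration_flag: int,
    sub_sample_durations: Optional[List[int]] = None,
) -> List[int]:
    """Return the presentation duration of each subframe in one sample."""
    if nb_subframes <= 0:
        raise ValueError("nb_subframes must be positive")

    if with_unique_duration_flag == 0:
        # All subframes share the same duration, so sample_delta must be
        # an integer multiple of the number of subframes.
        if sample_delta % nb_subframes != 0:
            raise ValueError("sample_delta is not a multiple of nb_subframes")
        return [sample_delta // nb_subframes] * nb_subframes

    # Flag == 1: durations are signaled per subframe via sub_sample_duration.
    if sub_sample_durations is None or len(sub_sample_durations) != nb_subframes:
        raise ValueError("one sub_sample_duration per subframe is required")
    if sum(sub_sample_durations) != sample_delta:
        raise ValueError("sub_sample_duration values must sum to sample_delta")
    return list(sub_sample_durations)
```

With `sample_delta = 3000` and three subframes, the flag value 0 yields 1000 per subframe; with the flag set to 1, explicit values such as `[1000, 2000]` for two subframes are accepted only because they sum to the sample's `sample_delta`.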
Based on the schemes provided in the above embodiments for identifying subframe-related information through media file data boxes, the spatial information of a point cloud subframe can be obtained implicitly. When multiple subsample (sub-sample) division methods coexist, the correspondence between point cloud subframes (sub-frames) and spatial blocks (tiles) can be found from the information carried by each division method.
Figure 16 shows a structural block diagram of determining the spatial block correspondence based on three subsample division methods (data unit, spatial block, and point cloud subframe) in one embodiment of the present application. A subsample (sub-sample) with flag = 0 corresponds to one G-PCC unit; a subsample with flag = 1 corresponds to one point cloud spatial block (tile); and a subsample with flag = 2 corresponds to one subframe (sub-frame).
As shown in Figure 16, division based on point cloud subframes yields two point cloud subsamples: a first subframe with sub frame index = 0 and a second subframe with sub frame index = 1.
Subsample division based on spatial blocks identifies a first spatial block tile0 and a second spatial block tile1 corresponding to the first subframe, and a third spatial block tile2 corresponding to the second subframe.
Subsample division based on data units identifies the data units corresponding to the first spatial block tile0, shown in the figure as the geometry slices Geo slice0 and Geo slice1, the color attribute slices Attr color slice0 and Attr color slice1, and the frame index attribute slices Attr frameIdx slice0 and Attr frameIdx slice1.
It likewise identifies the data units corresponding to the second spatial block tile1, shown in the figure as the geometry slices Geo slice2 and Geo slice3, the color attribute slices Attr color slice2 and Attr color slice3, and the frame index attribute slices Attr frameIdx slice2 and Attr frameIdx slice3.
It also identifies the data units corresponding to the third spatial block tile2, shown in the figure as the geometry slices Geo slice4 and Geo slice5, the color attribute slices Attr color slice4 and Attr color slice5, and the frame index attribute slices Attr frameIdx slice4 and Attr frameIdx slice5.
Figure 17 shows a structural block diagram of determining the spatial block correspondence based on two subsample division methods (data unit and spatial block) in one embodiment of the present application.
As shown in Figure 17, subsample division based on spatial blocks identifies a first spatial block tile0 and a second spatial block tile1.
Subsample division based on data units identifies the data units corresponding to the first spatial block tile0. As shown in the figure, these are the geometry slice Geo slice0, the color attribute slice Attr color slice0, and the frame index attribute slice Attr frameIdx slice0, which correspond to subframe sub-frame0, and the geometry slice Geo slice1, the color attribute slice Attr color slice1, and the frame index attribute slice Attr frameIdx slice1, which correspond to both sub-frame0 and sub-frame1.
It likewise identifies the data units corresponding to the second spatial block tile1. As shown in the figure, these are the geometry slice Geo slice2, the color attribute slice Attr color slice2, and the frame index attribute slice Attr frameIdx slice2, which correspond to both sub-frame0 and sub-frame1, and the geometry slice Geo slice3, the color attribute slice Attr color slice3, and the frame index attribute slice Attr frameIdx slice3, which correspond to subframe sub-frame1.
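The implicit derivation described for Figures 16 and 17 amounts to joining two pieces of information per data unit: the spatial block that contains it and the subframes it belongs to. The sketch below assumes that both have already been resolved for each data unit; the tuple representation and the name `tiles_per_subframe` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def tiles_per_subframe(
    data_units: Iterable[Tuple[int, List[int]]],
) -> Dict[int, List[int]]:
    """Map each subframe index to the spatial blocks (tiles) it touches.

    Each data unit is given as (tile_id, [subframe_index, ...]).
    """
    mapping = defaultdict(set)
    for tile_id, subframe_indices in data_units:
        for subframe_index in subframe_indices:
            mapping[subframe_index].add(tile_id)
    return {sf: sorted(tiles) for sf, tiles in sorted(mapping.items())}


# Layout of Figure 17, one entry per geometry slice.
FIGURE_17_UNITS = [
    (0, [0]),     # slice0 in tile0 belongs to sub-frame0
    (0, [0, 1]),  # slice1 in tile0 is shared by sub-frame0 and sub-frame1
    (1, [0, 1]),  # slice2 in tile1 is shared by sub-frame0 and sub-frame1
    (1, [1]),     # slice3 in tile1 belongs to sub-frame1
]
```

Applied to `FIGURE_17_UNITS`, both subframes map to tile0 and tile1, reflecting the overlap. Applied to the Figure 16 layout, where no slice is shared, the first subframe maps to tile0 and tile1 and the second to tile2 only.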
In one embodiment of the present application, the correspondence between point cloud subframes and spatial blocks can also be indicated explicitly in the media file data box. In this embodiment, the media file data box of the point cloud sample includes:
a spatial block flag field, which indicates whether the point cloud subframes in the current sample correspond to one or more different spatial blocks;
a subframe index field, which indicates the index information of the current point cloud subframe;
a spatial block number field, which indicates the number of spatial blocks corresponding to the current point cloud subframe; and
a spatial block identification field, which indicates the identifier of the current spatial block.
In this embodiment, the spatial block flag field indicates whether the point cloud subframes correspond to one or more spatial blocks, the subframe index field indicates the index information of the point cloud subframe, the spatial block number field indicates the number of spatial blocks, and the spatial block identification field indicates the identifier of the current spatial block. Based on the information in these fields of the media file data box of the point cloud sample, the one or more point cloud subframes corresponding to each data unit in the point cloud sample can be decoded jointly as a combined frame, which reduces the computing resources wasted by separately decoding point cloud frames with little content.
As a specific implementation of the above embodiment, Figure 18 shows the syntax structure for indicating the correspondence between point cloud subframes and spatial blocks through the media file data box in one application scenario of the embodiment of the present application.
As shown in Figure 18, the media file data box in this embodiment of the present application may include the following fields:
The spatial block flag field with_tile_info_flag: a value of 1 indicates that the subframes in the current sample each correspond to one or more different point cloud spatial blocks; a value of 0 indicates that the subframes in the current sample cannot be distinguished by point cloud spatial block.
The subframe index field subframe_index indicates the sequence number of the point cloud subframe.
The spatial block number field num_tiles indicates the number of point cloud spatial blocks corresponding to the point cloud subframe.
The spatial block identification field tile_id indicates the identifier of the corresponding point cloud spatial block.
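One way a client might use the explicit mapping is to pick only the subframes that cover the spatial blocks it needs. The following sketch assumes the Figure 18 fields have been parsed into a list of `(subframe_index, [tile_id, ...])` pairs, where the length of each tile list stands for num_tiles. The function name and the fallback to all subframes when with_tile_info_flag is 0 are choices of this sketch, not behavior stated in the source.

```python
from typing import Iterable, List, Tuple


def subframes_for_tiles(
    with_tile_info_flag: int,
    entries: List[Tuple[int, List[int]]],
    wanted_tiles: Iterable[int],
) -> List[int]:
    """Return the subframe indices needed to cover the wanted tiles."""
    if with_tile_info_flag == 0:
        # Subframes cannot be told apart by spatial block, so this sketch
        # conservatively selects every subframe in the sample.
        return sorted(subframe_index for subframe_index, _ in entries)

    wanted = set(wanted_tiles)
    return sorted(
        subframe_index
        for subframe_index, tile_ids in entries
        if wanted.intersection(tile_ids)
    )
```

For entries `[(0, [0, 1]), (1, [2])]`, asking for tile 2 selects only subframe 1, while asking for tiles 1 and 2 selects both subframes.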
Figure 19 shows a flow chart of the steps of a point cloud media encoding method in one embodiment of the present application. The method can be applied to electronic devices at various points of a point cloud media system, such as servers, clients, and intermediate nodes. The embodiment of the present application takes as an example a client device equipped with a point cloud encoding apparatus that performs the encoding method. As shown in Figure 19, the point cloud media encoding method includes the following steps S1910 to S1930.
In step S1910, point cloud source data is obtained. The point cloud source data includes a point cloud frame having one or more point cloud subframes.
In step S1920, the point cloud frame is encoded to obtain at least one data unit.
In step S1930, the at least one data unit is encapsulated to obtain a point cloud media file. The point cloud media file includes point cloud samples encapsulated in one or more tracks. The media file data box of each subsample in a point cloud sample includes a subframe index field, which indicates the index information of the one or more point cloud subframes corresponding to each data unit in the subsample. When one data unit in a subsample corresponds to the index information of at least two point cloud subframes, those point cloud subframes have overlapping point cloud data.
The point cloud source data includes point cloud video (images and/or videos) representing objects and/or environments located in various 3D spaces (for example, a 3D space representing a real environment or a 3D space representing a virtual environment).
In one embodiment of the present application, the data source may capture the point cloud source data using acquisition devices such as one or more cameras (for example, an infrared camera capable of acquiring depth information, or an RGB camera capable of extracting color information corresponding to the depth information), projectors (for example, an infrared pattern projector used to acquire depth information), and LiDAR. The shape of the geometry formed by the points in 3D space can be extracted from the depth information of the point cloud source data, and the attributes of each point can be extracted from the color information, thereby acquiring the point cloud source data.
Taking point cloud video data as an example, a point cloud video may include one or more point cloud frames, and one point cloud frame may represent one point cloud image. In one embodiment of the present application, point cloud video data may be captured using at least one of an inward-facing technique and an outward-facing technique.
The inward-facing technique captures images of a central object with one or more cameras (or camera sensors) arranged around that object. It can be used to generate point cloud content that gives the user a 360-degree image of a key object (for example, VR/AR content that provides a 360-degree image of a key object such as a character, player, object, or actor).
The outward-facing technique captures images of the environment of the central object, rather than of the central object itself, with one or more cameras (or camera sensors) arranged around the central object. It can be used to generate point cloud content that presents the surrounding environment from the user's point of view (for example, content representing the external environment that can be provided to the user of a self-driving vehicle).
When point cloud content is generated from the capture operations of one or more cameras, each camera has its own coordinate system, so the data source may calibrate the cameras before capture in order to set a global coordinate system. The data source may also generate point cloud content by compositing arbitrary images and/or videos with the images and/or videos captured by the techniques above. The data source may post-process the captured images and/or videos, for example by removing unwanted regions (such as the background), identifying the space to which the captured images and/or videos are connected, and filling any spatial holes that exist.
The data source may generate a single piece of point cloud content by applying a coordinate transformation to the points of the point cloud video acquired from each camera, based on the coordinates of each camera position. In this way, the data source can generate point cloud content that covers a wide spatial range, or point cloud content with a high density of points.
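The per-camera coordinate transformation can be sketched as a rigid transform into the global coordinate system. The source only states that the transformation is based on each camera's position, so representing the calibration as a 3x3 rotation matrix plus a translation vector is an assumption of this example, as are the names `to_global` and `merge_cameras`.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Matrix = Sequence[Sequence[float]]


def to_global(points: Sequence[Point], rotation: Matrix,
              translation: Sequence[float]) -> List[Point]:
    """Apply p_global = R * p_camera + t to every point of one camera."""
    result = []
    for x, y, z in points:
        result.append(tuple(
            rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z
            + translation[r]
            for r in range(3)
        ))
    return result


def merge_cameras(captures) -> List[Point]:
    """Merge captures given as (points, rotation, translation) triples
    into one point list expressed in the global coordinate system."""
    merged = []
    for points, rotation, translation in captures:
        merged.extend(to_global(points, rotation, translation))
    return merged
```

Merging the transformed points of several cameras is what lets the combined content cover a wider range or reach a higher point density than any single capture.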
In this embodiment, the media file data box of each subsample in the point cloud sample indicates the correspondence between each data unit in the subsample and the index information of one or more point cloud subframes. The one or more point cloud subframes corresponding to each data unit in the point cloud sample can therefore be encoded jointly as a combined frame. On the one hand, this reduces the computing resources wasted by separately encoding point cloud frames with little content; on the other hand, it identifies point cloud subframes whose point cloud data overlap, which improves the encoding efficiency of the point cloud media.
Figure 20 shows a flowchart of point cloud data encoding and decoding in a streaming media transmission scenario according to an embodiment of the present application. As shown in Figure 20, the server, acting as the data source that produces point cloud media files, encodes the point cloud data and sends it to the user's client. The client decodes the point cloud media file to obtain point cloud data for the user to consume. The encoding and decoding process may include the following steps.
Step S2010: The server determines the one or more subframes corresponding to each geometry slice according to the subframe index number associated with each geometry slice in the point cloud bitstream.
If every geometry slice corresponds to exactly one subframe, that is, the point cloud slices contained in the subframes do not overlap, then the point cloud subframes are encapsulated into sub-samples using the flags=2 sub-sample division, and the corresponding subframe index number is indicated.
If some geometry slice corresponds to multiple subframes, that is, the point cloud slices contained in the subframes overlap, then the point cloud subframes are encapsulated into sub-samples using the flags=0 sub-sample division, and the corresponding subframe index numbers are indicated through metadata.
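The decision in step S2010 can be sketched as follows. This is a minimal illustration, not the method's actual implementation: the function name `choose_subsample_flags` and the dictionary layout (slice identifier mapped to a list of subframe index numbers) are assumptions made for this example.

```python
from typing import Dict, List


def choose_subsample_flags(slice_to_subframes: Dict[int, List[int]]) -> int:
    """Pick the sub-sample division mode described in step S2010.

    slice_to_subframes is a hypothetical mapping from a geometry slice
    identifier to the subframe index numbers that the slice belongs to.
    Returns 2 when every slice maps to exactly one subframe (no overlap),
    and 0 when at least one slice is shared by several subframes.
    """
    overlapping = any(len(set(indices)) > 1
                      for indices in slice_to_subframes.values())
    return 0 if overlapping else 2
```

With `{0: [1], 1: [2]}` the function selects the flags=2 division; with `{0: [1], 1: [1, 2]}` it selects flags=0, because slice 1 is shared by subframes 1 and 2.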
Step S2020: The server encapsulates the point cloud bitstream into a point cloud file, in which the point cloud subframes are divided and indicated in the form of sub-samples.
The server may encapsulate the point cloud bitstream using single-track encapsulation or component-based multi-track encapsulation.
With single-track encapsulation, in the flags=2 sub-sample division, each sub-sample contains all the geometry and attribute information of the corresponding subframe. In the flags=0 sub-sample division, sub-samples of the geometry data type, the attribute data type, and the parameter data type can all be mapped to the corresponding subframes.
With multi-track encapsulation, in the flags=2 sub-sample division, only the geometry track is divided into sub-samples, and each sub-sample contains all the geometry information of the corresponding subframe. In the flags=0 sub-sample division, only sub-samples of the geometry data type and the parameter data type can be mapped to the corresponding subframes.
Step S2030: For the point cloud subframes present in the file, the presentation time information of the point cloud subframes contained in the corresponding samples is indicated.
Step S2040: For the spatial information of the point cloud subframes, the correspondence between point cloud subframes and tiles is indicated.
Step S2050: The server transmits the point cloud file to the client.
Step S2060: When decapsulating and decoding the point cloud file, the client extracts each point cloud subframe according to the subframe-related information.
Step S2070: After reordering the point cloud sequence, the client presents it according to the presentation time information of the point cloud subframes.
For G-PCC point cloud media, the embodiments of the present application propose a file encapsulation method for point cloud subframes. At the file encapsulation level, the method defines how point cloud subframes are encapsulated in samples under different scenarios, indicates the identifier and duration of each point cloud subframe, and indicates the correspondence between point cloud subframes and point cloud spatial tiles. The present application supports the encapsulation of point cloud subframes in files more flexibly, thereby supporting more application scenarios and making the most of the coding efficiency gains that point cloud subframes provide.
One implementation of the embodiments of the present application for the scenario in which point cloud subframes do not overlap is as follows.
(1) The server determines the subframe corresponding to each geometry slice according to the subframe index number associated with each geometry slice in the point cloud bitstream.
Assuming that the point cloud slices contained in the subframes do not overlap, the point cloud subframes are encapsulated into sub-samples using the flags=2 sub-sample division, and the corresponding subframe index number is indicated.
Server S1 encapsulates the point cloud bitstream into point cloud file F1 in a single-track manner; the encapsulation result is shown in Figure 21. In samples that contain point cloud subframes, the subframes are divided and indicated in the form of sub-samples. The flags field in the SubSampleInformationBox takes the value 2, and sub-sample0 and sub-sample1 indicate sub-frame indexes 1 and 2, respectively. Point cloud samples Sample 0, Sample 1 and Sample N each correspond to their own point cloud frame, whereas point cloud sample Sample 2 corresponds to point cloud subframes sub-frame1 and sub-frame2.
Server S2 encapsulates the point cloud bitstream into point cloud file F2 in a component-based multi-track manner, in which case the geometry track is divided into sub-samples. For the attribute track, optionally, the corresponding data in the attribute track can be located through the index relationship between the geometry track and the attribute track. The encapsulation result is shown in Figure 22. In both the geometry track and the attribute track, point cloud samples Sample 0, Sample 1 and Sample N each correspond to their own point cloud frame. In the geometry track, point cloud sample Sample 2 corresponds to point cloud subframes sub-frame1 and sub-frame2, while in the attribute track Sample 2 still corresponds to a whole point cloud frame.
Alternatively, the attribute track can also be divided into sub-samples in the same way; the encapsulation result is shown in Figure 23. In both the geometry track and the attribute track, point cloud samples Sample 0, Sample 1 and Sample N each correspond to their own point cloud frame, and Sample 2 in each track corresponds to point cloud subframes sub-frame1 and sub-frame2. In samples that contain point cloud subframes, the subframes are divided and indicated in the form of sub-samples. The flags field in the SubSampleInformationBox takes the value 2, and sub-sample0 and sub-sample1 indicate sub-frame indexes 1 and 2, respectively.
(2) For the point cloud subframes present in the file, the presentation time information of the point cloud subframes contained in the corresponding samples is indicated.
SubFrameConfigurationGroupEntry:
{nb_subframes=2;
with_unique_duration_flag=1;
{subframe_index=1;sub_sample_duration=10}
{subframe_index=2;sub_sample_duration=20}
}
From the properties of the sample group itself, sample2 can be identified as a sample that contains sub-frames. The fields provided in the embodiments of the present application then give the presentation duration of each sub-frame: 10 and 20 timescale units, respectively.
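As an illustration of how a client might use these durations, the sketch below derives a start time for each sub-frame. It assumes, without confirmation from the text, that the sub-frames of a sample are presented back to back starting at the sample's own presentation time; the function name and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple


def subframe_start_times(sample_start: int,
                         entries: List[Tuple[int, int]]) -> Dict[int, int]:
    """Map each subframe_index to a start time in timescale units.

    entries holds (subframe_index, sub_sample_duration) pairs in the order
    they appear in the group entry. Assumption: sub-frames play
    consecutively, beginning at sample_start.
    """
    starts: Dict[int, int] = {}
    current = sample_start
    for subframe_index, duration in entries:
        starts[subframe_index] = current
        current += duration
    return starts
```

For the entry above and a sample starting at time 100, sub-frame 1 would start at 100 and sub-frame 2 at 110 under this assumption.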
(3) For the spatial information of the point cloud subframes, the correspondence between point cloud subframes and tiles is indicated.
SubFrameConfigurationGroupEntry:
{nb_subframes=2;
with_tile_info_flag=1;
{subframe_index=1;num_tiles=2;{tile_id=0,1}}
{subframe_index=2;num_tiles=1;{tile_id=2}}
}
From the properties of the sample group itself, sample2 can be identified as a sample that contains sub-frames. The fields provided in the embodiments of the present application then give the tile id information corresponding to each sub-frame.
Finally, combining this with the indicated association between tile ids and spatial information yields the spatial information corresponding to each subframe.
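The two-step lookup (subframe to tile ids, then tile id to spatial information) can be sketched as below. The bounding-box tuples and the names `subframe_regions` and `tile_regions` are illustrative assumptions; the source does not say how the spatial information of a tile is represented.

```python
from typing import Dict, List, Tuple

# Hypothetical spatial description of a tile: (x, y, z, width, height, depth).
Box = Tuple[int, int, int, int, int, int]


def subframe_regions(subframe_tiles: Dict[int, List[int]],
                     tile_regions: Dict[int, Box]) -> Dict[int, List[Box]]:
    """Resolve each subframe to the spatial regions of its tiles."""
    return {
        subframe_index: [tile_regions[tile_id] for tile_id in tile_ids]
        for subframe_index, tile_ids in subframe_tiles.items()
    }


# Values taken from the group entry above; the boxes are made up.
example_tiles = {1: [0, 1], 2: [2]}
example_boxes = {0: (0, 0, 0, 10, 10, 10),
                 1: (10, 0, 0, 10, 10, 10),
                 2: (20, 0, 0, 10, 10, 10)}
example_regions = subframe_regions(example_tiles, example_boxes)
```

Here sub-frame 1 resolves to the boxes of tiles 0 and 1, and sub-frame 2 to the box of tile 2.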
(4) The server transmits the point cloud file to the client.
(5) When decapsulating and decoding the point cloud file, the client extracts each point cloud subframe according to the subframe-related information, reorders the point cloud sequence, and then presents it according to the presentation time information and spatial information of the point cloud subframes.
In the client implementation, the point cloud sequence may optionally be reordered during the decapsulation stage and then decoded. Alternatively, the file may first be decapsulated and decoded, and the reordering performed afterwards according to the subframe information.
One implementation of the embodiments of the present application for the scenario in which point cloud subframes overlap is as follows.
(1) The server determines the subframe corresponding to each geometry slice according to the subframe index number associated with each geometry slice in the point cloud bitstream.
Assuming that the point cloud slices contained in the subframes overlap, the point cloud subframes are encapsulated into sub-samples using the flags=0 sub-sample division.
Server S1 encapsulates the point cloud bitstream into point cloud file F1 in a single-track manner. In samples that contain point cloud subframes, the flags field in the SubSampleInformationBox takes the value 0, and each sub-sample is one G-PCC data unit. The encapsulation result is shown in Figure 24. Point cloud samples Sample 0, Sample 1 and Sample N each correspond to their own point cloud frame, whereas point cloud sample Sample 2 is made up of G-PCC data units (labelled gpcc unit).
Combined with the information in the SubsampleSubframeInfoGroupEntry, the subframe to which each G-PCC data unit in each sub-sample of sample2 belongs can be indicated.
{subsample_count=4
{related_subframe_num=1;subframe_index=1}
{related_subframe_num=1;subframe_index=1}
{related_subframe_num=2;subframe_index=1,2}
{related_subframe_num=1;subframe_index=2}
}
Following the order of the sub-samples in sample2, each sub-sample can be associated with its corresponding subframe information.
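The order-based association can be sketched as follows. The list-of-lists representation of the group entry and the helper names are assumptions made for illustration; only the index values come from the example above.

```python
from typing import Dict, List


def units_per_subframe(entries: List[List[int]]) -> Dict[int, List[int]]:
    """Group sub-sample positions by the subframes they belong to.

    entries[i] lists the subframe_index values of the i-th sub-sample
    (one G-PCC data unit each when flags=0), in sample order.
    """
    grouped: Dict[int, List[int]] = {}
    for position, subframe_indices in enumerate(entries):
        for subframe_index in subframe_indices:
            grouped.setdefault(subframe_index, []).append(position)
    return grouped


def shared_units(entries: List[List[int]]) -> List[int]:
    """Positions of sub-samples that belong to more than one subframe."""
    return [pos for pos, idx in enumerate(entries) if len(idx) > 1]


# The four sub-samples from the example entry above.
example_entries = [[1], [1], [1, 2], [2]]
```

For `example_entries`, subframe 1 is built from sub-samples 0, 1 and 2, subframe 2 from sub-samples 2 and 3, and sub-sample 2 is the overlapping data unit.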
For multi-track encapsulation, the sub-sample division is handled in the same way as in the previous embodiment: sub-samples may be divided in the geometry track only, or in both the geometry and attribute tracks.
(2) For the point cloud subframes present in the file, the presentation time information of the point cloud subframes contained in the corresponding samples is indicated.
SubFrameConfigurationGroupEntry:
{nb_subframes=2;
with_unique_duration_flag=0;
}
From the properties of the sample group itself, sample2 can be identified as a sample that contains sub-frames. The fields provided in the embodiments of the present application then show that all sub-frames have the same presentation duration. Assuming the duration of sample2 is 20 timescale units, each sub-frame lasts 10 timescale units.
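The equal-duration case reduces to a single division, sketched below. The function name is hypothetical, and the sketch assumes the sample duration divides evenly among the sub-frames; the source does not say how a remainder would be handled.

```python
def equal_subframe_duration(sample_duration: int, nb_subframes: int) -> int:
    """Duration of each sub-frame when with_unique_duration_flag is 0.

    Assumption: the sample duration is an exact multiple of nb_subframes.
    """
    if nb_subframes <= 0:
        raise ValueError("nb_subframes must be positive")
    if sample_duration % nb_subframes != 0:
        raise ValueError("sample duration is not evenly divisible")
    return sample_duration // nb_subframes
```

With a sample duration of 20 and nb_subframes=2, this gives 10 timescale units per sub-frame, matching the example.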
(3) For the spatial information of the point cloud subframes, the correspondence between point cloud subframes and tiles is indicated.
SubFrameConfigurationGroupEntry:
{nb_subframes=2;
with_tile_info_flag=1;
{subframe_index=1;num_tiles=2;{tile_id=0,1}}
{subframe_index=2;num_tiles=2;{tile_id=1,2}}
}
From the properties of the sample group itself, sample2 can be identified as a sample that contains sub-frames. The fields provided in the embodiments of the present application then give the tile id information corresponding to each sub-frame.
Finally, combining this with the indicated association between tile ids and spatial information yields the spatial information corresponding to each subframe.
(4) The server transmits the point cloud file to the client.
(5) When decapsulating and decoding the point cloud file, the client extracts each point cloud subframe according to the subframe-related information, reorders the point cloud sequence, and then presents it according to the presentation time information and spatial information of the point cloud subframes.
In the client implementation, the point cloud sequence may optionally be reordered during the decapsulation stage and then decoded. Alternatively, the file may first be decapsulated and decoded, and the reordering performed afterwards according to the subframe information.
It should be noted that although the steps of the methods in the present application are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that order, or that all of the illustrated steps must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
The following describes apparatus embodiments of the present application, which can be used to perform the point cloud media encoding and decoding methods of the above embodiments. Figure 25 schematically shows a structural block diagram of a point cloud media decoding apparatus provided by an embodiment of the present application. As shown in Figure 25, the point cloud media decoding apparatus 2500 includes:
an acquisition module 2510, configured to acquire a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks;
a parsing module 2520, configured to parse the media file data box of each sub-sample in the point cloud sample to obtain the value of a sub-sample flag field;
an index module 2530, configured to obtain, according to the value of the sub-sample flag field, index information of one or more point cloud subframes corresponding to each data unit in the sub-sample, wherein when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
a decoding module 2540, configured to decapsulate and decode the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
Figure 26 schematically shows a structural block diagram of a point cloud media encoding apparatus provided by an embodiment of the present application. As shown in Figure 26, the point cloud media encoding apparatus 2600 includes:
an acquisition module 2610, configured to acquire point cloud source data, the point cloud source data including a point cloud frame having one or more point cloud subframes;
an encoding module 2620, configured to encode the point cloud frame to obtain at least one data unit; and
an encapsulation module 2630, configured to encapsulate the at least one data unit to obtain a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks, wherein the media file data box of each sub-sample in the point cloud sample includes a subframe index field; the subframe index field indicates index information of one or more point cloud subframes corresponding to each data unit in the sub-sample; and when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
The specific details of the point cloud media encoding apparatus and decoding apparatus provided in the embodiments of the present application have been described in detail in the corresponding method embodiments and are not repeated here.
Figure 27 schematically shows a structural block diagram of a computer system of an electronic device used to implement the embodiments of the present application.
It should be noted that the computer system 2700 of the electronic device shown in Figure 27 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Figure 27, the computer system 2700 includes a central processing unit 2701 (Central Processing Unit, CPU), which can perform various appropriate actions and processing according to computer-readable instructions stored in a read-only memory 2702 (Read-Only Memory, ROM) or computer-readable instructions loaded from a storage portion 2708 into a random access memory 2703 (Random Access Memory, RAM). The random access memory 2703 also stores the computer-readable instructions and data required for system operation. The central processing unit 2701, the read-only memory 2702, and the random access memory 2703 are connected to one another through a bus 2704. An input/output interface 2705 (I/O interface) is also connected to the bus 2704.
The following components are connected to the input/output interface 2705: an input portion 2706 including a keyboard, a mouse, and the like; an output portion 2707 including a display such as a cathode ray tube (Cathode Ray Tube, CRT) or liquid crystal display (Liquid Crystal Display, LCD), as well as a speaker and the like; a storage portion 2708 including a hard disk and the like; and a communication portion 2709 including a network interface card such as a LAN card or a modem. The communication portion 2709 performs communication processing via a network such as the Internet. A drive 2710 is also connected to the input/output interface 2705 as needed. A removable medium 2711, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 2710 as needed, so that computer-readable instructions read from it can be installed into the storage portion 2708 as needed.
In particular, according to the embodiments of the present application, the processes described in the method flowcharts may be implemented as computer-readable instructions. For example, the embodiments of the present application include a computer program product that includes computer-readable instructions carried on a computer-readable medium, the computer-readable instructions containing code for performing the methods shown in the flowcharts. In such embodiments, the computer-readable instructions may be downloaded and installed from a network via the communication portion 2709, and/or installed from the removable medium 2711. When the computer-readable instructions are executed by the central processing unit 2701, the various functions defined in the system of the present application are performed.
It should be noted that the computer-readable medium described in the embodiments of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), flash memory, an optical fiber, a portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores computer-readable instructions that can be used by or in connection with an instruction execution system, apparatus, or device.
In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit computer-readable instructions for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless or wired media, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to the embodiments of the present application, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solutions according to the embodiments of the present application may be embodied in the form of a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes a number of instructions that cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to perform the methods according to the embodiments of the present application.
Other embodiments of the present application will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed in the present application.
It should be understood that the present application is not limited to the precise structures described above and illustrated in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (18)

  1. A method for decoding point cloud media, performed by an electronic device, characterized by comprising:
    acquiring a point cloud media file, the point cloud media file including point cloud samples encapsulated in one or more tracks;
    parsing the media file data box of each sub-sample in the point cloud sample to obtain the value of a sub-sample flag field;
    obtaining, according to the value of the sub-sample flag field, index information of one or more point cloud subframes corresponding to each data unit in the sub-sample, wherein when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
    decapsulating and decoding the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
  2. The method for decoding point cloud media according to claim 1, characterized in that the sub-sample flag field is further used to indicate the manner in which the sub-samples are divided, and the manner of dividing the sub-samples in the point cloud sample comprises:
    when the value of the sub-sample flag field is a first value, dividing the sub-samples based on data units, so that one sub-sample contains one data unit;
    when the value of the sub-sample flag field is a second value, dividing the sub-samples based on spatial blocks, so that one sub-sample contains one or more consecutive data units corresponding to one first division object, the first division object comprising at least one of a spatial block, a parameter set, spatial block set information, or a frame boundary marker; and
    when the value of the sub-sample flag field is a third value, dividing the sub-samples based on point cloud subframes, so that one sub-sample contains one or more consecutive data units corresponding to one second division object, the second division object comprising one complete point cloud subframe.
  3. The method for decoding point cloud media according to claim 2, characterized in that when the value of the sub-sample flag field is the third value, the media file data box of the sub-sample comprises:
    a subframe index field, used to indicate the index information of the point cloud subframe contained in the current sub-sample.
  4. The method for decoding point cloud media according to claim 1, characterized in that the sub-sample flag field is further used to indicate the manner in which the sub-samples are divided, and the manner of dividing the sub-samples in the point cloud sample comprises:
    when the value of the sub-sample flag field is a first value, dividing the sub-samples based on data units, so that one sub-sample contains one data unit;
    when the value of the sub-sample flag field is a second value, dividing the sub-samples based on spatial blocks, so that one sub-sample contains one or more consecutive data units corresponding to one first division object, the first division object comprising at least one of a spatial block, a parameter set, spatial block set information, or a frame boundary marker; and
    when the value of the sub-sample flag field is a third value, dividing the sub-samples based on point cloud subframes, so that one sub-sample contains one or more consecutive data units corresponding to one second division object, the second division object comprising one or more point cloud subframes.
  5. The method for decoding point cloud media according to claim 4, characterized in that when the value of the sub-sample flag field is the third value, the media file data box of the sub-sample comprises:
    a subframe complete flag field, used to indicate whether the current sub-sample contains all the data constituting a point cloud subframe;
    a subframe number field, used to indicate the number of point cloud subframes corresponding to the current sub-sample; and
    a subframe index field, used to indicate the index information of the point cloud subframes corresponding to the current sub-sample.
  6. The method for decoding point cloud media according to claim 5, characterized in that when the point cloud samples are encapsulated in one track, all the data constituting the point cloud subframe comprises all geometry data and all attribute data; and when the point cloud samples are encapsulated in multiple tracks, all the data constituting the point cloud subframe comprises all geometry data or all attribute data.
  7. The method for decoding point cloud media according to claim 1, characterized in that when the value of the sub-sample flag field is a first value, the media file data box of the point cloud subframe comprises:
    a related subframe number field, used to indicate the number of point cloud subframes corresponding to the current sub-sample; and
    a subframe index field, used to indicate the index information of the point cloud subframes corresponding to the current sub-sample.
  8. The method for decoding point cloud media according to claim 7, characterized in that when the value of the sub-sample flag field is the first value, the media file data box of the point cloud subframe further comprises:
    a sub-sample number field, used to indicate the number of sub-samples contained in the current sample.
  9. The method for decoding point cloud media according to claim 8, characterized in that when the value of the sub-sample flag field is the first value, the media file data box of the point cloud subframe further comprises:
    a subframe-related sample number field, used to indicate the number of point cloud samples that contain multiple point cloud subframes; and
    a sample number difference field, used to indicate, in decoding order, the difference in sample number between the current point cloud sample containing multiple point cloud subframes and the previous point cloud sample containing multiple point cloud subframes.
  10. The method for decoding point cloud media according to any one of claims 1 to 9, characterized in that when the sub-samples in the point cloud sample are divided based on point cloud subframes, the media file data box of the sub-sample comprises:
    a presentation time flag field, used to indicate whether the point cloud subframes contained in the point cloud sample have the same presentation duration; and
    a sub-sample duration field, used to indicate the presentation duration of the current sub-sample when the point cloud subframes contained in the point cloud sample have different presentation durations.
  11. The method for decoding point cloud media according to any one of claims 1 to 9, characterized in that the media file data box of the point cloud sample comprises:
    a subframe number field, used to indicate the number of point cloud subframes contained in the current point cloud sample;
    a presentation time flag field, used to indicate whether the point cloud subframes contained in the point cloud sample have the same presentation duration;
    a subframe index field, used to indicate the index information of the point cloud subframe corresponding to the current sub-sample when the point cloud subframes contained in the point cloud sample have different presentation durations; and
    a sub-sample duration field, used to indicate the presentation duration of the current sub-sample when the point cloud subframes contained in the point cloud sample have different presentation durations.
  12. The method for decoding point cloud media according to any one of claims 1 to 9, characterized in that the media file data box of the point cloud sample comprises:
    a spatial block flag field, used to indicate whether the point cloud subframes in the current sample correspond to one or more different spatial blocks;
    a subframe index field, used to indicate the index information of the current point cloud subframe;
    a spatial block number field, used to indicate the number of spatial blocks corresponding to the current point cloud subframe; and
    a spatial block identifier field, used to indicate the identifier of the current spatial block.
  13. A method for encoding point cloud media, executed by an electronic device, characterized by comprising:
    obtaining point cloud source data, the point cloud source data comprising a point cloud frame having one or more point cloud subframes;
    encoding the point cloud frame to obtain at least one data unit; and
    encapsulating the at least one data unit to obtain a point cloud media file, the point cloud media file comprising point cloud samples encapsulated in one or more tracks, wherein a media file data box of each sub-sample in the point cloud samples comprises a subframe index field; the subframe index field is used to indicate index information of one or more point cloud subframes corresponding to each data unit in the sub-sample; and when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
  14. An apparatus for decoding point cloud media, characterized by comprising:
    an acquisition module, configured to obtain a point cloud media file, the point cloud media file comprising point cloud samples encapsulated in one or more tracks;
    a parsing module, configured to parse a media file data box of each sub-sample in the point cloud samples to obtain a value of a sub-sample flag field;
    an index module, configured to obtain, according to the value of the sub-sample flag field, index information of one or more point cloud subframes corresponding to each data unit in the sub-sample, wherein when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data; and
    a decoding module, configured to decapsulate and decode the point cloud media file according to the index information of the one or more point cloud subframes to obtain point cloud data.
  15. An apparatus for encoding point cloud media, characterized by comprising:
    an acquisition module, configured to obtain point cloud source data, the point cloud source data comprising a point cloud frame having one or more point cloud subframes;
    an encoding module, configured to encode the point cloud frame to obtain at least one data unit; and
    an encapsulation module, configured to encapsulate the at least one data unit to obtain a point cloud media file, the point cloud media file comprising point cloud samples encapsulated in one or more tracks, wherein a media file data box of each sub-sample in the point cloud samples comprises a subframe index field; the subframe index field is used to indicate index information of one or more point cloud subframes corresponding to each data unit in the sub-sample; and when one data unit in the sub-sample corresponds to the index information of at least two point cloud subframes, the at least two point cloud subframes have overlapping point cloud data.
  16. A computer-readable medium, characterized in that computer-readable instructions are stored on the computer-readable medium, and the computer-readable instructions, when executed by a processor, implement the method according to any one of claims 1 to 13.
  17. An electronic device, characterized by comprising:
    a processor; and
    a memory, configured to store computer-readable instructions for the processor;
    wherein the processor is configured to cause the electronic device, by executing the computer-readable instructions, to perform the method according to any one of claims 1 to 13.
  18. A computer program product, comprising computer-readable instructions, characterized in that the computer-readable instructions, when executed by a processor, implement the method according to any one of claims 1 to 13.
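
The sub-sample signalling recited in claims 1 to 5 can be illustrated with a minimal Python sketch. It models only the lookup step: given the sub-sample descriptions of one point cloud sample, it selects the sub-samples needed for a target subframe and identifies data shared by overlapping subframes. All names here (`SubsampleMode`, `SubsampleInfo`, `subsamples_for_subframe`, `shared_subsamples`) are hypothetical, and the numeric values 0, 1 and 2 standing in for the "first", "second" and "third" values of the sub-sample flag field are assumptions; the claims do not fix them, and no box parsing or decoding is shown.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class SubsampleMode(IntEnum):
    # Assumed numeric values; the claims only name a first, second and third value.
    DATA_UNIT = 0  # one sub-sample contains exactly one data unit
    TILE = 1       # one sub-sample contains the data units of one spatial-block-level object
    SUBFRAME = 2   # one sub-sample contains the data units of point cloud subframe(s)


@dataclass
class SubsampleInfo:
    """Hypothetical in-memory view of one sub-sample's data box."""
    mode: SubsampleMode
    data_unit_ids: List[int]
    # Indices of the point cloud subframes this sub-sample's data belongs to.
    subframe_indices: List[int] = field(default_factory=list)


def subsamples_for_subframe(subsamples: List[SubsampleInfo], target: int) -> List[SubsampleInfo]:
    """Return the sub-samples whose data must be decoded to rebuild one subframe."""
    return [s for s in subsamples if target in s.subframe_indices]


def shared_subsamples(subsamples: List[SubsampleInfo]) -> List[SubsampleInfo]:
    """Return sub-samples mapped to at least two subframes, i.e. overlapping data."""
    return [s for s in subsamples if len(set(s.subframe_indices)) >= 2]


# One sample split per data unit; data unit 1 is shared by subframes 0 and 1.
sample = [
    SubsampleInfo(SubsampleMode.DATA_UNIT, [0], [0]),
    SubsampleInfo(SubsampleMode.DATA_UNIT, [1], [0, 1]),
    SubsampleInfo(SubsampleMode.DATA_UNIT, [2], [1]),
]

needed = subsamples_for_subframe(sample, 1)
print([s.data_unit_ids for s in needed])                     # [[1], [2]]
print([s.data_unit_ids for s in shared_subsamples(sample)])  # [[1]]
```

In this sketch a decoder that wants only subframe 1 would fetch data units 1 and 2 and skip data unit 0, which is the partial-access behaviour that the subframe index information is meant to enable.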
PCT/CN2022/137764 2022-04-22 2022-12-09 Point cloud media encoding method and apparatus, point cloud media decoding method and apparatus, and electronic device and storage medium WO2023202095A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210428152.6A CN114697668B (en) 2022-04-22 2022-04-22 Encoding and decoding method of point cloud media and related products
CN202210428152.6 2022-04-22

Publications (1)

Publication Number Publication Date
WO2023202095A1 (en) 2023-10-26

Family

ID=82145147

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/137764 WO2023202095A1 (en) 2022-04-22 2022-12-09 Point cloud media encoding method and apparatus, point cloud media decoding method and apparatus, and electronic device and storage medium

Country Status (2)

Country Link
CN (2) CN116744007A (en)
WO (1) WO2023202095A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2613853B (en) * 2021-12-16 2024-01-24 Canon Kk Method, device, and computer program for optimizing encapsulation of point cloud data
CN116744007A (en) * 2022-04-22 2023-09-12 腾讯科技(深圳)有限公司 Encoding and decoding method of point cloud media and related products
CN115396647B (en) * 2022-08-22 2024-04-26 腾讯科技(深圳)有限公司 Data processing method, device and equipment for immersion medium and storage medium
CN115834857B (en) * 2022-11-24 2024-03-19 腾讯科技(深圳)有限公司 Point cloud data processing method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210105492A1 (en) * 2019-10-02 2021-04-08 Nokia Technologies Oy Method and apparatus for storage and signaling of sub-sample entry descriptions
CN112740282A (en) * 2018-09-18 2021-04-30 Vid拓展公司 Method and apparatus for point cloud compressed bitstream format
US20210168400A1 (en) * 2018-09-21 2021-06-03 Panasonic Intellectual Property Corporation Of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN112997498A (en) * 2018-11-13 2021-06-18 松下电器(美国)知识产权公司 Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US20210329052A1 (en) * 2020-04-13 2021-10-21 Lg Electronics Inc. Point cloud data transmission apparatus, point cloud data transmission method, point cloud data reception apparatus and point cloud data reception method
CN114697668A (en) * 2022-04-22 2022-07-01 腾讯科技(深圳)有限公司 Encoding and decoding method of point cloud media and related product

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102217313B (en) * 2011-05-26 2013-10-02 华为技术有限公司 Method, device and system for resetting and extracting media data in fragments
US11012713B2 (en) * 2018-07-12 2021-05-18 Apple Inc. Bit stream structure for compressed point cloud data
GB2580602A (en) * 2019-01-14 2020-07-29 Vividq Ltd Holographic display system and method
CN114079781B (en) * 2020-08-18 2023-08-22 腾讯科技(深圳)有限公司 Data processing method, device and equipment of point cloud media and storage medium
CN114241119A (en) * 2020-09-07 2022-03-25 深圳荆虹科技有限公司 Game model generation method, device and system and computer storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112740282A (en) * 2018-09-18 2021-04-30 Vid拓展公司 Method and apparatus for point cloud compressed bitstream format
US20210168400A1 (en) * 2018-09-21 2021-06-03 Panasonic Intellectual Property Corporation Of America Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
CN112997498A (en) * 2018-11-13 2021-06-18 松下电器(美国)知识产权公司 Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device
US20210105492A1 (en) * 2019-10-02 2021-04-08 Nokia Technologies Oy Method and apparatus for storage and signaling of sub-sample entry descriptions
US20210329052A1 (en) * 2020-04-13 2021-10-21 Lg Electronics Inc. Point cloud data transmission apparatus, point cloud data transmission method, point cloud data reception apparatus and point cloud data reception method
CN114697668A (en) * 2022-04-22 2022-07-01 腾讯科技(深圳)有限公司 Encoding and decoding method of point cloud media and related product

Also Published As

Publication number Publication date
CN114697668A (en) 2022-07-01
CN116744007A (en) 2023-09-12
CN114697668B (en) 2023-06-30

Similar Documents

Publication Publication Date Title
WO2023202095A1 (en) Point cloud media encoding method and apparatus, point cloud media decoding method and apparatus, and electronic device and storage medium
US11070893B2 (en) Method and apparatus for encoding media data comprising generated content
US11805304B2 (en) Method, device, and computer program for generating timed media data
US11606576B2 (en) Method and apparatus for generating media file comprising 3-dimensional video content, and method and apparatus for replaying 3-dimensional video content
JP2020503792A (en) Information processing method and apparatus
CN113891117B (en) Immersion medium data processing method, device, equipment and readable storage medium
CN115379189A (en) Data processing method of point cloud media and related equipment
WO2021209044A1 (en) Multimedia data transmission and reception methods, system, processor, and player
WO2024041238A1 (en) Point cloud media data processing method and related device
JP7402976B2 (en) File format for point cloud data
WO2023226504A1 (en) Media data processing methods and apparatuses, device, and readable storage medium
CN115396647B (en) Data processing method, device and equipment for immersion medium and storage medium
KR20150045349A (en) Method and apparatus for constructing sensory effect media data file, method and apparatus for playing sensory effect media data file and structure of the sensory effect media data file
WO2023169003A1 (en) Point cloud media decoding method and apparatus and point cloud media coding method and apparatus
WO2024114519A1 (en) Point cloud encapsulation method and apparatus, point cloud de-encapsulation method and apparatus, and medium and electronic device
US20220368876A1 (en) Multi-track based immersive media playout
WO2022134962A1 (en) Method and apparatus for presenting point cloud window, computer-readable medium, and electronic device
WO2023169001A1 (en) Data processing method and apparatus for immersive media, and device and storage medium
CN116347118A (en) Data processing method of immersion medium and related equipment
JP2023551010A (en) Multi-atlas encapsulation of immersive media
CN115150368A (en) Media file association processing method, device, medium and electronic equipment

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22938313

Country of ref document: EP

Kind code of ref document: A1