WO2022111343A1 - Processing method, apparatus, device and storage medium for non-sequential point cloud media - Google Patents

Processing method, apparatus, device and storage medium for non-sequential point cloud media

Info

Publication number
WO2022111343A1
Authority
WO
WIPO (PCT)
Prior art keywords
gpcc
point cloud
area
target
entry
Prior art date
Application number
PCT/CN2021/131037
Other languages
English (en)
French (fr)
Inventor
胡颖
Original Assignee
Tencent Technology (Shenzhen) Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Co., Ltd.
Priority to JP2023530295A priority Critical patent/JP7508710B2/ja
Priority to KR1020237021494A priority patent/KR20230110790A9/ko
Priority to EP21896844.4A priority patent/EP4254351A4/en
Publication of WO2022111343A1 publication Critical patent/WO2022111343A1/zh
Priority to US17/969,627 priority patent/US20230048474A1/en

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/23Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with coding of regions that are present throughout a whole video segment, e.g. sprites, background or mosaic
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23614Multiplexing of additional data and video streams
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/164Feedback from the receiver or from the transmission channel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167Position within a video image, e.g. region of interest [ROI]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g. 3D video
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84Generation or processing of descriptive data, e.g. content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/85406Content authoring involving a specific file format, e.g. MP4 format
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/04Indexing scheme for image data processing or generation, in general involving 3D image data

Definitions

  • the embodiments of the present application relate to the field of computer technologies, and in particular, to the processing of non-sequential point cloud media.
  • the point cloud data of objects can be obtained in many ways, and the video production device can transmit the point cloud data to the video playback device in the form of point cloud media, that is, point cloud media files, so that the video playback device can play the point cloud media.
  • point cloud data of the same object can be encapsulated into different point cloud media; for example, some point cloud media are the entire point cloud media of the object, while other point cloud media contain only part of the object's point cloud media.
  • the present application provides a processing method, apparatus, device and storage medium for non-sequential point cloud media, so that users can request, over multiple requests, the non-sequential point cloud media of the same static object, thereby improving processing efficiency and user experience.
  • the present application provides a method for processing non-sequential point cloud media.
  • the method is executed by a video production device, and the method includes: acquiring non-sequential point cloud data of a static object; processing the non-sequential point cloud data through GPCC encoding to obtain a GPCC bit stream; encapsulating the GPCC bit stream to generate at least one entry of a GPCC area, where the entry of the GPCC area is used to represent the GPCC component of the three-dimensional (3D) space area corresponding to the GPCC area; encapsulating the entries of the at least one GPCC area to generate at least one non-sequential point cloud media of the static object, where each non-sequential point cloud media includes the identifier of the static object; sending MPD signaling of the at least one non-sequential point cloud media to the video playback device; receiving a first request message sent by the video playback device according to the MPD signaling, where the first request message is used to request the first non-sequential point cloud media in the at least one non-sequential point cloud media; and sending the first non-sequential point cloud media to the video playback device according to the first request message.
  • the present application provides a method for processing non-sequential point cloud media.
  • the method is performed by a video playback device.
  • the method includes: receiving MPD signaling of at least one non-sequential point cloud media, where the non-sequential point cloud media includes the identifier of a static object; sending a first request message to the video production device according to the MPD signaling, where the first request message is used to request the first non-sequential point cloud media in the at least one non-sequential point cloud media; receiving the first non-sequential point cloud media from the video production device; and playing the first non-sequential point cloud media. The at least one non-sequential point cloud media is generated by encapsulating at least one entry of a point cloud compression (GPCC) area, the entry of the at least one GPCC area is generated by encapsulating a GPCC bit stream, and the GPCC bit stream is obtained by processing the non-sequential point cloud data of the static object through GPCC encoding.
  • the present application provides a processing apparatus for non-sequential point cloud media, including a processing unit and a communication unit. The processing unit is configured to: acquire non-sequential point cloud data of a static object; process the non-sequential point cloud data through GPCC encoding to obtain a GPCC bit stream; encapsulate the GPCC bit stream to generate at least one entry of a GPCC area, where the entry of the GPCC area is used to represent the GPCC component of the three-dimensional (3D) space area corresponding to the GPCC area; encapsulate the entries of the at least one GPCC area to generate at least one non-sequential point cloud media of the static object, where each non-sequential point cloud media includes the identifier of the static object; and send MPD signaling of the at least one non-sequential point cloud media to the video playback device.
  • the communication unit is configured to: receive a first request message sent by the video playback device according to the MPD signaling, where the first request message is used to request the first non-sequential point cloud media in the at least one non-sequential point cloud media.
  • the present application provides an apparatus for processing non-sequential point cloud media, comprising a processing unit and a communication unit. The communication unit is configured to: receive MPD signaling of at least one non-sequential point cloud media, where the non-sequential point cloud media includes the identifier of a static object; send a first request message to the video production device according to the MPD signaling, where the first request message is used to request the first non-sequential point cloud media in the at least one non-sequential point cloud media; and receive the first non-sequential point cloud media from the video production device. The processing unit is configured to play the first non-sequential point cloud media. The at least one non-sequential point cloud media is generated by encapsulating at least one entry of a point cloud compression (GPCC) area, the entry of the at least one GPCC area is generated by encapsulating a GPCC bit stream, and the GPCC bit stream is obtained by processing the non-sequential point cloud data of the static object through GPCC encoding.
  • a video production device, comprising a processor and a memory, where the memory is configured to store a computer program, and the processor is configured to invoke and run the computer program stored in the memory to perform the method of the above aspect.
  • a video playback device comprising: a processor and a memory, where the memory is used for storing a computer program, and the processor is used for calling and running the computer program stored in the memory to execute the method of the above aspect.
  • a computer-readable storage medium for storing a computer program, the computer program causing a computer to perform the method of the above aspect.
  • the embodiments of the present application provide a computer program product including instructions, which, when executed on a computer, cause the computer to perform the method of the above aspect.
  • the identifier of the static object can be carried in the non-sequential point cloud media, so that the user can purposefully request, over multiple requests, the non-sequential point cloud media of the same static object, thereby improving the user experience.
  • the 3D space area corresponding to the entry of the GPCC area can be divided into multiple subspace areas; combined with the independent encoding and decoding characteristics of GPCC tiles, users can decode and present non-sequential point cloud media more efficiently and with lower latency.
  • the video production device can flexibly combine the entries of multiple GPCC regions to form different non-sequential point cloud media, wherein the non-sequential point cloud media can constitute a complete GPCC frame or a partial GPCC frame.
  • the flexibility of video production can be improved.
  • FIG. 1 shows a schematic diagram of the architecture of a processing system for non-sequential point cloud media provided by an exemplary embodiment of the present application
  • FIG. 2A shows a schematic structural diagram of a processing architecture of a non-sequential point cloud media provided by an exemplary embodiment of the present application
  • FIG. 2B shows a schematic structural diagram of a sample provided by an exemplary embodiment of the present application
  • FIG. 2C shows a schematic structural diagram of a container including multiple file tracks provided by an exemplary embodiment of the present application
  • FIG. 2D shows a schematic structural diagram of a sample provided by another exemplary embodiment of the present application.
  • FIG. 3 shows an interactive flowchart of a method for processing non-sequential point cloud media provided by an embodiment of the present application
  • FIG. 4A shows a schematic diagram of encapsulation of a point cloud media provided by an embodiment of the present application
  • FIG. 4B shows a schematic diagram of encapsulation of another point cloud media provided by an embodiment of the present application
  • FIG. 5 shows a schematic diagram of an apparatus 500 for processing non-sequential point cloud media provided by an embodiment of the present application
  • FIG. 6 shows a schematic diagram of an apparatus 600 for processing non-sequential point cloud media provided by an embodiment of the present application
  • FIG. 7 shows a schematic block diagram of a video production device 700 provided by an embodiment of the present application.
  • FIG. 8 shows a schematic block diagram of a video playback device 800 provided by an embodiment of the present application.
  • Point cloud data is a specific recording form of point cloud.
  • the point cloud data of each point in the point cloud can include geometric information (ie, three-dimensional position information) and attribute information.
  • the geometric information of each point in the point cloud refers to the Cartesian three-dimensional coordinate data of the point, and the attribute information of each point in the point cloud may include, but is not limited to, at least one of the following: color information, material information, and laser reflection intensity information.
  • each point in the point cloud has the same amount of attribute information; for example, each point in the point cloud has both color information and laser reflection intensity, or each point in the point cloud has color information.
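The per-point structure described above (Cartesian geometry plus attribute information) can be sketched as a simple record; the field and attribute names below are illustrative assumptions, not taken from the application:

```python
from dataclasses import dataclass, field

@dataclass
class CloudPoint:
    """One point of a point cloud: geometric information (Cartesian
    coordinates) plus attribute information such as color or reflectance."""
    x: float
    y: float
    z: float
    attributes: dict = field(default_factory=dict)  # e.g. color, reflectance

p = CloudPoint(1.0, 2.0, 0.5, {"color": (255, 0, 0), "reflectance": 0.8})
print(sorted(p.attributes))  # ['color', 'reflectance']
```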
  • the ways to obtain point cloud data may include, but are not limited to, at least one of the following: (1) generation by computer equipment.
  • the computer device can generate point cloud data according to the virtual three-dimensional object and the virtual three-dimensional scene.
  • the visual scene of the real world is acquired through a 3D photography device (i.e., a set of cameras or a camera device with multiple lenses and sensors) to obtain point cloud data of the real-world visual scene, and point cloud data of dynamic real-world three-dimensional objects or three-dimensional scenes can be obtained through 3D photography.
  • a 3D photography device: i.e., a set of cameras or a camera device with multiple lenses and sensors
  • (4) Obtain point cloud data of biological tissues and organs through medical equipment.
  • medical equipment such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT), and electromagnetic positioning information.
  • point cloud media refers to point cloud media files formed by point cloud data.
  • the point cloud media includes multiple media frames, and each media frame in the point cloud media is composed of point cloud data.
  • Point cloud media can flexibly and conveniently express the spatial structure and surface properties of 3D objects or 3D scenes, so it is widely used in virtual reality (VR) games, computer-aided design (CAD), geographic information systems (GIS), automatic navigation systems (ANS), digital cultural heritage, free-viewpoint broadcasting, 3D immersive telepresence, 3D reconstruction of biological tissues and organs, and so on.
  • VR: Virtual Reality
  • CAD: Computer-Aided Design
  • GIS: Geographic Information System
  • ANS: Automatic Navigation System
  • non-sequential point cloud media is aimed at the same static object, that is, for the same static object, its corresponding point cloud media is non-sequential.
  • FIG. 1 shows a schematic diagram of the architecture of a non-sequential point cloud media processing system provided by an exemplary embodiment of the present application.
  • the non-sequential point cloud media processing system 10 includes a video playback device 101 and a video production device 102.
  • the video production device refers to the computer device used by the provider of the non-sequential point cloud media (for example, the content producer of the non-sequential point cloud media), which can be a terminal (such as a PC), a smart mobile device (such as a smartphone), a server, and so on.
  • the video playback device refers to the computer device used by a user of the non-sequential point cloud media, which can be a terminal (such as a PC), a smart mobile device (such as a smartphone), or a VR device (such as a VR headset or VR glasses).
  • the video production device and the video playback device may be directly or indirectly connected through wired communication or wireless communication, which is not limited in this embodiment of the present application.
  • FIG. 2A shows a schematic structural diagram of a processing architecture for non-sequential point cloud media provided by an exemplary embodiment of the present application.
  • the following introduces the processing solution for non-sequential point cloud media provided by the embodiments of the present application, with reference to the processing system for non-sequential point cloud media shown in FIG. 1 and the processing architecture shown in FIG. 2A.
  • the processing of non-sequential point cloud media includes the processing on the video production device side and the processing on the video playback device side; the specific process is as follows:
  • point cloud data can be acquired in two ways: collection of real-world visual scenes by a capture device, and generation by computer equipment.
  • the capture device may be a hardware component provided in the video production device, for example, the capture device is a camera, a sensor, or the like of a terminal.
  • the capture device may also be a hardware device connected to the video production device, such as a camera connected to a server; the capture device is used to provide the point cloud data acquisition service for the video production device.
  • the capture device may include but is not limited to any of the following: a camera device, a sensor device, and a scanning device; wherein, the camera device may include a common camera, a stereo camera, Light field cameras, etc.; sensing devices may include laser devices, radar devices, etc.; scanning devices may include 3D laser scanning devices, etc.
  • the number of capture devices can be multiple, and these capture devices are deployed in some specific positions in the real space to capture point cloud data from different angles in the space at the same time, and the captured point cloud data are synchronized in time and space.
  • the computer device may generate point cloud data according to the virtual three-dimensional object and the virtual three-dimensional scene. Due to the different acquisition methods of point cloud data, the corresponding compression coding methods of point cloud data acquired by different methods may also be different.
  • the video production device adopts a geometry-based point cloud compression (GPCC) encoding method or a traditional video-based point cloud compression (VPCC) encoding method to encode the acquired point cloud data.
  • GPCC: Geometry-Based Point Cloud Compression
  • VPCC: Video-Based Point Cloud Compression
  • the video production device uses file tracks to encapsulate the GPCC bit stream of the encoded point cloud data; a file track is the encapsulation container for the GPCC bit stream. A GPCC bit stream can be encapsulated in a single file track or in multiple file tracks; the two cases are as follows:
  • the GPCC bitstream is encapsulated in a single file track.
  • the GPCC bitstream is required to be declared and represented according to the transport rules of the single file track.
  • a GPCC bitstream encapsulated in a single file track does not require further processing and can be encapsulated using the ISO Base Media File Format (ISOBMFF).
  • ISOBMFF: ISO Base Media File Format
  • each sample packaged in a single file track contains one or more GPCC components, and a GPCC component may be a GPCC geometry component or a GPCC attribute component.
  • a sample is a set of encapsulation structures of one or more point clouds; that is, each sample consists of one or more Type-Length-Value (TLV) byte-stream encapsulation structures.
  • FIG. 2B shows a schematic structural diagram of a sample provided by an exemplary embodiment of the present application. As shown in FIG. 2B, when a single file track is transmitted, a sample in the file track consists of a GPCC parameter set TLV, a geometry bitstream TLV, and an attribute bitstream TLV, and the sample is packed into the single file track.
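As a rough illustration of the TLV packing described above, the following sketch assumes a simplified TLV layout (a 1-byte type followed by a 4-byte big-endian length and the payload). The type codes and field widths are assumptions for illustration, not the normative G-PCC byte-stream syntax:

```python
import struct

# Assumed TLV unit types (illustrative values, not normative):
TLV_PARAMETER_SET = 0
TLV_GEOMETRY = 2
TLV_ATTRIBUTE = 4

def pack_tlv(tlv_type: int, payload: bytes) -> bytes:
    """Pack one Type-Length-Value unit: 1-byte type, 4-byte big-endian length."""
    return struct.pack(">BI", tlv_type, len(payload)) + payload

def unpack_tlv_stream(data: bytes):
    """Yield (type, payload) pairs from a concatenated TLV byte stream."""
    offset = 0
    while offset < len(data):
        tlv_type, length = struct.unpack_from(">BI", data, offset)
        offset += 5
        yield tlv_type, data[offset:offset + length]
        offset += length

# A single-track sample: parameter-set TLV + geometry TLV + attribute TLV.
sample = (pack_tlv(TLV_PARAMETER_SET, b"\x01\x02")
          + pack_tlv(TLV_GEOMETRY, b"geom-bytes")
          + pack_tlv(TLV_ATTRIBUTE, b"attr-bytes"))
print([t for t, _ in unpack_tlv_stream(sample)])  # [0, 2, 4]
```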
  • the GPCC bitstream is encapsulated in multiple file tracks.
  • each sample in a file track contains at least one TLV encapsulation structure carrying the data of a single GPCC component, and a TLV encapsulation structure does not contain both the encoded GPCC geometry bitstream and the encoded GPCC attribute bitstream.
  • FIG. 2C shows a schematic structural diagram of a container including multiple file tracks provided by an exemplary embodiment of the present application. As shown in FIG. 2C, package 1 transmitted in file track 1 contains the encoded GPCC geometry bitstream and does not contain the encoded GPCC attribute bitstream; package 2 transmitted in file track 2 contains the encoded GPCC attribute bitstream and does not contain the encoded GPCC geometry bitstream. Since the video playback device must first decode the encoded GPCC geometry bitstream, and the decoding of the encoded GPCC attribute bitstream depends on the decoded geometry information, the bitstreams of different GPCC components are encapsulated in separate file tracks, so that the video playback device can access the file track carrying the encoded GPCC geometry bitstream before the one carrying the encoded GPCC attribute bitstream.
  • FIG. 2D shows a schematic structural diagram of a sample provided by another exemplary embodiment of the present application.
  • As shown in FIG. 2D, the encoded GPCC geometry bitstream and the encoded GPCC attribute bitstream are transmitted in different file tracks; a sample in the geometry file track consists of a GPCC parameter set TLV and a geometry bitstream TLV and does not contain an attribute bitstream TLV, and each sample is encapsulated in one of the multiple file tracks.
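The multi-track rule above (each sample carries a single GPCC component, and geometry and attribute TLVs never share a track) can be sketched as a simple routing step. The type codes are illustrative, and duplicating parameter sets into both tracks is an assumption made here for the sketch:

```python
# Assumed TLV type codes (illustrative): 0 = parameter set,
# 2 = geometry bitstream, 4 = attribute bitstream.
def split_into_tracks(tlv_units):
    """Route TLV units so each file track carries a single GPCC component:
    geometry and attribute TLVs are kept strictly apart."""
    geometry_track, attribute_track = [], []
    for tlv_type, payload in tlv_units:
        if tlv_type == 0:            # parameter set: assumed needed by both
            geometry_track.append((tlv_type, payload))
            attribute_track.append((tlv_type, payload))
        elif tlv_type == 2:          # geometry bitstream TLV
            geometry_track.append((tlv_type, payload))
        elif tlv_type == 4:          # attribute bitstream TLV
            attribute_track.append((tlv_type, payload))
    return geometry_track, attribute_track

units = [(0, b"sps"), (2, b"geom"), (4, b"attr")]
geo, attr = split_into_tracks(units)
print(len(geo), len(attr))  # 2 2
```

The player would then fetch and decode the geometry track first, since attribute decoding depends on the decoded geometry information.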
  • the acquired point cloud data is encoded and packaged by the video production device to form non-sequential point cloud media
  • the non-sequential point cloud media may be the entire media file of the object, or may be a media segment of the object
  • the video production device uses media presentation description (MPD) information (that is, a description signaling file), according to the file format requirements of the non-sequential point cloud media, to record the metadata of the encapsulated file of the non-sequential point cloud media.
  • Metadata is a general term for information related to the presentation of non-sequential point cloud media.
  • the metadata may include description information for the non-sequential point cloud media, description information for the viewing window, signaling information related to the presentation of the non-sequential point cloud media, and so on.
  • the video production device delivers the MPD to the video playback device, so that the video playback device requests to acquire point cloud media according to the relevant description information in the MPD.
  • the point cloud media and the MPD are delivered by the video production device to the video playback device through a transmission mechanism (such as Dynamic Adaptive Streaming over HTTP (DASH) or Smart Media Transport (SMT)).
  • DASH: Dynamic Adaptive Streaming over HTTP
  • SMT: Smart Media Transport
  • the video playback device may obtain non-sequential point cloud media through MPD signaling delivered by the video production device.
  • the file decapsulation process on the video playback device side is opposite to the file encapsulation process on the video production device side.
  • the video playback device decapsulates the encapsulated file of the non-sequential point cloud media according to the file format requirements of the non-sequential point cloud media, and obtains the encoded bitstream (i.e., a GPCC bitstream or a VPCC bitstream).
  • the decoding process on the video playback device side is opposite to the encoding process on the video production device side.
  • the video playback device decodes the encoded bit stream and restores the point cloud data.
  • the video playback device renders the point cloud data obtained by decoding the GPCC bitstream according to the rendering- and viewport-related metadata in the MPD; after rendering, the visual scene corresponding to the point cloud data is presented.
  • the point cloud data for the same object can be encapsulated into different point cloud media, for example: some point cloud media are the entire point cloud media of the object, and some point cloud media are part of the object's point cloud media.
  • the user can request to play different point cloud media.
  • when making a request, the user does not know whether different point cloud media belong to the same object, resulting in blind requests; this problem also exists for the non-sequential point cloud media of static objects.
  • the present application carries the identifier of the static object in the non-sequential point cloud media, so that the user can purposefully request the non-sequential point cloud media of the same static object multiple times.
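Conceptually, once each media entry described in the MPD carries the static object's identifier, the player can filter by that identifier instead of requesting blindly. The dictionary fields below are hypothetical stand-ins for parsed MPD metadata, not the actual signaling syntax:

```python
# Hypothetical view of parsed MPD metadata: each entry describes one
# non-sequential point cloud media and carries the static object's identifier.
mpd_entries = [
    {"url": "obj7_full.mp4",  "object_id": 7, "coverage": "full"},
    {"url": "obj7_part1.mp4", "object_id": 7, "coverage": "partial"},
    {"url": "obj9_full.mp4",  "object_id": 9, "coverage": "full"},
]

def media_for_object(entries, object_id):
    """Return every media entry belonging to one static object, so the
    player can request them purposefully instead of blindly."""
    return [e for e in entries if e["object_id"] == object_id]

print([e["url"] for e in media_for_object(mpd_entries, 7)])
# ['obj7_full.mp4', 'obj7_part1.mp4']
```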
  • FIG. 3 is an interactive flowchart of a method for processing non-sequential point cloud media provided by an embodiment of the present application.
  • The method is performed by a video production device and a video playback device. As shown in FIG. 3, the method includes the following steps:
  • the video production device obtains the non-sequential point cloud data of the static object.
  • the video production device processes the non-sequential point cloud data through GPCC encoding to obtain a GPCC bit stream.
  • the video production device encapsulates the GPCC bit stream to generate at least one entry of the GPCC area.
  • the video production device encapsulates the entry of at least one GPCC area, and generates at least one non-sequential point cloud medium of the static object, and each non-sequential point cloud medium includes an identifier of the static object.
  • the video production device sends at least one MPD signaling of non-sequential point cloud media to the video playback device.
  • the video playback device sends a first request message.
  • the first request message is sent by the video playback device according to MPD signaling, and the first request message is used to request the first non-sequential point cloud media in the at least one non-sequential point cloud media.
  • the video production device sends the first non-sequential point cloud media to the video playback device according to the first request message.
  • the video playback device plays the first non-sequential point cloud media.
  • the entry of the GPCC area is used to represent the GPCC component of the 3D space area corresponding to the GPCC area.
  • Each GPCC area corresponds to a 3D space area of the above-mentioned static object, and the 3D space area may be the whole or part of the 3D space area of the static object.
  • A GPCC component may be a GPCC geometry component or a GPCC attribute component.
  • the identifier of the static object can be defined by the following code:
  • ObjectInfoProperty indicates a property of the content corresponding to the entry; entries of both the GPCC geometry component and the GPCC attribute components can contain this property. If only the entry of the GPCC geometry component contains this property, the ObjectInfoProperty of all attribute-component entries associated with that GPCC geometry component is the same.
  • the object_ID indicates the identification of the static object, and the object_IDs of the entries in different GPCC areas of the same static object are the same.
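The syntax listing referred to above did not survive extraction. As an illustrative stand-in (not the patent's literal definition), the following minimal Python model captures the rule described here; only the names ObjectInfoProperty and object_ID come from the text, everything else is assumed:

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class ObjectInfoProperty:
    """Item property carrying the identifier of a static object; entries of
    different GPCC areas of the same static object share the same object_ID."""
    object_ID: int

def same_static_object(entries: Iterable[ObjectInfoProperty]) -> bool:
    """True when every entry's ObjectInfoProperty names the same static object."""
    return len({e.object_ID for e in entries}) == 1

# A geometry-component entry and an attribute-component entry of one object.
geometry_entry = ObjectInfoProperty(object_ID=7)
attribute_entry = ObjectInfoProperty(object_ID=7)
print(same_static_object([geometry_entry, attribute_entry]))  # True
```

The check mirrors the statement that the object_IDs of the entries in different GPCC areas of the same static object are identical.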
  • The identifier of the above-mentioned static object can be carried in the entries related to the GPCC geometry component in the point cloud medium, or in the entries related to the GPCC attribute components in the point cloud medium, or in both the entries related to the GPCC geometry component and the entries related to the GPCC attribute components; this is not limited in this application.
  • FIG. 4A is a schematic diagram of encapsulation of a point cloud media provided by an embodiment of the present application.
  • The point cloud media includes entries related to GPCC geometry components and entries related to GPCC attribute components. These entries can be associated through the GPCC item group box in the point cloud media.
  • items related to GPCC geometric components are associated with items related to GPCC attribute components.
  • the entries related to the GPCC geometric components may include the following entry attributes: such as GPCC configuration (GPCC Configuration), 3D spatial region attributes (3D spatial region or ItemSpatialInfoProperty), and the identification of static objects.
  • Items related to GPCC attribute components may include the following item attributes: such as GPCC Configuration (GPCC Configuration), identification of static objects, and the like.
  • the GPCC configuration indicates the configuration information of the decoder required to decode the corresponding entry and information related to each GPCC component, but is not limited thereto.
  • the items related to the GPCC attribute components may also include: 3D space area attributes, which are not limited in this application.
  • FIG. 4B is a schematic diagram of encapsulation of another point cloud media provided by an embodiment of the present application.
  • The point cloud media includes: an entry related to a GPCC geometry component, and the entry is associated with two entries related to GPCC attribute components.
  • For the other attributes included in the entries related to the GPCC geometry component, and the attributes included in the entries related to the GPCC attribute components, reference may be made to FIG. 4A; details are not repeated in this application.
  • the identifier of the above-mentioned static object is not limited to be carried in the attribute corresponding to the entry of each GPCC area.
  • the non-sequential point cloud media may be the whole or part of the point cloud media of the static object.
  • the video playback device may send the first request message according to the above MPD signaling to request the first non-sequential point cloud media.
  • The identifier of the static object can be carried in the non-sequential point cloud media, so that the user can purposefully request, over multiple requests, the non-temporal point cloud media of the same static object, improving user experience.
  • each GPCC area corresponds to only one 3D space area, but in this application, the 3D space area can be further divided.
  • the signaling has been updated accordingly, as follows:
  • the entry of the target GPCC area includes: a 3D space area entry attribute, and the 3D space area entry attribute includes: a first identifier and a second identifier.
  • the target GPCC area is one GPCC area in at least one GPCC area.
  • the first identifier (Sub_region_contained) is used to identify whether the target 3D space region corresponding to the target GPCC region is divided into multiple subspace regions.
  • the second identifier (tile_id_present) is used to identify whether the target GPCC area adopts the GPCC tile coding mode.
  • When the value of tile_id_present is 1, the video production device must use the GPCC tile encoding method.
  • The 3D space area entry attributes also include, but are not limited to: the respective information of the multiple subspace areas and the information of the target 3D space area.
  • The information of the subspace region includes at least one of the following, but is not limited thereto: the identifier of the subspace region, the location information of the subspace region, and, when the target GPCC region adopts GPCC tile encoding, the tile (block) identifiers in the subspace region.
  • the location information of the subspace region includes, but is not limited to, the location information of an anchor point of the subspace region, and the lengths of the subspace region along the X axis, the Y axis, and the Z axis, respectively.
  • the location information of the subspace region includes, but is not limited to, the location information of two anchor points of the subspace region.
  • the information of the target 3D space area includes at least one of the following, but is not limited thereto: an identifier of the target 3D space area, location information of the target 3D space area, and the number of subspace areas included in the target 3D space area.
  • The position information of the target 3D space area includes, but is not limited to: the position information of an anchor point of the target 3D space area, and the lengths of the target 3D space area along the X axis, the Y axis, and the Z axis, respectively.
  • the location information of the target 3D space area includes, but is not limited to: location information of two anchor points of the target 3D space area.
  • the 3D space region entry attribute further includes: a third identifier (initial_region_id).
  • If the value of the third identifier is the first value or empty, the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, and, of the target 3D space area and its subspace areas, the video playback device initially presents the target 3D space area.
  • If the value of the third identifier is the second value, the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, and, of the target 3D space area and its subspace areas, the video playback device initially presents the subspace area corresponding to the second value.
  • the above-mentioned first numerical value is 0, and the second numerical value is an identifier of a subspace area in the target 3D space area that needs to be initially presented.
  • the 3D space region entry attribute further includes: information of the target 3D space region.
  • The information of the target 3D space area includes at least one of the following, but is not limited thereto: the identifier of the target 3D space area, the location information of the target 3D space area, and, when the target GPCC area adopts GPCC tile encoding, the tile identifiers in the target 3D space area.
  • ItemSpatialInfoProperty represents the 3D spatial area property of the item of the GPCC area. If the entry is an entry corresponding to a geometric component, the attribute must be included; if the entry is an entry corresponding to an attribute component, the 3D space area attribute may not be included.
  • sub_region_contained is 1, indicating that the 3D space region can be further divided into multiple subspace regions.
  • When the value of sub_region_contained is 1, the value of tile_id_present must be 1.
  • the sub_region_contained value is 0, indicating that there is no further subspace region division in the 3D space.
  • a value of 1 for tile_id_present indicates that the non-sequential point cloud data is encoded by GPCC tiles, and the tile id corresponding to the non-sequential point cloud is given in this attribute.
  • inital_region_id indicates the ID of the space region that is initially presented in the overall space of the item when the current item is an item that is initially consumed or played. If the value of this field is 0 or the field does not exist, the area in which the entry is initially presented is the overall 3D space area. If the value of this field is the identifier of the subspace area, the area initially presented by the entry is the subspace area corresponding to the identifier.
  • 3DSpatialRegionStruct represents a 3D space region; the first 3DSpatialRegionStruct in the ItemSpatialInfoProperty indicates the 3D space region corresponding to the entry of the ItemSpatialInfoProperty, and each remaining 3DSpatialRegionStruct indicates a subspace region within that 3D space region.
  • num_sub_regions indicates the number of subspace regions divided in the 3D space region corresponding to the entry.
  • num_tiles indicates the number of tiles in the 3D space region corresponding to this entry, or the number of tiles corresponding to its subspace region.
  • tile_id indicates the identifier of the GPCC tile.
  • anchor_x, anchor_y, and anchor_z respectively represent the x, y, and z coordinates of the anchor point of the 3D space region or the subspace region of the region.
  • region_dx, region_dy, and region_dz respectively represent the lengths of the 3D space region or the subspace regions of the region along the X axis, the Y axis, and the Z axis, respectively.
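The fields above can be sketched as a minimal Python model; the field names (sub_region_contained, tile_id_present, inital_region_id, anchor and extent fields) come from the text, while the container layout and the types are assumptions:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SpatialRegion:
    """Models 3DSpatialRegionStruct: an identifier, an anchor point, and
    lengths along the X, Y and Z axes; tile_ids lists the GPCC tiles in
    the region when tile_id_present is 1."""
    region_id: int
    anchor_x: float
    anchor_y: float
    anchor_z: float
    region_dx: float
    region_dy: float
    region_dz: float
    tile_ids: List[int] = field(default_factory=list)

@dataclass
class ItemSpatialInfoProperty:
    """Models the 3D spatial region property of a GPCC-area entry."""
    sub_region_contained: int   # 1: region is further divided into subspace regions
    tile_id_present: int        # 1: GPCC tile encoding is used
    inital_region_id: int       # 0: whole region presented initially (spelling as in the text)
    region: SpatialRegion       # first 3DSpatialRegionStruct: the whole region
    sub_regions: List[SpatialRegion] = field(default_factory=list)

    def __post_init__(self):
        # Per the text, sub_region_contained == 1 requires tile_id_present == 1.
        if self.sub_region_contained == 1 and self.tile_id_present != 1:
            raise ValueError("sub_region_contained == 1 requires tile_id_present == 1")

    def initial_presentation(self) -> SpatialRegion:
        """The whole region when inital_region_id is 0, otherwise the
        subspace region whose identifier matches inital_region_id."""
        if self.inital_region_id == 0:
            return self.region
        for sub in self.sub_regions:
            if sub.region_id == self.inital_region_id:
                return sub
        raise KeyError(self.inital_region_id)

# A region split into two subspace regions; subspace region 1 is presented first.
whole = SpatialRegion(0, 0, 0, 0, 100, 100, 100)
sr1 = SpatialRegion(1, 0, 0, 0, 50, 100, 100, tile_ids=[1])
sr2 = SpatialRegion(2, 50, 0, 0, 50, 100, 100, tile_ids=[2])
prop = ItemSpatialInfoProperty(1, 1, 1, whole, [sr1, sr2])
print(prop.initial_presentation().region_id)  # 1
```

initial_presentation encodes the inital_region_id rule described above: a value of 0 (or an absent field) selects the overall 3D space area, any other value selects the matching subspace area.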
  • the 3D space area can be divided into multiple subspace areas. Combined with the independent encoding and decoding characteristics of GPCC tiles, users can decode and present non-sequential point cloud media with higher efficiency and lower latency.
  • the video production device may encapsulate the entry of at least one GPCC area to generate at least one non-sequential point cloud media of the static object.
  • If the number of entries of the at least one GPCC area is one, the entry of the one GPCC area is encapsulated into one non-sequential point cloud media.
  • If the number of entries of the at least one GPCC area is N, the entries of the N GPCC areas are encapsulated into M non-sequential point cloud media, where N is an integer greater than 1 and M is an integer in the range [1, N].
  • For example, the entries of the N GPCC areas can be encapsulated into one non-sequential point cloud medium, or into N non-sequential point cloud media; in the latter case, each non-sequential point cloud medium includes one entry.
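The N-entries-into-M-media encapsulation can be sketched as follows; the round-robin grouping is purely illustrative, since the text does not prescribe how entries are distributed across media:

```python
from typing import List, Sequence

def encapsulate_entries(entries: Sequence[str], num_media: int) -> List[List[str]]:
    """Encapsulate the entries of N GPCC areas into M non-sequential point
    cloud media, with 1 <= M <= N (a single entry always yields one media)."""
    n = len(entries)
    if n == 0:
        raise ValueError("no entries to encapsulate")
    if not 1 <= num_media <= n:
        raise ValueError("M must satisfy 1 <= M <= N")
    media: List[List[str]] = [[] for _ in range(num_media)]
    for i, entry in enumerate(entries):
        media[i % num_media].append(entry)
    return media

# Four GPCC-area entries encapsulated into two point cloud media.
print(encapsulate_entries(["item1", "item2", "item3", "item4"], 2))
# [['item1', 'item3'], ['item2', 'item4']]
```

The two boundary cases match the text: M = 1 places all entries into one medium, and M = N places one entry into each medium.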
  • the second non-sequential point cloud media is any one of the at least one non-sequential point cloud media including entries of multiple GPCC regions.
  • the second non-sequential point cloud media includes:
  • GPCC item group box (GPCCItemGroupBox).
  • the GPCC entry group box is used to associate entries of multiple GPCC areas, as shown in Figures 4A and 4B.
  • the GPCC entry group box includes: identifiers of entries of multiple GPCC areas.
  • the GPCC item group box includes: a fourth identification (initial_item_ID).
  • the fourth identifier is the identifier of the entry initially presented by the video playback device in the entries of the multiple GPCC areas.
  • The GPCC item group box includes: a fifth identifier (partial_item_flag). If the value of the fifth identifier is the third value, it indicates that the entries of the multiple GPCC areas constitute a complete GPCC frame of the static object; if the value of the fifth identifier is the fourth value, it indicates that the entries of the multiple GPCC areas constitute part of a GPCC frame of the static object.
  • the third value may be 0, and the fourth value may be 1, but not limited thereto.
  • the GPCC entry group box includes: location information of a GPCC area formed by a plurality of GPCC areas.
  • For example, when the multiple GPCC areas are areas R1 and R2, the GPCC entry group box includes the location information of the area formed by R1 and R2.
  • the items contained in the GPCCItemGroupBox are items that belong to the same static object, and items that have an associated relationship when presenting and consuming. All items contained in the GPCCItemGroupBox may constitute a complete GPCC frame, or may be part of a GPCC frame.
  • initial_item_ID indicates the identification of the item initially consumed within an item group.
  • the initial_item_ID is only valid when the current item group is the item group requested by the user for the first time.
  • the same static object corresponds to two point cloud media, which are F1 and F2 respectively.
  • When the partial_item_flag value is 0, all items contained in the GPCCItemGroupBox and their associated items constitute a complete GPCC frame; when the value is 1, all items contained in the GPCCItemGroupBox and their associated items constitute only a partial GPCC frame.
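The group box described above can be sketched as a small Python model; the field names initial_item_ID and partial_item_flag come from the text, while the container layout is an assumption:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class GPCCItemGroupBox:
    """Associates the entries (items) of multiple GPCC areas that belong to
    the same static object and are related when presented and consumed."""
    item_ids: List[int]
    initial_item_ID: int    # the item consumed first within this item group
    partial_item_flag: int  # 0: items form a complete GPCC frame; 1: partial frame

    def is_complete_frame(self) -> bool:
        return self.partial_item_flag == 0

    def first_item(self) -> int:
        """Item to present first; only meaningful when this group is the
        item group the user requests first."""
        assert self.initial_item_ID in self.item_ids
        return self.initial_item_ID

# Two items that together form a complete GPCC frame; item 1 is consumed first.
group = GPCCItemGroupBox(item_ids=[1, 2], initial_item_ID=1, partial_item_flag=0)
print(group.is_complete_frame(), group.first_item())  # True 1
```

The first_item guard reflects the statement that initial_item_ID is only valid when the current item group is the one the user requests first.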
  • the extension is as follows:
  • the GPCC entry descriptor is used to describe the elements and attributes related to the GPCC entry, and the descriptor is a SupplementalProperty element.
  • @schemeIdUri attribute is equal to "urn:mpeg:mpegI:gpcc:2020:gpsr".
  • the descriptor can be located at the Adaptation Set level or the Representation level.
  • In DASH, a combination of one or more media components, such as a video file of a certain resolution, is called a Representation.
  • In DASH, an Adaptation Set is a collection of one or more video streams; one Adaptation Set can contain multiple Representations.
  • Table 1 GPCC entry description sub-elements and attributes
  • The video production equipment can flexibly combine the entries of multiple GPCC regions to form different non-sequential point cloud media, where the non-sequential point cloud media can constitute a complete GPCC frame or a partial GPCC frame.
  • The video production device can also specify the entry that is initially presented.
  • The non-sequential point cloud data has 4 versions of point cloud media on the video production device side: point cloud media F0 corresponding to all of the non-sequential point cloud data, and point cloud media F1 to F3 each corresponding to part of the non-sequential point cloud data, where F1 to F3 correspond to 3D space regions R1 to R3 respectively.
  • The point cloud media encapsulation contents of F0 to F3 are as follows:
  • tile_id[] (3,4)
  • inital_region_id 0;
  • inital_region_id 0;
  • tile_id[] (3,4)
  • the video production device sends the MPD signaling of F0 to F3 to the user, and the Object_ID, space region, subspace region, and tile identification information therein are the same as those in the file encapsulation, and are not repeated here.
  • Since user U1 has good network conditions and low data transmission delay, U1 can request F0; since user U2 has poor network conditions and high data transmission delay, U2 can request F1.
  • the video production device transmits F0 to the video playback device corresponding to the user U1, and transmits F1 to the video playback device corresponding to the user U2.
  • the initial viewing area is the SR1 area, and the corresponding tile ID is 1.
  • When U1 decodes and consumes F0, it can decode tile '1' from the overall code stream for direct consumption and presentation, without decoding and presenting the overall file, which improves decoding efficiency and reduces the time required for rendering and presentation.
  • If U1 continues to consume and the tile ID corresponding to the next viewing area is 2, the part of the overall code stream corresponding to tile '2' is directly decoded for presentation and consumption.
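The decoding shortcut in this example relies only on GPCC tiles being independently decodable; a minimal sketch (the mapping and function name are hypothetical, and the actual decode step is stubbed out):

```python
from typing import Dict

def decode_viewing_area(bitstream_by_tile: Dict[int, bytes], tile_id: int) -> bytes:
    """Select only the sub-bitstream of the tile covering the current viewing
    area instead of decoding the whole file; bitstream_by_tile maps a GPCC
    tile id to its independently decodable sub-bitstream."""
    if tile_id not in bitstream_by_tile:
        raise KeyError(f"tile {tile_id} not present in this media")
    sub_stream = bitstream_by_tile[tile_id]
    # A real player would hand sub_stream to a GPCC decoder here.
    return sub_stream

# U1's initial viewing area SR1 maps to tile 1, so only tile 1 is decoded.
stream = {1: b"tile-1-substream", 2: b"tile-2-substream"}
print(decode_viewing_area(stream, 1))  # b'tile-1-substream'
```

If the user then looks at the area covered by tile 2, the same selection is applied to tile 2 without re-decoding tile 1.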
  • After the video playback device corresponding to user U2 receives F1, it decodes F1 for consumption and, according to the area the user may consume next, combined with the information in the MPD file (i.e., the Object_ID and spatial area information), requests F2 or F3 in advance for caching.
  • The video playback device can also purposefully request, from the video production device again, non-temporal point cloud media of the same static object based on the user's consumption needs and possible consumption areas.
  • Since the video playback device obtains the identifier of the static object from the point cloud media above, when it needs to obtain other point cloud media corresponding to the static object, it can request the point cloud media of the same static object multiple times in a targeted manner based on that identifier.
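The targeted re-request combines the static object identifier with the spatial area information in the MPD. A minimal sketch, where the flattened MPD record layout (media_name, object_id, region_ids) is a hypothetical simplification:

```python
from typing import Iterable, Optional, Set, Tuple

def next_request(mpd_entries: Iterable[Tuple[str, int, Set[int]]],
                 object_id: int, wanted_region_id: int) -> Optional[str]:
    """Pick the media to request (or prefetch) next: it must describe the same
    static object (matching Object_ID) and cover the spatial region the user
    is expected to consume next."""
    for name, obj, regions in mpd_entries:
        if obj == object_id and wanted_region_id in regions:
            return name
    return None

# F1..F3 describe regions R1..R3 of object 10; G1 belongs to another object.
mpd = [("F1", 10, {1}), ("F2", 10, {2}), ("F3", 10, {3}), ("G1", 99, {2})]
print(next_request(mpd, object_id=10, wanted_region_id=2))  # F2
```

This is how the blind-request problem is avoided: media of other objects (G1 here) are never selected, even when they cover the wanted region.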
  • The non-sequential point cloud data exists in two versions of point cloud media on the video production device: F1 and F2; F1 contains item1 to item2, and F2 contains item3 to item4.
  • the point cloud media package contents of F1 and F2 are as follows:
  • inital_region_id 0;
  • inital_region_id 0;
  • GPCCItemGroupBox
  • inital_region_id 0;
  • inital_region_id 0;
  • GPCCItemGroupBox
  • the video production device sends the MPD signaling of F1 to F2 to the user, and the Object_ID, spatial area, and tile ID information are the same as those in the point cloud media encapsulation, and are not repeated here.
  • User U1 requests F1 consumption; user U2 requests F2 consumption.
  • the video production device transmits F1 to the video playback device corresponding to the user U1, and transmits F2 to the video playback device corresponding to the user U2.
  • After the video playback device corresponding to U1 receives F1, it initially presents item1.
  • The initial viewing area of item1 is the entire viewing space of item1, so U1 consumes item1 in its entirety. Since F1 contains item1 and item2, which correspond to tile1 and tile2 respectively, U1 can directly decode the part of the code stream corresponding to tile1 for presentation when consuming item1. If U1 continues to consume and the area of item2 corresponds to tile ID 2, the part of the overall code stream corresponding to tile '2' is directly decoded for presentation and consumption. If U1 continues to consume and needs to watch the area corresponding to item3, F2 is requested according to the MPD file; after F2 is received, it is presented directly according to the area viewed by the user, and the initial consumption item information and initial viewing area information in F2 are no longer evaluated.
  • After the video playback device corresponding to U2 receives F2, it initially presents item3. The initial viewing area of item3 is the overall viewing space of item3, so U2 consumes item3 in its entirety. Since F2 contains item3 and item4, which correspond to tile3 and tile4 respectively, U2 can directly decode the part of the code stream corresponding to tile3 for presentation when consuming item3.
  • FIG. 5 is a schematic diagram of an apparatus 500 for processing non-sequential point cloud media according to an embodiment of the present application.
  • the apparatus 500 includes a processing unit 510 and a communication unit 520 .
  • the processing unit 510 is configured to: acquire non-sequential point cloud data of the static object.
  • the non-sequential point cloud data is processed through the GPCC encoding method to obtain the GPCC bit stream.
  • The GPCC bitstream is encapsulated to generate entries of at least one GPCC area, where the entry of a GPCC area is used to represent the GPCC component of the three-dimensional (3D) space area corresponding to the GPCC area.
  • The communication unit 520 is configured to: receive a first request message sent by the video playback device according to the MPD signaling, where the first request message is used to request a first non-sequential point cloud media in the at least one non-sequential point cloud media; and send, according to the first request message, the first non-sequential point cloud media to the video playback device.
  • the entry of the target GPCC area includes: a 3D space area entry attribute, and the 3D space area entry attribute includes: a first identifier and a second identifier.
  • the target GPCC area is one GPCC area in at least one GPCC area.
  • the first identifier is used to identify whether the target 3D space area corresponding to the target GPCC area is divided into multiple subspace areas.
  • the second identifier is used to identify whether the target GPCC area adopts the GPCC tile encoding method.
  • the 3D space region entry attribute further includes: respective information of the multiple subspace regions and information of the target 3D space region.
  • The information of the subspace region includes at least one of the following: the identifier of the subspace region, the location information of the subspace region, and, when the target GPCC region adopts GPCC tile encoding, the tile identifiers in the subspace region.
  • the information of the target 3D space area includes at least one of the following items: an identifier of the target 3D space area, location information of the target 3D space area, and the number of subspace areas included in the target 3D space area.
  • the 3D space area entry attribute further includes: a third identifier.
  • If the value of the third identifier is the first value or empty, the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, and, of the target 3D space area and its subspace areas, the video playback device initially presents the target 3D space area.
  • If the value of the third identifier is the second value, the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, and, of the target 3D space area and its subspace areas, the video playback device initially presents the subspace area corresponding to the second value.
  • the 3D space region entry attribute further includes: information of the target 3D space region.
  • the information of the target 3D space area includes at least one of the following: an identifier of the target 3D space area, location information of the target 3D space area, and a tile identifier in the target 3D space area when the target GPCC area adopts GPCC tile encoding.
  • The processing unit 510 is specifically configured to: if the number of entries of the at least one GPCC area is one, encapsulate the entry of the one GPCC area into one non-sequential point cloud media; if the number of entries of the at least one GPCC area is N, encapsulate the entries of the N GPCC areas into M non-sequential point cloud media.
  • N is an integer greater than 1, 1 ⁇ M ⁇ N, and M is an integer.
  • the second non-sequential point cloud media includes: a GPCC entry group box.
  • the second non-sequential point cloud media is any non-sequential point cloud media including entries of multiple GPCC areas in at least one non-sequential point cloud media, and the GPCC entry group box is used for associating entries of multiple GPCC areas.
  • the GPCC entry group box includes: a fourth identifier.
  • the fourth identifier is the identifier of the entry initially presented by the video playback device in the entries of the multiple GPCC areas.
  • the GPCC entry group box includes: a fifth identifier. If the fifth identifier is a third value, it indicates that the entries of multiple GPCC areas constitute a complete GPCC frame of the static object. If the value of the fifth identifier is the fourth value, it indicates that the entries of the multiple GPCC areas constitute part of the GPCC frame of the static object.
  • the GPCC entry group box includes: location information of a GPCC area formed by a plurality of GPCC areas.
  • The communication unit 520 is further configured to: receive a second request message sent by the video playback device based on the identifier of the static object, where the second request message is used to request a third non-sequential point cloud media in the at least one non-sequential point cloud media; and send, according to the second request message, the third non-sequential point cloud media to the video playback device.
  • the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here.
  • The apparatus 500 shown in FIG. 5 can execute the method embodiments corresponding to the video production device, and the foregoing and other operations and/or functions of each module in the apparatus 500 are respectively for realizing those method embodiments; for brevity, details are not repeated here.
  • the apparatus 500 of the embodiments of the present application is described above from the perspective of functional modules with reference to the accompanying drawings.
  • the functional modules can be implemented in the form of hardware, can also be implemented by instructions in the form of software, and can also be implemented by a combination of hardware and software modules.
  • The steps of the method embodiments in the embodiments of the present application may be completed by hardware integrated logic circuits in the processor and/or instructions in the form of software; the steps of the methods disclosed in the embodiments of the present application may be directly executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • FIG. 6 is a schematic diagram of an apparatus 600 for processing non-sequential point cloud media according to an embodiment of the present application.
  • the apparatus 600 includes a processing unit 610 and a communication unit 620 .
  • the communication unit 620 is configured to: receive MPD signaling of at least one non-sequential point cloud media, where the non-sequential point cloud media includes an identifier of the static object.
  • a first non-temporal point cloud media is received from the video production device.
  • the processing unit 610 is configured to play the first non-sequential point cloud media.
  • The at least one non-sequential point cloud media is generated by encapsulating entries of at least one geometry-based point cloud compression (GPCC) area; the entries of the at least one GPCC area are generated by encapsulating a GPCC bitstream, and the GPCC bitstream is obtained by processing the non-sequential point cloud data of the static object through GPCC encoding.
  • the entry of the GPCC area is used to represent the GPCC component of the 3D space area corresponding to the GPCC area.
  • the entry of the target GPCC area includes: a 3D space area entry attribute, and the 3D space area entry attribute includes: a first identifier and a second identifier.
  • the target GPCC area is one GPCC area in at least one GPCC area.
  • the first identifier is used to identify whether the target 3D space area corresponding to the target GPCC area is divided into multiple subspace areas.
  • the second identifier is used to identify whether the target GPCC area adopts the GPCC tile encoding method.
  • the 3D space region entry attribute further includes: respective information of the multiple subspace regions and information of the target 3D space region.
  • The information of the subspace region includes at least one of the following: the identifier of the subspace region, the location information of the subspace region, and, when the target GPCC region adopts GPCC tile encoding, the tile identifiers in the subspace region.
  • the information of the target 3D space area includes at least one of the following items: an identifier of the target 3D space area, location information of the target 3D space area, and the number of subspace areas included in the target 3D space area.
  • the 3D space area entry attribute further includes: a third identifier.
  • when the value of the third identifier is the first value or empty, it indicates that, when the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, among the target 3D space area and its subspace areas, the video playback device initially presents the target 3D space area.
  • when the value of the third identifier is the second value, it indicates that, when the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, among the target 3D space area and its subspace areas, the video playback device initially presents the subspace area corresponding to the second value within the target 3D space area.
  • the 3D space region entry attribute further includes: information of the target 3D space region.
  • the information of the target 3D space area includes at least one of the following: an identifier of the target 3D space area, location information of the target 3D space area, and a tile identifier in the target 3D space area when the target GPCC area adopts GPCC tile coding.
  • if the at least one GPCC area has a single entry, that entry is encapsulated as one non-sequential point cloud media; if the number of entries of the at least one GPCC area is N, the N entries are encapsulated into M non-sequential point cloud media, where N is an integer greater than 1, M is an integer, and 1 ≤ M ≤ N.
  • the second non-sequential point cloud media includes: a GPCC entry group box.
  • the second non-sequential point cloud media is any one of the non-sequential point cloud media including entries of multiple GPCC regions in the at least one non-sequential point cloud media.
  • the GPCC Entry Group box is used to associate entries from multiple GPCC areas.
  • the GPCC entry group box includes: a fourth identifier.
  • the fourth identifier is the identifier of the entry initially presented by the video playback device in the entries of the multiple GPCC areas.
  • the GPCC entry group box includes a fifth identifier. If the fifth identifier takes the third value, it indicates that the entries of the multiple GPCC areas constitute a complete GPCC frame of the static object; if it takes the fourth value, it indicates that the entries of the multiple GPCC areas constitute only part of a GPCC frame of the static object.
  • the GPCC entry group box includes: location information of a GPCC area formed by a plurality of GPCC areas.
  • the communication unit 620 is further configured to send the second request message to the video production device according to the MPD signaling.
  • a second non-sequential point cloud media is received.
  • processing unit 610 is further configured to play the second non-sequential point cloud media.
  • the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments; to avoid repetition, details are not repeated here.
  • the apparatus 600 shown in FIG. 6 can execute the method embodiment corresponding to the video playback device, and the foregoing and other operations and/or functions of the modules in the apparatus 600 are respectively intended to implement that method embodiment; for brevity, they are not repeated here.
  • the apparatus 600 of the embodiment of the present application is described above from the perspective of functional modules with reference to the accompanying drawings.
  • the functional modules can be implemented in the form of hardware, can also be implemented by instructions in the form of software, and can also be implemented by a combination of hardware and software modules.
  • the steps of the method embodiments of this application may be completed by integrated logic circuits of hardware in the processor and/or instructions in the form of software; the steps of the methods disclosed in the embodiments of this application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • FIG. 7 is a schematic block diagram of a video production device 700 provided by an embodiment of the present application.
  • the video production apparatus 700 may include:
  • a memory 710 and a processor 720 the memory 710 is used to store computer programs and transmit the program codes to the processor 720.
  • the processor 720 may call and run a computer program from the memory 710 to implement the method in the embodiments of the present application.
  • the processor 720 may be configured to execute the above method embodiments according to the instructions in the computer program.
  • the processor 720 may include, but is not limited to: a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), and the like.
  • the memory 710 includes but is not limited to:
  • Non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or flash memory; volatile memory may be a Random Access Memory (RAM), which acts as an external cache.
  • By way of example but not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
  • the computer program can be divided into one or more modules, and the one or more modules are stored in the memory 710 and executed by the processor 720 to complete the steps provided by the present application.
  • the one or more modules may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments are used to describe the execution process of the computer program in the video production apparatus.
  • the video production equipment may further include:
  • a transceiver 730 which can be connected to the processor 720 or the memory 710 .
  • the processor 720 can control the transceiver 730 to communicate with other devices, and specifically, can send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 730 may include a transmitter and a receiver.
  • the transceiver 730 may further include antennas, and the number of the antennas may be one or more.
  • each component in the video production device is connected through a bus system, where the bus system includes a power bus, a control bus, and a status signal bus in addition to a data bus.
  • FIG. 8 is a schematic block diagram of a video playback device 800 provided by an embodiment of the present application.
  • the video playback device 800 may include:
  • a memory 810 and a processor 820 the memory 810 is used to store computer programs and transmit the program codes to the processor 820.
  • the processor 820 can call and run a computer program from the memory 810 to implement the methods in the embodiments of the present application.
  • the processor 820 may be configured to execute the above method embodiments according to the instructions in the computer program.
  • the processor 820 may include, but is not limited to:
  • the memory 810 includes but is not limited to:
  • the memory may be volatile and/or non-volatile; non-volatile memory may be ROM, PROM, EPROM, EEPROM, or flash memory, and volatile memory may be RAM, which acts as an external cache.
  • By way of example but not limitation, many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM.
  • the computer program may be divided into one or more modules, and the one or more modules are stored in the memory 810 and executed by the processor 820 to complete the steps provided by the present application.
  • the one or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the video playback device.
  • the video playback device may further include:
  • a transceiver 830 which can be connected to the processor 820 or the memory 810 .
  • the processor 820 may control the transceiver 830 to communicate with other devices, specifically, may send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 830 may include a transmitter and a receiver.
  • the transceiver 830 may further include antennas, and the number of the antennas may be one or more.
  • each component in the video playback device is connected through a bus system, wherein the bus system includes a power bus, a control bus and a status signal bus in addition to a data bus.
  • the present application also provides a computer storage medium on which a computer program is stored, and when the computer program is executed by a computer, enables the computer to execute the methods of the above method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, when the instructions are executed by a computer, the instructions cause the computer to execute the methods of the above method embodiments.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
  • the computer-readable storage medium may be any available medium accessible by a computer, or a data storage device, such as a server or data center, that integrates one or more available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., digital video disc (DVD)), semiconductor media (e.g., solid state disk (SSD)), and the like.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • the division of the modules is only a logical function division; in actual implementation, there may be other division manners.
  • multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or modules, and may be in electrical, mechanical, or other forms.
  • Modules described as separate components may or may not be physically separated, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solutions of the embodiments. For example, the functional modules in the embodiments of this application may be integrated into one processing module, each module may exist physically alone, or two or more modules may be integrated into one module.


Abstract

A method, apparatus, device, and storage medium for processing non-sequential point cloud media. The method includes: processing non-time-sequential point cloud data of a static object by GPCC encoding to obtain a GPCC bitstream (S302); encapsulating the GPCC bitstream to generate entries of at least one GPCC area (S303); encapsulating the entries of the at least one GPCC area to generate at least one non-sequential point cloud media of the static object (S304); sending MPD signaling of the at least one non-sequential point cloud media (S305); receiving a first request message sent by a video playback device; and sending a first non-sequential point cloud media. The entry of a GPCC area represents the GPCC components of the three-dimensional (3D) spatial region corresponding to that GPCC area, and the non-sequential point cloud media includes an identifier of the static object, so that a user can request the non-sequential point cloud media of the same static object in multiple targeted requests, improving the user experience.

Description

Method, apparatus, device, and storage medium for processing non-sequential point cloud media
This application claims priority to Chinese patent application No. 202011347626.1, entitled "Method, apparatus, device, and storage medium for processing non-sequential point cloud media", filed with the China National Intellectual Property Administration on November 26, 2020, which is incorporated herein by reference in its entirety.
Technical Field
The embodiments of this application relate to the field of computer technology, and in particular to non-sequential point cloud media.
Background
Point cloud data of an object can currently be acquired in many ways, and a video production device can transmit the point cloud data to a video playback device in the form of point cloud media, i.e., a point cloud media file, for the video playback device to play.
Notably, the point cloud data of the same object can be encapsulated into different point cloud media; for example, some point cloud media cover the entire object, while others cover only part of the object.
Summary
This application provides a method, apparatus, device, and storage medium for processing non-sequential point cloud media, so that a user can request the non-sequential point cloud media of the same static object in multiple targeted requests, improving processing efficiency and the user experience.
In one aspect, this application provides a method for processing non-sequential point cloud media, performed by a video production device. The method includes: acquiring non-time-sequential point cloud data of a static object; processing the non-time-sequential point cloud data by GPCC encoding to obtain a GPCC bitstream; encapsulating the GPCC bitstream to generate entries of at least one GPCC area, where the entry of a GPCC area represents the GPCC components of the three-dimensional (3D) spatial region corresponding to that GPCC area; encapsulating the entries of the at least one GPCC area to generate at least one non-sequential point cloud media of the static object, where the non-sequential point cloud media includes an identifier of the static object; sending MPD signaling of the at least one non-sequential point cloud media to a video playback device; receiving a first request message sent by the video playback device according to the MPD signaling, the first request message requesting a first non-sequential point cloud media among the at least one non-sequential point cloud media; and sending the first non-sequential point cloud media to the video playback device according to the first request message.
In another aspect, this application provides a method for processing non-sequential point cloud media, performed by a video playback device. The method includes: receiving MPD signaling of at least one non-sequential point cloud media, where the non-sequential point cloud media includes an identifier of the static object; sending a first request message to a video production device according to the MPD signaling, the first request message requesting a first non-sequential point cloud media among the at least one non-sequential point cloud media; receiving the first non-sequential point cloud media from the video production device; and playing the first non-sequential point cloud media. The at least one non-sequential point cloud media is generated by encapsulating entries of at least one point cloud compression GPCC area; the entries of the at least one GPCC area are generated by encapsulating a GPCC bitstream, and the GPCC bitstream is obtained by processing the non-time-sequential point cloud data of the static object through GPCC encoding. For any entry among the entries of the at least one GPCC area, the entry of a GPCC area represents the GPCC components of the 3D spatial region corresponding to that GPCC area.
In another aspect, this application provides an apparatus for processing non-sequential point cloud media, including a processing unit and a communication unit. The processing unit is configured to: acquire non-time-sequential point cloud data of a static object; process the non-time-sequential point cloud data by GPCC encoding to obtain a GPCC bitstream; encapsulate the GPCC bitstream to generate entries of at least one GPCC area, where the entry of a GPCC area represents the GPCC components of the three-dimensional (3D) spatial region corresponding to that GPCC area; encapsulate the entries of the at least one GPCC area to generate at least one non-sequential point cloud media of the static object, where the non-sequential point cloud media includes an identifier of the static object; and send MPD signaling of the at least one non-sequential point cloud media to a video playback device. The communication unit is configured to: receive a first request message sent by the video playback device according to the MPD signaling, the first request message requesting a first non-sequential point cloud media among the at least one non-sequential point cloud media; and send the first non-sequential point cloud media to the video playback device according to the first request message.
In another aspect, this application provides an apparatus for processing non-sequential point cloud media, including a processing unit and a communication unit. The communication unit is configured to: receive MPD signaling of at least one non-sequential point cloud media, where the non-sequential point cloud media includes an identifier of the static object; send a first request message to a video production device according to the MPD signaling, the first request message requesting a first non-sequential point cloud media among the at least one non-sequential point cloud media; and receive the first non-sequential point cloud media from the video production device. The processing unit is configured to play the first non-sequential point cloud media. The at least one non-sequential point cloud media is generated by encapsulating entries of at least one point cloud compression GPCC area; the entries are generated by encapsulating a GPCC bitstream, and the GPCC bitstream is obtained by processing the non-time-sequential point cloud data of the static object through GPCC encoding. For any entry among the entries of the at least one GPCC area, the entry of a GPCC area represents the GPCC components of the 3D spatial region corresponding to that GPCC area.
In yet another aspect, a video production device is provided, including a processor and a memory, where the memory stores a computer program and the processor calls and runs the computer program stored in the memory to perform the method of the above aspects.
In yet another aspect, a video playback device is provided, including a processor and a memory, where the memory stores a computer program and the processor calls and runs the computer program stored in the memory to perform the method of the above aspects.
In yet another aspect, a computer-readable storage medium is provided for storing a computer program that causes a computer to perform the method of the above aspects.
In yet another aspect, the embodiments of this application provide a computer program product including instructions that, when run on a computer, cause the computer to perform the method of the above aspects.
In summary, in this application, when encapsulating the non-sequential point cloud media, the video production device can carry the identifier of the static object in the media, so that a user can request the non-sequential point cloud media of the same static object in multiple targeted requests, improving the user experience.
Further, in this application, the 3D spatial region corresponding to an entry of a GPCC area can be divided into multiple sub-regions; combined with the independent encoding and decoding of GPCC tiles, this lets users decode and present non-sequential point cloud media more efficiently and with lower latency.
Furthermore, the video production device can flexibly combine the entries of multiple GPCC areas into different non-sequential point cloud media, which may constitute a complete GPCC frame or only part of one, thereby improving the flexibility of video production.
Brief Description of the Drawings
FIG. 1 shows a schematic architectural diagram of a system for processing non-sequential point cloud media provided by an exemplary embodiment of this application;
FIG. 2A shows a schematic architectural diagram of a processing architecture for non-sequential point cloud media provided by an exemplary embodiment of this application;
FIG. 2B shows a schematic structural diagram of a sample provided by an exemplary embodiment of this application;
FIG. 2C shows a schematic structural diagram of a container containing multiple file tracks provided by an exemplary embodiment of this application;
FIG. 2D shows a schematic structural diagram of a sample provided by another exemplary embodiment of this application;
FIG. 3 shows an interaction flowchart of a method for processing non-sequential point cloud media provided by an embodiment of this application;
FIG. 4A shows a schematic diagram of point cloud media encapsulation provided by an embodiment of this application;
FIG. 4B shows a schematic diagram of another point cloud media encapsulation provided by an embodiment of this application;
FIG. 5 shows a schematic diagram of an apparatus 500 for processing non-sequential point cloud media provided by an embodiment of this application;
FIG. 6 shows a schematic diagram of an apparatus 600 for processing non-sequential point cloud media provided by an embodiment of this application;
FIG. 7 shows a schematic block diagram of a video production device 700 provided by an embodiment of this application;
FIG. 8 shows a schematic block diagram of a video playback device 800 provided by an embodiment of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
It should be noted that the terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of this application described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or server that includes a series of steps or units is not necessarily limited to those steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
Before the technical solutions of this application are introduced, related knowledge is introduced below:
A point cloud is a set of discrete points irregularly distributed in space that expresses the spatial structure and surface attributes of a three-dimensional object or scene. Point cloud data is the concrete recorded form of a point cloud; the point cloud data of each point in the point cloud may include geometry information (i.e., 3D position information) and attribute information, where the geometry information of each point is its Cartesian 3D coordinate data, and the attribute information of each point may include, but is not limited to, at least one of the following: color information, material information, and laser reflection intensity information. Usually, every point in a point cloud has the same number of attributes; for example, every point has both color information and laser reflection intensity, or every point has color information, material information, and laser reflection intensity.
With the progress and development of science and technology, a large amount of high-precision point cloud data can now be obtained at low cost and within a short period. The acquisition channels include, but are not limited to, at least one of the following: (1) Generation by computer devices, which can generate point cloud data from virtual 3D objects and virtual 3D scenes. (2) 3D (3-Dimension) laser scanning, which can acquire point cloud data of static real-world 3D objects or scenes, at a rate of millions of points per second. (3) 3D photogrammetry: a 3D photography device (i.e., a set of cameras, or a camera device with multiple lenses and sensors) captures real-world visual scenes to obtain their point cloud data; 3D photography can obtain point cloud data of dynamic real-world 3D objects or scenes. (4) Medical devices: in the medical field, point cloud data of biological tissues and organs can be obtained through Magnetic Resonance Imaging (MRI), Computed Tomography (CT), electromagnetic localization information, and other medical devices.
Point cloud media is a point cloud media file formed from point cloud data; it includes multiple media frames, each composed of point cloud data. Point cloud media can flexibly and conveniently express the spatial structure and surface attributes of 3D objects or scenes, and is therefore widely used in Virtual Reality (VR) games, Computer-Aided Design (CAD), Geographic Information Systems (GIS), Autonomous Navigation Systems (ANS), digital cultural heritage, free-viewpoint broadcasting, 3D immersive telepresence, 3D reconstruction of biological tissues and organs, and other applications.
Non-sequential point cloud media targets the same static object; that is, for the same static object, its corresponding point cloud media is non-time-sequential.
Based on the above description, refer to FIG. 1, which shows a schematic architectural diagram of a system for processing non-sequential point cloud media provided by an exemplary embodiment of this application. The processing system 10 includes a video playback device 101 and a video production device 102. The video production device is the computer device used by the provider of the non-sequential point cloud media (e.g., its content producer), which may be a terminal (e.g., a Personal Computer (PC), a smart mobile device such as a smartphone, etc.) or a server; the video playback device is the computer device used by the consumer of the non-sequential point cloud media (e.g., a user), which may be a terminal (e.g., a PC), a smart mobile device (e.g., a smartphone), or a VR device (e.g., a VR headset or VR glasses). The video production device and the video playback device may be connected directly or indirectly through wired or wireless communication, which is not limited in the embodiments of this application.
FIG. 2A shows a schematic architectural diagram of a processing architecture for non-sequential point cloud media provided by an exemplary embodiment of this application. The processing scheme provided by the embodiments of this application is introduced below with reference to the processing system of FIG. 1 and the processing architecture of FIG. 2A. The processing of non-sequential point cloud media includes processing on the video production device side and processing on the video playback device side, as follows:
1. Processing on the video production device side:
(1) Acquisition of point cloud data.
In one implementation, in terms of how the point cloud data is acquired, acquisition can be divided into capturing real-world visual scenes with capture devices, and generation by computer devices. In one implementation, the capture device may be a hardware component provided in the video production device, e.g., a camera or sensor of a terminal; it may also be a hardware apparatus connected to the content production device, e.g., a camera connected to a server. The capture device provides the point cloud data acquisition service for the video production device and may include, but is not limited to, any of the following: camera devices, sensing devices, and scanning devices, where camera devices may include ordinary cameras, stereo cameras, light-field cameras, and the like; sensing devices may include laser devices, radar devices, and the like; and scanning devices may include 3D laser scanning devices and the like. There may be multiple capture devices, deployed at specific positions in real space to simultaneously capture point cloud data from different angles within that space, with the captured point cloud data synchronized in both time and space. In another implementation, the computer device may generate the point cloud data from virtual 3D objects and virtual 3D scenes. Because point cloud data can be acquired in different ways, the compression-encoding methods corresponding to point cloud data acquired in different ways may also differ.
(2) Encoding and encapsulation of point cloud data.
In one implementation, the video production device encodes the acquired point cloud data using Geometry-Based Point Cloud Compression (GPCC) or Video-Based Point Cloud Compression (VPCC) to obtain a GPCC bitstream or a VPCC bitstream of the point cloud data.
In one implementation, taking GPCC encoding as an example, the video production device encapsulates the GPCC bitstream of the encoded point cloud data in file tracks; a file track is the encapsulation container of the GPCC bitstream of the encoded point cloud data. The GPCC bitstream can be encapsulated in a single file track or in multiple file tracks; the two cases are as follows:
1. GPCC bitstream encapsulated in a single file track. When the GPCC bitstream is carried in a single file track, it must be declared and represented according to the transmission rules for a single file track. A GPCC bitstream encapsulated in a single file track needs no further processing and can be encapsulated through the International Organization for Standardization Base Media File Format (ISOBMFF). Specifically, each sample encapsulated in a single file track contains one or more GPCC components (also called GPCC constituents), which may be GPCC geometry components or GPCC attribute components. A sample is a collection of encapsulation structures of one or more point clouds; that is, each sample consists of one or more Type-Length-Value (TLV) byte-stream encapsulation structures. FIG. 2B shows a schematic structural diagram of a sample provided by an exemplary embodiment of this application. As shown in FIG. 2B, for single-file-track transmission, a sample in the file track consists of a GPCC parameter set TLV, a geometry bitstream TLV, and an attribute bitstream TLV, and the sample is encapsulated into a single file track.
2. GPCC bitstream encapsulated in multiple file tracks. When the encoded GPCC geometry bitstream and the encoded GPCC attribute bitstream are carried in different file tracks, each sample in a file track contains at least one TLV encapsulation structure carrying data of a single GPCC component, and a TLV encapsulation structure does not contain both the encoded GPCC geometry bitstream and the encoded GPCC attribute bitstream at the same time. FIG. 2C shows a schematic structural diagram of a container containing multiple file tracks provided by an exemplary embodiment of this application. As shown in FIG. 2C, encapsulation package 1 carried in file track 1 contains the encoded GPCC geometry bitstream and no encoded GPCC attribute bitstream; encapsulation package 2 carried in file track 2 contains the encoded GPCC attribute bitstream and no encoded GPCC geometry bitstream. Because the video playback device must first decode the encoded GPCC geometry bitstream, and decoding of the encoded GPCC attribute bitstream depends on the decoded geometry information, encapsulating the different GPCC component bitstreams in separate file tracks lets the video playback device access the file track carrying the encoded GPCC geometry bitstream before the encoded GPCC attribute bitstream. FIG. 2D shows a schematic structural diagram of a sample provided by another exemplary embodiment of this application. As shown in FIG. 2D, for multiple-file-track transmission, the encoded GPCC geometry bitstream and the encoded GPCC attribute bitstream are carried in different file tracks; a sample in such a file track consists of a GPCC parameter set TLV and a geometry bitstream TLV, contains no attribute bitstream TLV, and is encapsulated in any one of the multiple file tracks.
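As a concrete illustration of the TLV byte-stream encapsulation described above, the sketch below walks a stream of Type-Length-Value units. The exact unit layout assumed here (a one-byte type followed by a 4-byte big-endian payload length) is an illustrative assumption; the normative layout is defined by the GPCC file-format specification.

```python
import struct

def parse_tlv_stream(buf: bytes):
    """Split a byte stream into (type, payload) TLV units.

    Assumed layout per unit: 1-byte type, 4-byte big-endian payload
    length, then the payload bytes.
    """
    units, off = [], 0
    while off < len(buf):
        t = buf[off]
        (ln,) = struct.unpack_from(">I", buf, off + 1)
        units.append((t, buf[off + 5 : off + 5 + ln]))
        off += 5 + ln
    return units
```

Under this sketch, a single-track sample would parse into, e.g., one parameter-set unit, one geometry unit, and one attribute unit, while a geometry-only track's samples would yield no attribute units.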
In one implementation, the acquired point cloud data is encoded and encapsulated by the video production device to form non-sequential point cloud media, which may be the entire media file of an object or a media segment of it. In addition, according to the file format requirements of the non-sequential point cloud media, the video production device records the metadata of the encapsulated file of the media using Media Presentation Description (MPD) information (i.e., a description signaling file). Metadata here is a general term for information related to the presentation of the non-sequential point cloud media; it may include description information of the media, description information of the viewport, signaling information related to the presentation of the media, and so on. The video production device delivers the MPD to the video playback device so that the playback device can request the point cloud media according to the relevant description information in the MPD. Specifically, the point cloud media and the MPD are delivered from the video production device to the video playback device through a transport mechanism such as Dynamic Adaptive Streaming over HTTP (DASH) or Smart Media Transport (SMT).
2. Data processing on the video playback device side:
(1) Decapsulation and decoding of point cloud data.
In one implementation, the video playback device can obtain the non-sequential point cloud media through the MPD signaling delivered by the video production device. File decapsulation on the playback side is the inverse of file encapsulation on the production side: the playback device decapsulates the encapsulated file of the non-sequential point cloud media according to its file format requirements to obtain the encoded bitstream (i.e., a GPCC or VPCC bitstream). Decoding on the playback side is the inverse of encoding on the production side: the playback device decodes the encoded bitstream to restore the point cloud data.
(2) Rendering of point cloud data.
In one implementation, the video playback device renders the point cloud data decoded from the GPCC bitstream according to the rendering- and viewport-related metadata in the MPD; once rendering is complete, the visual scene corresponding to the point cloud data is presented.
It can be understood that the processing system for non-sequential point cloud media described in the embodiments of this application is intended to explain the technical solutions of the embodiments more clearly and does not limit them; those of ordinary skill in the art will appreciate that, as the system architecture evolves and new service scenarios emerge, the technical solutions provided by the embodiments of this application are equally applicable to similar technical problems.
As described above, the point cloud data of the same object can be encapsulated into different point cloud media; for example, some point cloud media are the entire point cloud media of the object, while others are only part of it. On this basis, a user can request different point cloud media to play; however, when making a request, the user does not know whether different point cloud media belong to the same object, leading to blind requests. The same problem exists for the non-sequential point cloud media of static objects.
To solve the above technical problem, this application carries the identifier of the static object in the non-sequential point cloud media, so that a user can request the non-sequential point cloud media of the same static object in multiple targeted requests.
The technical solutions of this application are described in detail below:
FIG. 3 is an interaction flowchart of a method for processing non-sequential point cloud media provided by an embodiment of this application. The method is performed by a video production device and a video playback device and, as shown in FIG. 3, includes the following steps:
S301: The video production device acquires non-time-sequential point cloud data of a static object.
S302: The video production device processes the non-time-sequential point cloud data by GPCC encoding to obtain a GPCC bitstream.
S303: The video production device encapsulates the GPCC bitstream to generate entries of at least one GPCC area.
S304: The video production device encapsulates the entries of the at least one GPCC area to generate at least one non-sequential point cloud media of the static object, each non-sequential point cloud media including an identifier of the static object.
S305: The video production device sends MPD signaling of the at least one non-sequential point cloud media to the video playback device.
S306: The video playback device sends a first request message.
The first request message is sent by the video playback device according to the MPD signaling and is used to request a first non-sequential point cloud media among the at least one non-sequential point cloud media.
S307: The video production device sends the first non-sequential point cloud media to the video playback device according to the first request message.
S308: The video playback device plays the first non-sequential point cloud media.
It should be understood that, for how to acquire the non-time-sequential point cloud data of the static object and how to obtain the GPCC bitstream, reference may be made to the related knowledge above, which is not repeated here.
For any entry (Item) among the entries of the at least one GPCC area, the entry of a GPCC area represents the GPCC components of the 3D spatial region corresponding to that GPCC area.
Each GPCC area corresponds to one 3D spatial region of the static object, which may be the entire or a partial 3D spatial region of the static object.
As described above, a GPCC component is also called a GPCC constituent and may be a GPCC geometry component or an attribute component.
It should be understood that, on the video production device side, the identifier of the static object can be defined by the following code:
aligned(8) class ObjectInfoProperty extends ItemProperty('obif') {
    unsigned int(32) object_ID;
}
ObjectInfoProperty indicates the property of the content corresponding to an entry; both GPCC geometry components and attribute components may contain this property. If only the GPCC geometry component contains it, the ObjectInfoProperty of all attribute components associated with that geometry component is the same as its own.
object_ID indicates the identifier of the static object; entries of different GPCC areas of the same static object have the same object_ID.
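The 'obif' property above carries a single 32-bit object_ID. A minimal sketch of serializing and parsing that payload follows; the ItemProperty box header is omitted, and the four-byte big-endian field is read directly from `unsigned int(32) object_ID` in the syntax above.

```python
import struct

def pack_object_info_property(object_id: int) -> bytes:
    """Serialize the ObjectInfoProperty payload: one unsigned 32-bit object_ID."""
    return struct.pack(">I", object_id)

def parse_object_info_property(payload: bytes) -> int:
    """Read object_ID back from an 'obif' payload; entries of different
    GPCC areas of the same static object share this value."""
    (object_id,) = struct.unpack_from(">I", payload)
    return object_id
```

A playback device can thus compare the object_ID values of entries from different files to tell whether they describe the same static object.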
Optionally, the identifier of the static object may be carried in the entry related to the GPCC geometry component in the point cloud media, or in the entry related to the GPCC attribute component, or in both, which is not limited in this application.
Exemplarily, FIG. 4A is a schematic diagram of point cloud media encapsulation provided by an embodiment of this application. As shown in FIG. 4A, the point cloud media includes an entry related to a GPCC geometry component and an entry related to a GPCC attribute component, which can be associated by a GPCC entry group box in the media. As shown in FIG. 4A, the geometry-related entry is associated with the attribute-related entry. The geometry-related entry may include entry properties such as a GPCC Configuration, a 3D spatial region property (3D spatial region, or ItemSpatialInfoProperty), and the identifier of the static object. The attribute-related entry may include entry properties such as a GPCC Configuration and the identifier of the static object.
Optionally, the GPCC Configuration indicates, but is not limited to, the decoder configuration information required for decoding the corresponding entry and the information related to each GPCC component.
Notably, the entry related to the GPCC attribute component may also include a 3D spatial region property, which is not limited in this application.
Exemplarily, FIG. 4B is a schematic diagram of another point cloud media encapsulation provided by an embodiment of this application. The difference from FIG. 4A is that, in FIG. 4B, the point cloud media includes one entry related to a GPCC geometry component, and that entry is associated with two entries related to GPCC attribute components. For the remaining properties included in the geometry-related and attribute-related entries, reference may be made to FIG. 4A, which is not repeated here.
It should be understood that the identifier of the static object is not limited to being carried in the properties included in each GPCC area's entry.
It should be understood that, for the above MPD signaling, reference may be made to the related knowledge above, which is not repeated here.
Optionally, any one of the at least one non-sequential point cloud media may be the entire or a partial point cloud media of the static object.
It should be understood that the video playback device can send the first request message according to the above MPD signaling to request the first non-sequential point cloud media.
In summary, in this application, when encapsulating the non-sequential point cloud media, the video production device can carry the identifier of the static object in the media, so that a user can request the non-sequential point cloud media of the same static object in multiple targeted requests, improving the user experience.
In the related art, each GPCC area's entry corresponds to only one 3D spatial region, whereas in this application that 3D spatial region can be further divided. On this basis, this application correspondingly updates the entry properties in the non-sequential point cloud media and the MPD signaling, as follows:
Optionally, the entry of the target GPCC area includes a 3D spatial region entry property, which includes a first identifier and a second identifier. The target GPCC area is one of the at least one GPCC area. The first identifier (sub_region_contained) indicates whether the target 3D spatial region corresponding to the target GPCC area is divided into multiple sub-regions. The second identifier (tile_id_present) indicates whether the target GPCC area adopts GPCC tile encoding.
Exemplarily, sub_region_contained = 0 indicates that the target 3D spatial region corresponding to the target GPCC area is not divided into multiple sub-regions, and sub_region_contained = 1 indicates that it is divided into multiple sub-regions.
Exemplarily, tile_id_present = 0 indicates that the target GPCC area does not adopt GPCC tile encoding, and tile_id_present = 1 indicates that it does.
It should be understood that when sub_region_contained = 1, tile_id_present = 1; that is, when the target 3D spatial region corresponding to the target GPCC area is divided into multiple sub-regions, the video production side must adopt GPCC tile encoding.
Optionally, if the target 3D spatial region corresponding to the target GPCC area is divided into multiple sub-regions, the 3D spatial region entry property further includes, but is not limited to, the respective information of the multiple sub-regions and the information of the target 3D spatial region.
Optionally, for any one of the multiple sub-regions, the information of the sub-region includes at least one of the following, but is not limited thereto: the identifier of the sub-region, the position information of the sub-region, and, when the target GPCC area adopts GPCC tile encoding, the tile identifiers in the sub-region.
Optionally, the position information of the sub-region includes, but is not limited to, the position of one anchor point of the sub-region together with the lengths of the sub-region along the X, Y, and Z axes respectively; alternatively, it includes, but is not limited to, the positions of two anchor points of the sub-region.
Optionally, the information of the target 3D spatial region includes at least one of the following, but is not limited thereto: the identifier of the target 3D spatial region, the position information of the target 3D spatial region, and the number of sub-regions the target 3D spatial region includes.
Optionally, the position information of the target 3D spatial region includes, but is not limited to, the position of one anchor point of the target 3D spatial region together with its lengths along the X, Y, and Z axes respectively; alternatively, it includes, but is not limited to, the positions of two anchor points of the target 3D spatial region.
Optionally, if the target 3D spatial region corresponding to the target GPCC area is divided into multiple sub-regions, the 3D spatial region entry property further includes a third identifier (initial_region_id). When the value of the third identifier is the first value or empty, it indicates that, when the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, among the target 3D spatial region and its sub-regions, the video playback device initially presents the target 3D spatial region. When the value of the third identifier is the second value, it indicates that, when the entry corresponding to the target GPCC area is the entry initially presented by the video playback device, among the target 3D spatial region and its sub-regions, the video playback device initially presents the sub-region corresponding to the second value within the target 3D spatial region.
Optionally, the first value is 0, and the second value is the identifier of the sub-region within the target 3D spatial region that is to be initially presented.
Optionally, if the target 3D spatial region corresponding to the target GPCC area is not divided into multiple sub-regions, the 3D spatial region entry property further includes the information of the target 3D spatial region. Optionally, this information includes at least one of the following, but is not limited thereto: the identifier of the target 3D spatial region, the position information of the target 3D spatial region, and, when the target GPCC area adopts GPCC tile encoding, the tile identifiers in the target 3D spatial region.
It should be understood that, for the position information of the target 3D spatial region, reference may be made to the explanation above, which is not repeated here.
The update of the entry properties in the non-sequential point cloud media in this application is illustrated below in code form:
[The syntax of the updated entry property is reproduced as images PCTCN2021131037-appb-000001 and PCTCN2021131037-appb-000002 in the original publication; the semantics of its fields are given below.]
The semantics of the fields are as follows:
ItemSpatialInfoProperty represents the 3D spatial region property of a GPCC area's entry. If the entry corresponds to a geometry component, it must contain this property; if the entry corresponds to an attribute component, it may omit this 3D spatial region property.
sub_region_contained equal to 1 indicates that the 3D spatial region can be further divided internally into multiple sub-regions; when this field is 1, tile_id_present must be 1. sub_region_contained equal to 0 indicates no further sub-region division within the 3D space.
tile_id_present equal to 1 indicates that the non-time-sequential point cloud data is encoded with GPCC tiles and that the tile id corresponding to this non-time-sequential point cloud is given in this property.
inital_region_id indicates, when the current entry is the initially consumed or played entry, the ID of the spatial region initially presented within the entry's overall space. If this field is 0 or absent, the region initially presented for the entry is the overall 3D spatial region; if its value is the identifier of a sub-region, the region initially presented is the sub-region corresponding to that identifier.
3DSpatialRegionStruct represents a 3D spatial region; the first 3DSpatialRegionStruct in ItemSpatialInfoProperty indicates the 3D spatial region of the entry to which the ItemSpatialInfoProperty corresponds, and the remaining 3DSpatialRegionStructs indicate the sub-regions within that entry's 3D spatial region.
num_sub_regions indicates the number of sub-regions into which the entry's 3D spatial region is divided.
num_tiles indicates the number of tiles in the entry's 3D spatial region, or the number of tiles corresponding to one of its sub-regions.
tile_id indicates the identifier of a GPCC tile.
anchor_x, anchor_y, and anchor_z respectively represent the x, y, and z coordinates of the anchor point of the 3D spatial region or of one of its sub-regions.
region_dx, region_dy, and region_dz respectively represent the lengths of the 3D spatial region, or of one of its sub-regions, along the X, Y, and Z axes.
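Since the syntax figure is not reproduced here, the field semantics above can be sketched as a plain data structure. The class and method names below are illustrative, not part of the specification; only the field names follow the semantics just listed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SpatialRegion:
    """One 3DSpatialRegionStruct: an anchor point plus edge lengths,
    and the GPCC tile ids covering the region (when tile_id_present)."""
    region_id: int
    anchor: Tuple[float, float, float]   # anchor_x, anchor_y, anchor_z
    size: Tuple[float, float, float]     # region_dx, region_dy, region_dz
    tile_ids: List[int] = field(default_factory=list)

    def contains(self, x: float, y: float, z: float) -> bool:
        ax, ay, az = self.anchor
        dx, dy, dz = self.size
        return ax <= x <= ax + dx and ay <= y <= ay + dy and az <= z <= az + dz

@dataclass
class ItemSpatialInfo:
    """The 3D spatial region entry property of one GPCC area's entry."""
    sub_region_contained: int
    tile_id_present: int
    initial_region_id: int
    region: SpatialRegion                          # the entry's overall region
    sub_regions: List[SpatialRegion] = field(default_factory=list)

    def initial_presentation_region(self) -> SpatialRegion:
        """For an initially consumed entry: present the whole region when
        initial_region_id is 0/absent, else the matching sub-region."""
        if self.initial_region_id == 0:
            return self.region
        for sr in self.sub_regions:
            if sr.region_id == self.initial_region_id:
                return sr
        return self.region
```

With the F0 example further below (overall region 100, sub-regions 1001-1003, initial region 1001), `initial_presentation_region()` would select sub-region 1001, whose single tile can then be decoded independently.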
In summary, in this application a 3D spatial region can be divided into multiple sub-regions; combined with the independent encoding and decoding of GPCC tiles, this lets users decode and present non-sequential point cloud media more efficiently and with lower latency.
As described above, the video production device can encapsulate the entries of at least one GPCC area to generate at least one non-sequential point cloud media of the static object. If the at least one GPCC area has a single entry, that entry is encapsulated as one non-sequential point cloud media; if there are N entries, the N entries are encapsulated into M non-sequential point cloud media, where N is an integer greater than 1, M is an integer, and M ranges over [1, N]. For example, if there are N entries, they can be encapsulated into one non-sequential point cloud media, in which case that one media includes the N entries, or into N non-sequential point cloud media, in which case each media includes one entry.
The fields in a second non-sequential point cloud media are described below, where the second non-sequential point cloud media is any one of the at least one non-sequential point cloud media that includes entries of multiple GPCC areas.
Optionally, the second non-sequential point cloud media includes:
a GPCC entry group box (GPCCItemGroupBox), which is used to associate the entries of the multiple GPCC areas, as shown in FIGS. 4A and 4B.
Optionally, the GPCC entry group box includes the identifiers of the entries of the multiple GPCC areas.
Optionally, the GPCC entry group box includes a fourth identifier (initial_item_ID), which is the identifier, among the entries of the multiple GPCC areas, of the entry initially presented by the video playback device.
Optionally, the GPCC entry group box includes a fifth identifier (partial_item_flag). If the fifth identifier takes the third value, it indicates that the entries of the multiple GPCC areas constitute a complete GPCC frame of the static object; if it takes the fourth value, it indicates that they constitute only part of a GPCC frame of the static object.
Optionally, the third value may be 0 and the fourth value may be 1, but they are not limited thereto.
Optionally, the GPCC entry group box includes the position information of the GPCC area formed by the multiple GPCC areas.
Exemplarily, if the multiple GPCC areas are two areas R1 and R2, the GPCC entry group box includes the position information of the area R1+R2.
The fields in the above GPCC entry group box are described below in code form:
[The syntax of the GPCCItemGroupBox is reproduced as image PCTCN2021131037-appb-000003 in the original publication; the semantics of its fields are given below.]
The entries contained in a GPCCItemGroupBox belong to the same static object and are related to one another during presentation and consumption. All entries contained in the GPCCItemGroupBox may constitute a complete GPCC frame, or only part of one.
initial_item_ID indicates the identifier of the entry to be initially consumed within an entry group.
Note that initial_item_ID is valid only when the current entry group is the first one requested by the user. For example, if the same static object corresponds to two point cloud media, F1 and F2, then when the user first requests F1, the initial_item_ID in F1's entry group is valid; for the subsequently requested F2, its internal initial_item_ID is invalid.
partial_item_flag equal to 0 indicates that all entries contained in the GPCCItemGroupBox, together with their associated entries, constitute a complete GPCC frame; a value of 1 indicates that they constitute only part of a GPCC frame.
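The consumption rules just described can be sketched as follows; the dictionary keys mirror the field names above, while the function names are illustrative.

```python
from typing import Optional

def first_item_to_consume(group: dict, first_request_for_object: bool) -> Optional[int]:
    """initial_item_ID is honored only on the user's first request for this
    static object; for later files of the same object it is ignored and
    presentation follows the user's viewing region instead."""
    if first_request_for_object:
        return group["initial_item_ID"]
    return None

def is_complete_frame(group: dict) -> bool:
    """partial_item_flag == 0: the grouped entries and their associated
    entries form a complete GPCC frame; 1: only part of a frame."""
    return group["partial_item_flag"] == 0
```

In the F1/F2 example further below, F1's group (initial_item_ID = 101, partial_item_flag = 1) would start consumption at item1 on the first request, while F2, requested later for the same object, would be presented by viewing region.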
To support the technique proposed in this application, the corresponding signaling messages also need to be extended. Taking MPD signaling as an example, the extension is as follows:
A GPCC entry descriptor is used to describe the elements and attributes related to GPCC entries; the descriptor is a SupplementalProperty element.
Its @schemeIdUri attribute equals "urn:mpeg:mpegI:gpcc:2020:gpsr". The descriptor may reside at the Adaptation Set level or the Representation level.
Here, a Representation in DASH is a combination of one or more media components; for example, a video file of a certain resolution can be regarded as a Representation.
Adaptation Sets: in DASH, a collection of one or more video streams; one Adaptation Set may contain multiple Representations.
Table 1: GPCC entry descriptor elements and attributes
[Table 1 is reproduced as images PCTCN2021131037-appb-000004 through PCTCN2021131037-appb-000006 in the original publication.]
In summary, in this application the video production device can flexibly combine the entries of multiple GPCC areas into different non-sequential point cloud media, which may constitute a complete GPCC frame or only part of one, thereby improving the flexibility of video production. Further, when one non-sequential point cloud media includes entries of multiple GPCC areas, the video production device can also indicate the initially presented entry.
The embodiment corresponding to FIG. 3 above is illustrated below by the following example:
Suppose the video production device acquires non-time-sequential point cloud data of a static object, and four versions of point cloud media exist for it on the production side: F0, corresponding to all of the non-time-sequential point cloud data, and F1–F3, corresponding to parts of it, where F1–F3 correspond to 3D spatial regions R1–R3 respectively. The encapsulated contents of the point cloud media F0–F3 are as follows:
F0:ObjectInfoProperty:object_ID=10;
ItemSpatialInfoProperty:sub_region_contained=1;tile_id_present=1
inital_region_id=1001;
R0:3d_region_id=100,anchor=(0,0,0),region=(200,200,200)
num_sub_regions=3;
SR1:3d_region_id=1001,anchor=(0,0,0),region=(100,100,200);
num_tiles=1,tile_id[]=(1);
SR2:3d_region_id=1002,anchor=(100,0,0),region=(100,100,200);
num_tiles=1,tile_id[]=(2);
SR3:3d_region_id=1003,anchor=(0,100,0),region=(200,100,200);
num_tiles=2,tile_id[]=(3,4);
F1:ObjectInfoProperty:object_ID=10;
ItemSpatialInfoProperty:sub_region_contained=0;tile_id_present=1;
inital_region_id=0;
R1:3d_region_id=101,anchor=(0,0,0),region=(100,100,200);
num_tiles=1,tile_id[]=(1);
F2:ObjectInfoProperty:object_ID=10;
ItemSpatialInfoProperty:sub_region_contained=0;tile_id_present=1
inital_region_id=0;
R2:3d_region_id=102,anchor=(100,0,0),region=(100,100,200);
num_tiles=1,tile_id[]=(2);
F3:ObjectInfoProperty:object_ID=10;
ItemSpatialInfoProperty:sub_region_contained=0;tile_id_present=1;inital_region_id=0;
R3:3d_region_id=103,anchor=(0,100,0),region=(200,100,200);
num_tiles=2,tile_id[]=(3,4);
Further, the video production device sends the MPD signaling of F0–F3 to the user; the Object_ID, spatial-region, sub-region, and tile identifier information therein is the same as in the file encapsulation and is not repeated here.
Because user U1 has good network conditions and low data transmission latency, U1 can request F0; user U2 has poorer network conditions and higher latency, so U2 can request F1.
The video production device transmits F0 to U1's video playback device and F1 to U2's video playback device.
After U1's video playback device receives F0, the initial viewing area is region SR1, corresponding to tile ID 1. When decoding for consumption, U1 can decode tile '1' from the overall bitstream on its own and present it directly, rather than decoding the whole file first, which improves decoding efficiency and shortens the time needed for rendering. When U1 continues consuming and views region SR2, corresponding to tile ID 2, the part of the overall bitstream corresponding to tile '2' is decoded directly for presentation and consumption.
After U2's video playback device receives F1, it decodes F1 for consumption and, based on the region the user may consume next and the information in the MPD file (i.e., the Object_ID and spatial-region information), requests F2 or F3 in advance for caching.
That is, after the video production device sends the first non-sequential point cloud media to the video playback device, the playback device can, based on the user's consumption needs and likely viewing regions, again make a targeted request to the production device for non-sequential point cloud media of the same static object.
In one possible implementation, after S307 (sending the first non-sequential point cloud media to the video playback device according to the first request message), the method further includes:
receiving a second request message sent by the video playback device based on the identifier of the static object, the second request message requesting a third non-sequential point cloud media among the at least one non-sequential point cloud media; and sending the third non-sequential point cloud media to the video playback device according to the second request message.
Because the video playback device holds the identifier of the static object from the previously acquired point cloud media, when it needs to acquire other point cloud media corresponding to that static object again, it can make multiple targeted requests for point cloud media of the same static object based on that identifier.
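The targeted re-request the playback device performs can be sketched as a filter over the MPD-advertised media. The entry dictionaries and their keys below are hypothetical stand-ins for the descriptor attributes (Object_ID and spatial-region information), not the normative MPD schema.

```python
def media_for_object(mpd_entries: list, object_id: int) -> list:
    """Keep only the media belonging to one static object, so repeated
    requests stay targeted at that object."""
    return [e for e in mpd_entries if e["object_ID"] == object_id]

def media_covering_region(entries: list, region_id: int) -> list:
    """Among one object's media, pick those covering the region the user
    is about to view (e.g. prefetching F2 while still consuming F1)."""
    return [e for e in entries if region_id in e["region_ids"]]
```

A client consuming F1 of object 10 could thus prefetch exactly the media of the same object that covers the next viewing region, instead of requesting blindly.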
Suppose the video production device acquires non-time-sequential point cloud data of a static object and two versions of point cloud media exist for it on the production side: F1 and F2, where F1 contains item1–item2 and F2 contains item3–item4.
The encapsulated contents of the point cloud media F1 and F2 are as follows:
F1:
item1:ObjectInfoProperty:object_ID=10;item_ID=101
ItemSpatialInfoProperty:sub_region_contained=0;tile_id_present=1
inital_region_id=0;
R1:3d_region_id=1001,anchor=(0,0,0),region=(100,100,200);
num_tiles=1,tile_id[]=(1);
item2:ObjectInfoProperty:object_ID=10;item_ID=102
ItemSpatialInfoProperty:sub_region_contained=0;tile_id_present=1
inital_region_id=0;
R2:3d_region_id=1002,anchor=(100,0,0),region=(100,100,200);
num_tiles=1,tile_id[]=(2);
GPCCItemGroupBox:
initial_item_ID=101;partial_item_flag=1;
R1+R2:3d_region_id=0001,anchor=(0,0,0),region=(200,100,200);
F2:
item3:ObjectInfoProperty:object_ID=10;item_ID=103
ItemSpatialInfoProperty:sub_region_contained=0;tile_id_present=1
inital_region_id=0;
R3:3d_region_id=1003,anchor=(0,100,0),region=(100,100,200);
num_tiles=1,tile_id[]=(3);
item4:ObjectInfoProperty:object_ID=10;item_ID=104
ItemSpatialInfoProperty:sub_region_contained=0;tile_id_present=1
inital_region_id=0;
R4:3d_region_id=1004,anchor=(100,100,0),region=(100,100,200);
num_tiles=1,tile_id[]=(4);
GPCCItemGroupBox:
initial_item_ID=103;partial_item_flag=1;
R3+R4:3d_region_id=0002,anchor=(0,100,0),region=(200,100,200);
The video production device sends the MPD signaling of F1–F2 to the user; the Object_ID, spatial-region, and tile ID information therein is the same as in the point cloud media encapsulation and is not repeated here.
User U1 requests F1 for consumption; user U2 requests F2 for consumption.
The video production device transmits F1 to U1's video playback device and F2 to U2's video playback device.
After U1's video playback device receives F1, it initially views item1, whose initial viewing region is item1's overall viewing space, so U1 consumes item1 as a whole. Since F1 contains item1 and item2, corresponding to tile1 and tile2 respectively, U1 can directly decode the partial bitstream corresponding to tile1 for presentation while consuming item1. If U1 continues consuming and views the item2 region, corresponding to tile ID 2, the part of the overall bitstream corresponding to tile '2' is decoded directly for presentation and consumption. If U1 continues consuming and needs to view the region corresponding to item3, F2 is requested according to the MPD file. After F2 is received, presentation and consumption proceed directly according to the region the user is viewing, without considering the initial-consumption item information and initial viewing-region information in F2.
After U2's video playback device receives F2, it initially views item3, whose initial viewing region is item3's overall viewing space, so U2 consumes item3 as a whole. Since F2 contains item3 and item4, corresponding to tile3 and tile4 respectively, U2 can directly decode the partial bitstream corresponding to tile3 for presentation while consuming item3.
图5为本申请实施例提供的一种非时序点云媒体的处理装置500的示意图,该装置500包括:处理单元510和通信单元520。处理单元510用于:获取静态物体的非时序点云数据。通过GPCC编码方式对非时序点云数据进行处理,得到GPCC比特流。对GPCC比特流进行封装,生成至少一个GPCC区域的条目,所述GPCC区域的条目用于表示所述GPCC区域对应的三维3D空间区域的GPCC成分。对至少一个GPCC区域的条目进行封装,生成静态物体的至少一个非时序点云媒体,所述非时序点云媒体包括所述静态物体的标识。向视频播放设备发送至少一个非时序点云媒体的MPD信令。通信单元520用于:接收视频播放设备根据所述MPD信令发送的第一请求消息,所述第一请求消息用于请求所述至少一个非时序点云媒体中的第一非时序点云媒体。根据第一请求消息,向视频播放设备发送第一非时序点云媒体。
Optionally, an item of a target GPCC region includes a 3D spatial region item property, and the 3D spatial region item property includes a first identifier and a second identifier. The target GPCC region is one of the at least one GPCC region. The first identifier indicates whether the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions. The second identifier indicates whether the target GPCC region is encoded with GPCC tiles.
Optionally, if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further includes information about each of the plurality of spatial sub-regions and information about the target 3D spatial region.
Optionally, for any one of the plurality of spatial sub-regions, the information about the spatial sub-region includes at least one of the following: an identifier of the spatial sub-region, position information of the spatial sub-region, and, when the target GPCC region is encoded with GPCC tiles, the tile identifiers in the spatial sub-region. The information about the target 3D spatial region includes at least one of the following: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and the number of spatial sub-regions included in the target 3D spatial region.
Optionally, if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further includes a third identifier. When the value of the third identifier is a first value or is empty, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents, among the target 3D spatial region and its spatial sub-regions, is the target 3D spatial region. When the value of the third identifier is a second value, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents is the spatial sub-region of the target 3D spatial region corresponding to the second value.
Optionally, if the target 3D spatial region corresponding to the target GPCC region is not divided into a plurality of spatial sub-regions, the 3D spatial region item property further includes information about the target 3D spatial region.
Optionally, the information about the target 3D spatial region includes at least one of the following: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and, when the target GPCC region is encoded with GPCC tiles, the tile identifiers in the target 3D spatial region.
Optionally, the processing unit 510 is specifically configured to: if there is one GPCC region item, encapsulate the one GPCC region item into one non-sequential point cloud media; and if there are N GPCC region items, encapsulate the N GPCC region items into M non-sequential point cloud media, where N is an integer greater than 1, 1≤M≤N, and M is an integer.
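The N-items-into-M-media rule above can be sketched as a simple chunking step. The grouping policy used here (a fixed number of items per file) is a hypothetical choice for illustration; the text only requires 1≤M≤N.

```python
def encapsulate(items, items_per_file):
    """Split N GPCC region items into M non-sequential point cloud media files.

    A single item always yields a single media file; otherwise items are
    chunked, giving M = ceil(N / items_per_file) files with 1 <= M <= N.
    """
    if len(items) <= 1:
        return [items]
    return [items[i:i + items_per_file]
            for i in range(0, len(items), items_per_file)]

# Four region items packed two per file, matching the F1/F2 example layout.
files = encapsulate([101, 102, 103, 104], items_per_file=2)
print(files)  # -> [[101, 102], [103, 104]]
```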
Optionally, a second non-sequential point cloud media includes a GPCC item group box, where the second non-sequential point cloud media is any one of the at least one non-sequential point cloud media that includes a plurality of GPCC region items, and the GPCC item group box is used to associate the plurality of GPCC region items.
Optionally, the GPCC item group box includes a fourth identifier, where the fourth identifier is the identifier of the item, among the plurality of GPCC region items, that is initially presented on the video playback device.
Optionally, the GPCC item group box includes a fifth identifier. If the value of the fifth identifier is a third value, it indicates that the plurality of GPCC region items constitute a complete GPCC frame of the static object. If the value of the fifth identifier is a fourth value, it indicates that the plurality of GPCC region items constitute a partial GPCC frame of the static object.
Optionally, the GPCC item group box includes position information of the GPCC region formed by the plurality of GPCC regions.
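A minimal sketch of how a player might interpret these group-box fields follows. The class name mirrors GPCCItemGroupBox from the example; the concrete encodings of the "third value" and "fourth value" are assumptions for illustration, not defined by this text.

```python
# Illustrative encodings for the fifth identifier (assumed, not normative).
COMPLETE_FRAME = 0   # "third value": items form the complete GPCC frame
PARTIAL_FRAME = 1    # "fourth value": items form only part of the frame

class GPCCItemGroupBox:
    """Toy model of the group box associating multiple GPCC region items."""

    def __init__(self, initial_item_id, partial_item_flag, union_anchor, union_size):
        self.initial_item_id = initial_item_id    # fourth identifier: item shown first
        self.partial_item_flag = partial_item_flag  # fifth identifier
        self.union_anchor = union_anchor          # position of the combined region
        self.union_size = union_size

    def is_complete_frame(self):
        return self.partial_item_flag == COMPLETE_FRAME

# F1's group box from the example: initial item 101, partial frame, union region.
box = GPCCItemGroupBox(101, PARTIAL_FRAME, (0, 0, 0), (200, 100, 200))
print(box.is_complete_frame())  # -> False: F1 holds only part of the GPCC frame
```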
Optionally, the communication unit 520 is further configured to: receive a second request message sent by the video playback device based on the identifier of the static object, where the second request message is used to request a third non-sequential point cloud media among the at least one non-sequential point cloud media; and send the third non-sequential point cloud media to the video playback device according to the second request message.
It should be understood that the apparatus embodiments and the method embodiments correspond to each other, and similar descriptions may refer to the method embodiments; they are not repeated here to avoid redundancy. Specifically, the apparatus 500 shown in FIG. 5 can perform the method embodiments corresponding to the video production device, and the foregoing and other operations and/or functions of the modules in the apparatus 500 respectively implement those method embodiments; for brevity, details are not repeated here.
The apparatus 500 of the embodiments of this application has been described above from the perspective of functional modules with reference to the accompanying drawings. It should be understood that the functional modules may be implemented in hardware, by instructions in software, or by a combination of hardware and software modules. Specifically, the steps of the method embodiments in this application may be completed by integrated logic circuits of hardware in a processor and/or by instructions in software; the steps of the methods disclosed in the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. Optionally, the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads information from the memory and completes the steps of the foregoing method embodiments in combination with its hardware.
FIG. 6 is a schematic diagram of an apparatus 600 for processing non-sequential point cloud media according to an embodiment of this application. The apparatus 600 includes a processing unit 610 and a communication unit 620. The communication unit 620 is configured to: receive MPD signaling of at least one non-sequential point cloud media, where the non-sequential point cloud media includes an identifier of a static object; send a first request message to a video production device according to the MPD signaling, where the first request message is used to request a first non-sequential point cloud media among the at least one non-sequential point cloud media; and receive the first non-sequential point cloud media from the video production device. The processing unit 610 is configured to play the first non-sequential point cloud media. The at least one non-sequential point cloud media is generated by encapsulating at least one point cloud compression (GPCC) region item, the at least one GPCC region item is generated by encapsulating a GPCC bitstream, and the GPCC bitstream is obtained by processing the non-sequential point cloud data of the static object through GPCC encoding. For any GPCC region item among the at least one GPCC region item, the GPCC region item represents the GPCC components of the 3D spatial region corresponding to the GPCC region.
Optionally, an item of a target GPCC region includes a 3D spatial region item property, and the 3D spatial region item property includes a first identifier and a second identifier. The target GPCC region is one of the at least one GPCC region. The first identifier indicates whether the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions. The second identifier indicates whether the target GPCC region is encoded with GPCC tiles.
Optionally, if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further includes information about each of the plurality of spatial sub-regions and information about the target 3D spatial region.
Optionally, for any one of the plurality of spatial sub-regions, the information about the spatial sub-region includes at least one of the following: an identifier of the spatial sub-region, position information of the spatial sub-region, and, when the target GPCC region is encoded with GPCC tiles, the tile identifiers in the spatial sub-region. The information about the target 3D spatial region includes at least one of the following: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and the number of spatial sub-regions included in the target 3D spatial region.
Optionally, if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further includes a third identifier. When the value of the third identifier is a first value or is empty, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents, among the target 3D spatial region and its spatial sub-regions, is the target 3D spatial region. When the value of the third identifier is a second value, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents is the spatial sub-region of the target 3D spatial region corresponding to the second value.
Optionally, if the target 3D spatial region corresponding to the target GPCC region is not divided into a plurality of spatial sub-regions, the 3D spatial region item property further includes information about the target 3D spatial region.
Optionally, the information about the target 3D spatial region includes at least one of the following: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and, when the target GPCC region is encoded with GPCC tiles, the tile identifiers in the target 3D spatial region.
Optionally, if there is one GPCC region item, the one GPCC region item is encapsulated into one non-sequential point cloud media; if there are N GPCC region items, the N GPCC region items are encapsulated into M non-sequential point cloud media, where N is an integer greater than 1, 1≤M≤N, and M is an integer.
Optionally, a second non-sequential point cloud media includes a GPCC item group box, where the second non-sequential point cloud media is any one of the at least one non-sequential point cloud media that includes a plurality of GPCC region items, and the GPCC item group box is used to associate the plurality of GPCC region items.
Optionally, the GPCC item group box includes a fourth identifier, where the fourth identifier is the identifier of the item, among the plurality of GPCC region items, that is initially presented on the video playback device.
Optionally, the GPCC item group box includes a fifth identifier. If the value of the fifth identifier is a third value, it indicates that the plurality of GPCC region items constitute a complete GPCC frame of the static object. If the value of the fifth identifier is a fourth value, it indicates that the plurality of GPCC region items constitute a partial GPCC frame of the static object.
Optionally, the GPCC item group box includes position information of the GPCC region formed by the plurality of GPCC regions.
Optionally, the communication unit 620 is further configured to send a second request message to the video production device according to the MPD signaling, and to receive the second non-sequential point cloud media.
Optionally, the processing unit 610 is further configured to play the second non-sequential point cloud media.
It should be understood that the apparatus embodiments and the method embodiments correspond to each other, and similar descriptions may refer to the method embodiments; they are not repeated here to avoid redundancy. Specifically, the apparatus 600 shown in FIG. 6 can perform the method embodiments corresponding to the video playback device, and the foregoing and other operations and/or functions of the modules in the apparatus 600 respectively implement those method embodiments; for brevity, details are not repeated here.
The apparatus 600 of the embodiments of this application has been described above from the perspective of functional modules with reference to the accompanying drawings. It should be understood that the functional modules may be implemented in hardware, by instructions in software, or by a combination of hardware and software modules. Specifically, the steps of the method embodiments in this application may be completed by integrated logic circuits of hardware in a processor and/or by instructions in software; the steps of the methods disclosed in the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. Optionally, the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads information from the memory and completes the steps of the foregoing method embodiments in combination with its hardware.
Embodiment 8
FIG. 7 is a schematic block diagram of a video production device 700 according to an embodiment of this application.
As shown in FIG. 7, the video production device 700 may include:
a memory 710 and a processor 720, where the memory 710 is configured to store a computer program and transmit the program code to the processor 720. In other words, the processor 720 can call and run the computer program from the memory 710 to implement the methods in the embodiments of this application.
For example, the processor 720 may be configured to perform the foregoing method embodiments according to instructions in the computer program.
In some embodiments of this application, the processor 720 may include, but is not limited to:
a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.
In some embodiments of this application, the memory 710 includes, but is not limited to:
volatile memory and/or non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
In some embodiments of this application, the computer program may be divided into one or more modules, which are stored in the memory 710 and executed by the processor 720 to complete the methods provided in this application. The one or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments describe the execution process of the computer program in the video production device.
As shown in FIG. 7, the video production device may further include:
a transceiver 730, which may be connected to the processor 720 or the memory 710.
The processor 720 can control the transceiver 730 to communicate with other devices; specifically, it can send information or data to other devices, or receive information or data sent by other devices. The transceiver 730 may include a transmitter and a receiver, and may further include one or more antennas.
It should be understood that the components in the video production device are connected through a bus system, which includes, in addition to a data bus, a power bus, a control bus, and a status signal bus.
Embodiment 9
FIG. 8 is a schematic block diagram of a video playback device 800 according to an embodiment of this application.
As shown in FIG. 8, the video playback device 800 may include:
a memory 810 and a processor 820, where the memory 810 is configured to store a computer program and transmit the program code to the processor 820. In other words, the processor 820 can call and run the computer program from the memory 810 to implement the methods in the embodiments of this application.
For example, the processor 820 may be configured to perform the foregoing method embodiments according to instructions in the computer program.
In some embodiments of this application, the processor 820 may include, but is not limited to:
a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.
In some embodiments of this application, the memory 810 includes, but is not limited to:
volatile memory and/or non-volatile memory. The non-volatile memory may be a ROM, a PROM, an EPROM, an EEPROM, or a flash memory. The volatile memory may be a RAM, which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as SRAM, DRAM, SDRAM, DDR SDRAM, ESDRAM, SLDRAM, and DR RAM.
In some embodiments of this application, the computer program may be divided into one or more modules, which are stored in the memory 810 and executed by the processor 820 to complete the methods provided in this application. The one or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments describe the execution process of the computer program in the video playback device.
As shown in FIG. 8, the video playback device may further include:
a transceiver 830, which may be connected to the processor 820 or the memory 810.
The processor 820 can control the transceiver 830 to communicate with other devices; specifically, it can send information or data to other devices, or receive information or data sent by other devices. The transceiver 830 may include a transmitter and a receiver, and may further include one or more antennas.
It should be understood that the components in the video playback device are connected through a bus system, which includes, in addition to a data bus, a power bus, a control bus, and a status signal bus.
This application further provides a computer storage medium storing a computer program that, when executed by a computer, enables the computer to perform the methods of the foregoing method embodiments. In other words, the embodiments of this application further provide a computer program product containing instructions that, when executed by a computer, cause the computer to perform the methods of the foregoing method embodiments.
When implemented in software, the methods may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
A person of ordinary skill in the art will appreciate that the modules and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of modules is merely a logical functional division, and there may be other divisions in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. For example, the functional modules in the embodiments of this application may be integrated into one processing module, or each module may exist physically alone, or two or more modules may be integrated into one module.
The foregoing is merely the specific implementation of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (32)

  1. A method for processing non-sequential point cloud media, the method being performed by a video production device and comprising:
    acquiring non-sequential point cloud data of a static object;
    processing the non-sequential point cloud data through geometry-based point cloud compression (GPCC) encoding to obtain a GPCC bitstream;
    encapsulating the GPCC bitstream to generate at least one GPCC region item, the GPCC region item representing GPCC components of a three-dimensional (3D) spatial region corresponding to the GPCC region;
    encapsulating the at least one GPCC region item to generate at least one non-sequential point cloud media of the static object, the non-sequential point cloud media comprising an identifier of the static object;
    sending media presentation description (MPD) signaling of the at least one non-sequential point cloud media to a video playback device;
    receiving a first request message sent by the video playback device according to the MPD signaling, the first request message being used to request a first non-sequential point cloud media among the at least one non-sequential point cloud media; and
    sending the first non-sequential point cloud media to the video playback device according to the first request message.
  2. The method according to claim 1, wherein an item of a target GPCC region comprises a 3D spatial region item property, and the 3D spatial region item property comprises a first identifier and a second identifier;
    wherein the target GPCC region is one of the at least one GPCC region; the first identifier indicates whether a target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions; and the second identifier indicates whether the target GPCC region is encoded with GPCC tiles.
  3. The method according to claim 2, wherein if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further comprises information about each of the plurality of spatial sub-regions and information about the target 3D spatial region.
  4. The method according to claim 3, wherein for any one of the plurality of spatial sub-regions, the information about the spatial sub-region comprises at least one of: an identifier of the spatial sub-region, position information of the spatial sub-region, and, when the target GPCC region is encoded with GPCC tiles, tile identifiers in the spatial sub-region;
    and the information about the target 3D spatial region comprises at least one of: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and the number of spatial sub-regions comprised in the target 3D spatial region.
  5. The method according to any one of claims 2 to 4, wherein if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further comprises a third identifier;
    when the value of the third identifier is a first value or is empty, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents, among the target 3D spatial region and its spatial sub-regions, is the target 3D spatial region;
    when the value of the third identifier is a second value, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents is the spatial sub-region of the target 3D spatial region corresponding to the second value.
  6. The method according to claim 2, wherein if the target 3D spatial region corresponding to the target GPCC region is not divided into a plurality of spatial sub-regions, the 3D spatial region item property further comprises information about the target 3D spatial region.
  7. The method according to claim 6, wherein the information about the target 3D spatial region comprises at least one of: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and, when the target GPCC region is encoded with GPCC tiles, tile identifiers in the target 3D spatial region.
  8. The method according to any one of claims 1 to 4, wherein the encapsulating the at least one GPCC region item to generate at least one non-sequential point cloud media of the static object comprises:
    if there is one GPCC region item, encapsulating the one GPCC region item into one non-sequential point cloud media; and
    if there are N GPCC region items, encapsulating the N GPCC region items into M non-sequential point cloud media,
    wherein N is an integer greater than 1, 1≤M≤N, and M is an integer.
  9. The method according to any one of claims 1 to 4, wherein a second non-sequential point cloud media comprises a GPCC item group box;
    wherein the second non-sequential point cloud media is any one of the at least one non-sequential point cloud media that comprises a plurality of GPCC region items, and the GPCC item group box is used to associate the plurality of GPCC region items.
  10. The method according to claim 9, wherein the GPCC item group box comprises a fourth identifier;
    wherein the fourth identifier is the identifier of the item, among the plurality of GPCC region items, that is initially presented on the video playback device.
  11. The method according to claim 9, wherein the GPCC item group box comprises a fifth identifier;
    if the value of the fifth identifier is a third value, it indicates that the plurality of GPCC region items constitute a complete GPCC frame of the static object; and
    if the value of the fifth identifier is a fourth value, it indicates that the plurality of GPCC region items constitute a partial GPCC frame of the static object.
  12. The method according to claim 9, wherein the GPCC item group box comprises position information of the GPCC region formed by the plurality of GPCC regions.
  13. The method according to any one of claims 1 to 4, wherein after the sending the first non-sequential point cloud media to the video playback device according to the first request message, the method further comprises:
    receiving a second request message sent by the video playback device based on the identifier of the static object, the second request message being used to request a third non-sequential point cloud media among the at least one non-sequential point cloud media; and
    sending the third non-sequential point cloud media to the video playback device according to the second request message.
  14. A method for processing non-sequential point cloud media, the method being performed by a video playback device and comprising:
    receiving MPD signaling of at least one non-sequential point cloud media, the non-sequential point cloud media comprising an identifier of a static object;
    sending a first request message to a video production device according to the MPD signaling, the first request message being used to request a first non-sequential point cloud media among the at least one non-sequential point cloud media;
    receiving the first non-sequential point cloud media from the video production device; and
    playing the first non-sequential point cloud media;
    wherein the at least one non-sequential point cloud media is generated by encapsulating at least one point cloud compression (GPCC) region item, the at least one GPCC region item is generated by encapsulating a GPCC bitstream, and the GPCC bitstream is obtained by processing non-sequential point cloud data of the static object through GPCC encoding;
    and for any GPCC region item among the at least one GPCC region item, the GPCC region item represents GPCC components of the 3D spatial region corresponding to the GPCC region.
  15. The method according to claim 14, wherein an item of a target GPCC region comprises a 3D spatial region item property, and the 3D spatial region item property comprises a first identifier and a second identifier;
    wherein the target GPCC region is one of the at least one GPCC region; the first identifier indicates whether a target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions; and the second identifier indicates whether the target GPCC region is encoded with GPCC tiles.
  16. The method according to claim 15, wherein if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further comprises information about each of the plurality of spatial sub-regions and information about the target 3D spatial region.
  17. The method according to claim 16, wherein for any one of the plurality of spatial sub-regions, the information about the spatial sub-region comprises at least one of: an identifier of the spatial sub-region, position information of the spatial sub-region, and, when the target GPCC region is encoded with GPCC tiles, tile identifiers in the spatial sub-region;
    and the information about the target 3D spatial region comprises at least one of: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and the number of spatial sub-regions comprised in the target 3D spatial region.
  18. The method according to any one of claims 15 to 17, wherein if the target 3D spatial region corresponding to the target GPCC region is divided into a plurality of spatial sub-regions, the 3D spatial region item property further comprises a third identifier;
    when the value of the third identifier is a first value or is empty, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents, among the target 3D spatial region and its spatial sub-regions, is the target 3D spatial region;
    when the value of the third identifier is a second value, it indicates that, when the item corresponding to the target GPCC region is the item initially presented by the video playback device, what the video playback device initially presents is the spatial sub-region of the target 3D spatial region corresponding to the second value.
  19. The method according to claim 15, wherein if the target 3D spatial region corresponding to the target GPCC region is not divided into a plurality of spatial sub-regions, the 3D spatial region item property further comprises information about the target 3D spatial region.
  20. The method according to claim 19, wherein the information about the target 3D spatial region comprises at least one of: an identifier of the target 3D spatial region, position information of the target 3D spatial region, and, when the target GPCC region is encoded with GPCC tiles, tile identifiers in the target 3D spatial region.
  21. The method according to any one of claims 14 to 17, wherein if there is one GPCC region item, the one GPCC region item is encapsulated into one non-sequential point cloud media;
    if there are N GPCC region items, the N GPCC region items are encapsulated into M non-sequential point cloud media;
    wherein N is an integer greater than 1, 1≤M≤N, and M is an integer.
  22. The method according to any one of claims 14 to 17, wherein a second non-sequential point cloud media comprises a GPCC item group box;
    wherein the second non-sequential point cloud media is any one of the at least one non-sequential point cloud media that comprises a plurality of GPCC region items; and the GPCC item group box is used to associate the plurality of GPCC region items.
  23. The method according to claim 22, wherein the GPCC item group box comprises a fourth identifier;
    wherein the fourth identifier is the identifier of the item, among the plurality of GPCC region items, that is initially presented on the video playback device.
  24. The method according to claim 22, wherein the GPCC item group box comprises a fifth identifier;
    if the value of the fifth identifier is a third value, it indicates that the plurality of GPCC region items constitute a complete GPCC frame of the static object; and
    if the value of the fifth identifier is a fourth value, it indicates that the plurality of GPCC region items constitute a partial GPCC frame of the static object.
  25. The method according to claim 22, wherein the GPCC item group box comprises position information of the GPCC region formed by the plurality of GPCC regions.
  26. The method according to any one of claims 14 to 17, wherein after the receiving the first non-sequential point cloud media from the video production device, the method further comprises:
    sending, according to the MPD signaling and based on the identifier of the static object, a second request message to the video production device, the second request message being used to request a third non-sequential point cloud media among the at least one non-sequential point cloud media;
    receiving the third non-sequential point cloud media from the video production device; and
    playing the third non-sequential point cloud media.
  27. An apparatus for processing non-sequential point cloud media, comprising: a processing unit and a communication unit;
    the processing unit being configured to:
    acquire non-sequential point cloud data of a static object;
    process the non-sequential point cloud data through GPCC encoding to obtain a GPCC bitstream;
    encapsulate the GPCC bitstream to generate at least one GPCC region item, the GPCC region item representing GPCC components of a three-dimensional (3D) spatial region corresponding to the GPCC region;
    encapsulate the at least one GPCC region item to generate at least one non-sequential point cloud media of the static object, the non-sequential point cloud media comprising an identifier of the static object; and
    send MPD signaling of the at least one non-sequential point cloud media to a video playback device;
    the communication unit being configured to:
    receive a first request message sent by the video playback device according to the MPD signaling, the first request message being used to request a first non-sequential point cloud media among the at least one non-sequential point cloud media; and
    send the first non-sequential point cloud media to the video playback device according to the first request message.
  28. An apparatus for processing non-sequential point cloud media, comprising: a processing unit and a communication unit;
    the communication unit being configured to:
    receive MPD signaling of at least one non-sequential point cloud media, the non-sequential point cloud media comprising an identifier of a static object;
    send a first request message to a video production device according to the MPD signaling, the first request message being used to request a first non-sequential point cloud media among the at least one non-sequential point cloud media; and
    receive the first non-sequential point cloud media from the video production device;
    the processing unit being configured to play the first non-sequential point cloud media;
    wherein the at least one non-sequential point cloud media is generated by encapsulating at least one point cloud compression (GPCC) region item, the at least one GPCC region item is generated by encapsulating a GPCC bitstream, and the GPCC bitstream is obtained by processing non-sequential point cloud data of the static object through GPCC encoding;
    and for any GPCC region item among the at least one GPCC region item, the GPCC region item represents GPCC components of the 3D spatial region corresponding to the GPCC region.
  29. A video production device, comprising:
    a processor and a memory, the memory being configured to store a computer program, and the processor being configured to call and run the computer program stored in the memory to perform the method according to any one of claims 1 to 13.
  30. A video playback device, comprising:
    a processor and a memory, the memory being configured to store a computer program, and the processor being configured to call and run the computer program stored in the memory to perform the method according to any one of claims 14 to 26.
  31. A computer-readable storage medium, configured to store a computer program, the computer program causing a computer to perform the method according to any one of claims 1 to 13 or the method according to any one of claims 14 to 26.
  32. A computer program product comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 13 or the method according to any one of claims 14 to 26.
Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011347626.1A CN114549778A (zh) 2020-11-26 2020-11-26 非时序点云媒体的处理方法、装置、设备及存储介质
CN202011347626.1 2020-11-26
