WO2021209044A1 - Multimedia data transmission and reception methods, system, processor, and player - Google Patents

Multimedia data transmission and reception methods, system, processor, and player

Info

Publication number
WO2021209044A1
WO2021209044A1 PCT/CN2021/087805 CN2021087805W WO2021209044A1 WO 2021209044 A1 WO2021209044 A1 WO 2021209044A1 CN 2021087805 W CN2021087805 W CN 2021087805W WO 2021209044 A1 WO2021209044 A1 WO 2021209044A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
media
information
type
multimedia data
Prior art date
Application number
PCT/CN2021/087805
Other languages
French (fr)
Chinese (zh)
Inventor
徐异凌
王超斐
Original Assignee
上海交通大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海交通大学 (Shanghai Jiao Tong University)
Publication of WO2021209044A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/816 Monomedia components thereof involving special video data, e.g. 3D video
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments

Definitions

  • This application belongs to the field of immersive multimedia, and specifically relates to a method for sending and receiving multimedia data with multiple degrees of freedom, a multimedia data system with multiple degrees of freedom, and a media processor and player.
  • VR: virtual reality
  • HMD: head-mounted display
  • The immersive media produced by a VR system represents a virtual space in which users can interact as naturally as in the real world.
  • Virtual reality renders real-world visual and auditory stimuli and presents them to users. The user looks around from a display area in three-dimensional space and, at the same time, obtains the audio associated with the current viewport.
  • Traditional immersive media system design is mainly aimed at omnidirectional video transmission under 3DoF, where the degrees of freedom are those that the content consumer has in the media experience.
  • When experiencing 3DoF media content, the consumer has exactly three degrees of freedom, all of them head rotations: rotation about each of the three coordinate axes of a three-dimensional rectangular coordinate system whose origin is the consumer's head.
  • The media technologies used to realize this experience are a series of omnidirectional-video technologies, and the system is designed around the data being transmitted, namely the 2D image frames of traditional video. As a result, the media content that the system structure supports is relatively homogeneous.
  • The 3DoF+ media experience adds limited head displacement to the three rotational degrees of freedom; that is, consumers of immersive media content can obtain different media content by moving within a certain range.
  • The device can perceive the parallax produced by the displacement, and the system can feed back, in real time, the different media content that the parallax brings about, to match the consumer's movements.
  • 3DoF+ video is made from content acquired by multiple cameras deployed to predict user displacement.
  • the depth image scene presented by 3DoF+ media is obtained by 2D image synthesis, where the 2D image is composed of texture components and corresponding depth components. Depth information can be directly collected by camera equipment or obtained indirectly through algorithms; or, 3DoF+ view can be synthesized from a planar image of a background area and multiple foreground images (non-planar).
  • In current media content processing and data forms, 3DoF+ is mainly realized with atlas-related technology.
  • The international standardization group MPEG has already adopted atlas-related technology.
  • This type of solution uses texture components and the corresponding depth components to form an atlas for encapsulation and transmission.
  • An atlas gathers rectangular blocks from one or more 2D images into an image pair; the image pair contains a texture component image and the corresponding depth component image.
  • The images of different viewpoints taken by cameras at different angles are pruned to obtain a basic atlas containing basic image blocks and additional atlases containing supplementary image blocks.
  • The basic atlas and additional atlas 1 are used to generate field-of-view image 1.
  • The basic atlas and additional atlas 2 are used to generate field-of-view image 2.
  • the method of using the atlas can reduce the amount of data that needs to be transmitted to a certain extent on the premise of realizing the corresponding media function, and has a better reconstruction effect on the user side.
  • 6DoF is a richer immersive media experience that builds on 3DoF and 3DoF+.
  • It adds displacement along the three coordinate axes of the three-dimensional space whose origin is the consumer.
  • Traditional video media content processing can no longer meet these requirements.
  • The media content and technologies for realizing the 6DoF media experience are still at the exploration stage; the main candidates are point clouds, light fields, and the like.
  • Point cloud data content is shown as an example in Figure 2, which illustrates 6-degree-of-freedom (6DoF video) immersive media data content.
  • What is presented is the surface information of an object obtained by scanning, including three-dimensional coordinate data, depth information, color information, and so on, which forms a geometric skeleton that is then rendered as a point cloud.
  • There are point cloud compression algorithms for static and dynamic point cloud data, as well as for different kinds of point cloud data such as those intended for machine perception and for human visual perception.
  • A typical approach converts the 3D point cloud data into 2D image data and then processes it; one such algorithm is video-based point cloud compression (VPCC).
  • This compression method first projects the 3D point cloud onto a 2D plane to obtain occupancy map information, geometric information, attribute information, and auxiliary information.
  • The attribute information usually includes texture information and color information. The compressed information is therefore usually divided into four categories of data for transmission: geometric information, attribute information, occupancy map information, and auxiliary information.
  • the decoding of geometric information relies on occupancy map information and auxiliary information
  • the decoding of attribute information relies on geometric information, occupancy map information and auxiliary information.
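The decoding dependencies stated above imply an ordering constraint on the four VPCC components. The following is a minimal sketch of that constraint only, not of an actual decoder; the component names and the `decode_order` helper are illustrative assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each component maps to the components it must wait for, as stated in the text.
# The string names are illustrative labels, not identifiers from any standard.
VPCC_DECODE_DEPENDENCIES = {
    "occupancy_map": set(),
    "auxiliary": set(),
    "geometry": {"occupancy_map", "auxiliary"},
    "attribute": {"geometry", "occupancy_map", "auxiliary"},
}


def decode_order(dependencies):
    """Return one valid decoding order, with prerequisites first."""
    return list(TopologicalSorter(dependencies).static_order())


order = decode_order(VPCC_DECODE_DEPENDENCIES)
print(order)
```

Whatever order the two independent components come out in, geometry is always placed after both of them and attribute information is always last.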
  • Point cloud media needs to process different types of data simultaneously, and after integration, present media with rich spatial and texture characteristics to users. As the exploration of related technologies progresses, the system also needs to complete and update the corresponding content for the exploration of 6Dof.
  • A higher degree of freedom in the immersive media experience means more diverse information and data types. Whether the media is an atlas, a point cloud, or another form such as a light field, its information content is diverse.
  • An immersive media system framework that was originally designed to support only a single content structure cannot effectively support the storage and transmission of the new multi-degree-of-freedom media content, so new information and structures need to be designed for the new multi-degree-of-freedom media.
  • To this end, this application proposes a multi-degree-of-freedom multimedia data sending method, a receiving method, a multi-degree-of-freedom multimedia data system, and a media processor and player.
  • This application provides a method for sending multimedia data with multiple degrees of freedom, including: encapsulating the multimedia data according to an encapsulation transmission protocol.
  • The encapsulation transmission protocol includes determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams in the track where the multimedia data of the media type is located; determining the association relationships between multiple data contents in different media data; and determining the corresponding index method and index information for the attribute information.
  • The encapsulated multimedia data is then transmitted.
  • The data format of the multimedia data includes a 3DoF+ mode and/or a 6DoF mode. The encapsulation and transmission are suitable for extensions of MPEG Media Transport (MMT), Smart Media Transport (SMT), the ISO base media file format (ISOBMFF), or the Omnidirectional Media Format (OMAF).
  • the different media types of the multimedia data include any one or more of the following: traditional two-dimensional video, atlas video, dynamic point cloud, static point cloud, and light field.
  • Determining the data type of the multimedia data: when the media type is atlas video, the data types include texture data and depth data; when the media type is dynamic point cloud, the data types include texture, geometry, occupancy map, and additional information data; when the media type is static point cloud, the data types include texture, geometry, and additional information data; when the media type is light field, the data types include texture data and angle data.
  • the determining the data type of the multimedia data further includes: determining the number of data groups of the corresponding data type for each data type.
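The mapping from media type to data types, together with a per-type group count, can be sketched as a lookup table plus a small check. This is an illustrative sketch only: the dictionary keys, the `validate_group_counts` helper and its error messages are assumptions, and traditional two-dimensional video is left out because the text does not list data types for it.

```python
# Data types per media type, as listed in the text. All names are illustrative.
DATA_TYPES_BY_MEDIA_TYPE = {
    "atlas_video": ("texture", "depth"),
    "dynamic_point_cloud": ("texture", "geometry", "occupancy_map", "additional_info"),
    "static_point_cloud": ("texture", "geometry", "additional_info"),
    "light_field": ("texture", "angle"),
}


def validate_group_counts(media_type, group_counts):
    """Check that group_counts names exactly the data types of media_type
    and that every data type has at least one group."""
    expected = set(DATA_TYPES_BY_MEDIA_TYPE[media_type])
    unknown = set(group_counts) - expected
    missing = expected - set(group_counts)
    if unknown or missing:
        raise ValueError(
            f"unknown data types {sorted(unknown)}, missing data types {sorted(missing)}"
        )
    if any(count < 1 for count in group_counts.values()):
        raise ValueError("every data type needs at least one group")
    return dict(group_counts)


# A dynamic point cloud with two texture groups sharing one geometric structure.
counts = validate_group_counts(
    "dynamic_point_cloud",
    {"texture": 2, "geometry": 1, "occupancy_map": 1, "additional_info": 1},
)
print(counts)
```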
  • the corresponding relationship between the numbers of data groups of different data types includes: the same structure corresponds to the same texture; or the same structure corresponds to different textures that are mutually complementary.
  • Determining and identifying the number and location information of the media streams of the track where the multimedia data of the media type is located includes: defining the track type, which indicates whether the multimedia data of each media type is in one track or in at least two tracks. When there is one track: define the media track number where the multimedia data is located, and define the specific position of each data item of the multimedia data within the track. When there are at least two tracks: define the media track number of each data item contained in the multimedia data, and define the specific position of each data item within its track.
  • the association relationship includes: mutual dependence between data contents and/or single dependence between data contents and/or mutual replacement between data contents.
  • The interdependent association relationship includes: the texture and depth data in the atlas are interdependent; the geometry, occupancy map, and additional information in the point cloud are interdependent and jointly construct the point cloud geometric skeleton.
  • The single-dependence association relationship includes: the texture data in the point cloud relies on the geometric skeleton jointly constructed from geometry, occupancy map, and additional information; the additional atlas relies on the basic atlas. The mutual replacement relationship includes: for the same point cloud geometric skeleton, different texture data can be matched and used in place of one another.
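The three kinds of association can be sketched as a small data model. The sketch below is an assumption-laden illustration, not the encapsulation syntax of the application: `Relation`, `Association`, `required_for`, and the labels `texture_a` and `texture_b` are all hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    MUTUAL_DEPENDENCE = "mutual_dependence"
    SINGLE_DEPENDENCE = "single_dependence"
    MUTUAL_REPLACEMENT = "mutual_replacement"


@dataclass(frozen=True)
class Association:
    relation: Relation
    sources: tuple  # data contents that hold the relation
    targets: tuple  # data contents they depend on, or can be replaced by


SKELETON = ("geometry", "occupancy_map", "additional_info")

# Point cloud relations as described in the text.
POINT_CLOUD_ASSOCIATIONS = [
    Association(Relation.MUTUAL_DEPENDENCE, SKELETON, SKELETON),
    Association(Relation.SINGLE_DEPENDENCE, ("texture",), SKELETON),
    Association(Relation.MUTUAL_REPLACEMENT, ("texture_a",), ("texture_b",)),
]


def required_for(content, associations):
    """Return the data contents that must be available before content is usable."""
    required = set()
    for association in associations:
        if association.relation is Relation.MUTUAL_REPLACEMENT:
            continue  # a replacement is an alternative, not a prerequisite
        if content in association.sources:
            required.update(t for t in association.targets if t != content)
    return required


print(sorted(required_for("texture", POINT_CLOUD_ASSOCIATIONS)))
print(sorted(required_for("geometry", POINT_CLOUD_ASSOCIATIONS)))
```

Keeping replacement separate from dependence lets a receiver decide which texture to fetch without treating the alternatives as prerequisites.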
  • the index information includes a collection of the above-mentioned attribute information, and the attribute information is respectively placed at different levels of the encapsulation transmission protocol to describe, or defines an index that includes all attribute information of the media.
  • the data stream of the targeted multimedia data includes any one or more of the following: outer layer information, description indication information, and data content information.
  • The outer layer information is used to define the file type and content compatibility of the multimedia data; the description indication information is used to describe and indicate the multimedia data; the data content information carries the specific content of the multimedia data.
  • This application also provides a method for receiving multimedia data with multiple degrees of freedom, including: receiving the encapsulated multimedia data, parsing the multimedia data according to the encapsulation transmission protocol (the inverse of the sending method), and processing the data according to the parsed content.
  • the receiving method includes:
  • S1: Receive the media content data of the multimedia data, parse it according to the encapsulation transmission protocol, and obtain the description indication information of the multimedia data;
  • S2: Determine the media content type according to the description indication information;
  • S3: According to the media content type, parse and obtain the description information of the corresponding number of data groups and/or the description information of the media data type and/or the description information of the track type;
  • S4: Obtain the description information of the media data type, then parse and obtain the description information about the association relationships between different data types;
  • S5 Based on different media data types Descriptive information and the number of data groups. Descriptive information.
  • S6 Completely obtain the index information corresponding to each data type in the analysis information according to the number of data groups of different types of data, and obtain according to S3
  • this application also provides a media processor, including:
  • a storage module, a receiving module, an analysis module, and a data processing module, which are used to receive multimedia data and to parse and process it according to the encapsulation transmission protocol;
  • the encapsulation transmission protocol includes:
  • determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams of the track where the multimedia data of the media type is located; determining the association relationships between multiple data contents in different media data; and determining the corresponding index method and index information for the attribute information.
  • this application also provides a player, including:
  • a storage module, a receiving module, an analysis module, and a data processing module, which are used to receive multimedia data and to parse and process it according to the encapsulation transmission protocol;
  • the encapsulation transmission protocol includes:
  • determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams of the track where the multimedia data of the media type is located; determining the association relationships between multiple data contents in different media data; and determining the corresponding index method and index information for the attribute information.
  • This application addresses the problem that existing protocols are designed mainly for traditional media and do not support new media, especially its new attributes. It provides a newly designed encapsulation and immersive media system framework for the new features and attributes of multi-degree-of-freedom media, and extends the existing protocols by defining and describing the important features and attributes of the new media. The framework adapts to the diversified media data types and the diversified relationships between data units under the new degrees of freedom, is more compatible with the new multi-degree-of-freedom media content, has a degree of scalability, and provides a corresponding system framework design.
  • The solution supports the storage and transmission of new media, enables devices and applications to support new media, and makes effective use of multi-degree-of-freedom media data streams.
  • Figure 1 is a schematic diagram of the comparison between traditional immersive media content and the realization of atlas technology
  • Figure 2 is a block diagram of the data flow of point cloud technology
  • Figure 3-1 is the frame diagram of the media system design in the traditional scheme
  • Figure 3-2 is a framework diagram of the multi-degree-of-freedom immersive media system design in this application;
  • Figure 4-1 is an ISOBMFF-based data transmission single-track design diagram of the atlas in the embodiment
  • Figure 4-2 is a schematic diagram of the data flow targeted under the single track of the atlas in Figure 4-1;
  • Figure 5-1 is an ISOBMFF-based data transmission single-track design diagram of the point cloud in the embodiment
  • Figure 5-2 is a schematic diagram of the data flow targeted under the single track of the point cloud in Figure 5-1;
  • Figure 6-1 is an ISOBMFF-based data transmission multi-track design diagram of the atlas in the embodiment
  • Figure 6-2 is a schematic diagram of the data flow targeted under the multi-track of the atlas in Figure 6-1;
  • Figure 7-1 is an ISOBMFF-based data transmission multi-track design diagram of the point cloud in the embodiment
  • Fig. 7-2 is a schematic diagram of the data flow under the multi-track point cloud in Fig. 7-1;
  • Fig. 8 is a flow chart of multi-degree-of-freedom media data analysis
  • Fig. 9 is a flow chart of data analysis corresponding to specific media content
  • Figure 10 is a schematic diagram of the functional module structure of the multi-degree-of-freedom immersive media system.
  • the multimedia data targeted by this application has the following characteristics:
  • The traditional video stream in Figure 3-1 is composed of continuous image frames. Immersive media under the new multiple degrees of freedom is different: the atlas newly introduced in the 3DoF+ immersive media content shown in Figure 1 contains texture and depth information, and the 6DoF point cloud shown in Figure 2 contains texture map information, geometry map information, occupancy map information, and additional information.
  • As Figures 1 and 2 show, the multi-degree-of-freedom immersive media targeted by this application requires an effective combination of multiple types of data to be presented correctly, and the original encapsulation metadata cannot accurately describe the attributes of these different types of data.
  • Figure 3-2 is a framework diagram of the multi-degree-of-freedom immersive media system design in this application. It can be seen that different types of data of the immersive media under the new degrees of freedom can form a variety of combined association relationships.
  • the atlas with 3+ degrees of freedom in Figure 1 is composed of a set of textures and depths to form a basic atlas, and another set of textures and depths to form a supplementary atlas.
  • the content of the basic atlas and the supplementary atlas can be combined to form a free-view video.
  • The 6DoF point cloud in Figure 2 can be restored using geometry map information, occupancy map information and additional information; when texture information is not used, only the geometric structure is restored.
  • Different geometric structures and different texture map information can be combined to obtain point clouds with different textures under a unified geometric structure.
  • Functions such as skinning of the character model can be realized by using the association relationship of related content attributes. Therefore, the original encapsulation protocol metadata needs to be extended to support the description of complex relationships.
  • this application provides a method for sending multimedia data with multiple degrees of freedom, including: encapsulating the multimedia data according to an encapsulation transmission protocol.
  • The encapsulation transmission protocol includes determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams of the track where the multimedia data of the media type is located; determining the association relationships between multiple data contents in different media data; and determining the corresponding index method and index information for the attribute information. The encapsulated multimedia data is then transmitted.
  • As shown in Figure 3-2, the immersive media system design framework needs to add a new description of the multimedia data (also called the multimedia data stream) covering: 1. the media type; 2. the number of media stream contents; 3. the media stream content types and the corresponding content quantities; 4. the relationships between the media contents; and 5. the content indexing method and index information. Specifically, the following descriptions are included:
  • Table 1 is a media type table of multimedia data in this embodiment.
  • In the ISO base media file format (ISOBMFF), new video types are added, such as traditional two-dimensional video, atlas, point cloud, and light field, together with a reserved value for defining future new media.
  • Table 1 (video types): 1: two-dimensional video (traditional video); 2: atlas video; 3: dynamic point cloud; 4: static point cloud; 5: light field; 6: reserved (used to define new media types).
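The video type codes of Table 1 map naturally onto an enumeration. This is a minimal sketch: the member names are illustrative, entry 2 is taken to be the atlas video, and treating any unlisted code as `RESERVED` is an assumption rather than behavior stated in the application.

```python
from enum import IntEnum


class VideoType(IntEnum):
    """Video type codes from Table 1; member names are illustrative."""
    TWO_DIMENSIONAL_VIDEO = 1
    ATLAS_VIDEO = 2
    DYNAMIC_POINT_CLOUD = 3
    STATIC_POINT_CLOUD = 4
    LIGHT_FIELD = 5
    RESERVED = 6


def parse_video_type(code):
    """Map a numeric code to a VideoType.

    Assumption: codes outside Table 1 are reported as RESERVED so that
    an older parser can skip media types defined later.
    """
    try:
        return VideoType(code)
    except ValueError:
        return VideoType.RESERVED


print(parse_video_type(3).name)
print(parse_video_type(42).name)
```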
  • Table 2 is a corresponding table of the number of data types and the number of data groups determined according to different media types in this embodiment.
  • Table 2 defines and describes the new video types, including the attributes and quantity of the data contained in each video type:
  • 2. the atlas contains texture and depth data;
  • 3. the dynamic point cloud contains texture, geometry, occupancy map, and additional information data;
  • 4. the static point cloud contains texture, geometry, and additional information data;
  • 5. in current technical solutions, the light field contains texture and angle data, which may be expanded in the future as light field research progresses.
  • each video type contains several sets of data
  • the number of data sets can also be defined.
  • the atlas video can contain multiple atlases
  • the point cloud can contain multiple sets of point cloud data
  • the light field can contain multiple sets of texture and angle data.
  • the immersive media data stream under the new degree of freedom is not limited to one type of data content form.
  • Under the new degrees of freedom, the immersive media system framework design describes the types of content in the media data stream and the quantity of each.
  • It also indicates whether each type of media is carried in one media stream or distributed across multiple media streams, identifies all the data of each new type of media within a media stream for storage and transmission, and gives the address or location of each data item.
  • Table 3 is a corresponding table of the track type of the media stream of the track where the multimedia data is located and the location of the data in this embodiment.
  • the track type is defined in ISOBMFF to describe whether each video is in one or at least two tracks.
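The relationship between the track type and the per-data location information can be sketched as follows. This is an illustrative model only: `DataLocation`, `check_track_layout`, the `"single"`/`"multi"` labels, and the unit of `position` are all assumptions, not fields defined by ISOBMFF or by the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLocation:
    track_id: int  # media track number that carries the data
    position: int  # position of the data inside that track (unit is an assumption)


def check_track_layout(track_type, locations):
    """Check that the declared track type matches where the data actually is.

    locations maps a data name (for example "texture 0") to its DataLocation.
    Returns the set of track IDs in use.
    """
    if track_type not in ("single", "multi"):
        raise ValueError(f"unknown track type: {track_type}")
    track_ids = {location.track_id for location in locations.values()}
    if track_type == "single" and len(track_ids) != 1:
        raise ValueError("single-track media must keep all data in one track")
    if track_type == "multi" and len(track_ids) < 2:
        raise ValueError("multi-track media must use at least two tracks")
    return track_ids


single = check_track_layout(
    "single",
    {"depth 0": DataLocation(1, 0), "texture 0": DataLocation(1, 4096)},
)
multi = check_track_layout(
    "multi",
    {"depth 0": DataLocation(1, 0), "texture 0": DataLocation(2, 0)},
)
print(single, multi)
```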
  • Table 4 is a table of association relations between multiple media contents in multimedia data, which determines the association relations between multiple data contents in different media data: mutual dependence, single dependence, and mutual replacement.
  • the interdependent relationship includes: the texture and depth data in the atlas are interdependent; the geometry, occupancy map, and additional information in the point cloud are interdependent to construct the point cloud geometric skeleton.
  • The single-dependence relationship includes: the texture data in the point cloud relies on the geometric skeleton jointly constructed from geometry, occupancy map, and additional information; the additional atlas depends on the basic atlas. The mutual replacement relationship includes: for the same point cloud geometric skeleton, different texture data can be used in place of one another.
  • Table 4 is only a preferred example and does not limit the application.
  • the new type of media data has complex types, quantities, and association relationships.
  • the index information of the media data can be defined.
  • Table 5 is a corresponding table of different media types of multimedia data and corresponding indexing methods and index information determined respectively.
  • ISOBMFF defines, for each video type, the indexing method between its data contents and the media's index information. That is, the data composition and index information of the media are given to help the device quickly parse the media type, composition, quantity and access information, and so acquire and process the content effectively.
  • Take atlas video as an example.
  • When the media type is atlas video distributed over a multi-track structure, the description is extended by using the Track Reference Box in the protocol.
  • Index information, namely the track type and track ID, is added to help the device quickly parse the media type, composition, quantity and access information, and so acquire and process the content effectively.
  • The index information can be a collection of the newly defined attributes above. The attribute information can be described at different levels of the protocol file, or a single index can be defined that contains all relevant information about the media, so that the device can read and parse it quickly.
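The second option, a single index that gathers all attribute information of the media, can be sketched as one serializable record. Everything here is hypothetical: the `MediaIndex` class, its field names, and the use of JSON are illustration choices, not the box syntax used by the application.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MediaIndex:
    """One record holding every attribute of a media item (hypothetical layout)."""
    media_type: str
    track_type: str
    group_counts: dict = field(default_factory=dict)  # data type -> number of groups
    locations: dict = field(default_factory=dict)     # data name -> track ID
    associations: list = field(default_factory=list)  # [source, relation, target]

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


index = MediaIndex(
    media_type="atlas_video",
    track_type="multi",
    group_counts={"texture": 2, "depth": 2},
    locations={"texture 0": 1, "depth 0": 2, "texture 1": 3, "depth 1": 4},
    associations=[["atlas 1", "single_dependence", "atlas 0"]],
)
restored = MediaIndex.from_json(index.to_json())
print(restored.locations["depth 0"])
```

A single record like this trades some redundancy for a one-pass read; spreading the same attributes over several protocol levels keeps each level smaller but needs more parsing steps.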
  • The immersive media system framework given in this application adds a description of the multimedia data stream to the protocol and processes it accordingly. Embodiments 1 to 4, in conjunction with Figures 4-1 to 7-2, describe the sending method and receiving method of multimedia data under multiple degrees of freedom, the multimedia data system under multiple degrees of freedom, and the media processor and player, so that the consumer side can finally acquire the media content and obtain the immersive media experience under the new multiple degrees of freedom.
  • Figure 4-1 is an ISOBMFF-based data transmission single-track design diagram of the atlas in the embodiment.
  • Fig. 4-2 is a schematic diagram of the data flow targeted under the single track of the atlas in Fig. 4-1.
  • The outer layer information can be represented by any field. Here the field ftyp is used: ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility.
  • The description indication information can be represented by any field. Here the field moov is used: moov is the data box of the media content description information in the file, and it contains the various pieces of information that describe the transmitted media content.
  • The data content information can be represented by any field. Here the field mdat is used: mdat holds the specific media data content.
  • The content of moov describes and indicates the specific media data content in mdat. This application adds, within the moov structure, description information about the media data content contained in mdat.
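The ftyp, moov and mdat boxes share the general ISOBMFF box layout: a 32-bit big-endian size, a four-character type, then the payload (with a 64-bit size following the type when the 32-bit size is 1, and a size of 0 meaning the box runs to the end of the file). The sketch below walks the top-level boxes of a byte string under that layout. It is a minimal illustration: `iter_top_level_boxes` and `make_box` are hypothetical helpers, the sample data is a toy, and none of the new description fields proposed by the application are parsed.

```python
import struct


def iter_top_level_boxes(data):
    """Yield (box_type, payload) for each top-level box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError(f"malformed box at offset {offset}")
        yield box_type.decode("ascii"), data[offset + header:offset + size]
        offset += size


def make_box(box_type, payload):
    """Build a box with a 32-bit size field (toy helper for the example)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload


# Toy file: a brand in ftyp, an empty moov, and two bytes of media data.
sample = (
    make_box("ftyp", b"isom" + (0).to_bytes(4, "big") + b"isom")
    + make_box("moov", b"")
    + make_box("mdat", b"\x00\x01")
)
box_types = [box_type for box_type, _ in iter_top_level_boxes(sample)]
print(box_types)  # ['ftyp', 'moov', 'mdat']
```

A receiver would read moov first to learn what mdat contains, which is why the added description information is placed in moov.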
  • the media data content format is shown in Figure 4-2, indicating that the number of atlases contained in the current data stream is "n".
  • In the moov data box shown in Figure 4-1, new information is added about the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information.
  • A description of the atlas media type, "miv", is added, indicating that the current media data stream is an atlas data stream (miv).
  • The track type is indicated as single track, and the data types present in the current media data stream are indicated as texture and depth. A description of the data quantity is added, indicating that the current data stream contains "n" atlases.
  • Each atlas contains a depth layer and a texture layer. The position of the corresponding data in each atlas is indicated: the position in the track of the depth layer "depth 0" of the first atlas, the position in the track of the texture layer "texture 0" of the first atlas, and so on, until the texture and depth position information of every atlas has been indicated.
  • Information about the relationships between the data in the media data stream is also added. For example, atlas 0, which contains the basic view blocks, is required data; the atlases containing the other supplementary view blocks are supplementary content that depends on atlas 0 and is used together with atlas 0 to restore the miv image of the corresponding viewpoint.
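The single-track atlas description above can be sketched as a generated record. This is a hedged illustration: the dictionary keys and the `describe_single_track_atlases` function are hypothetical, and only the "miv" label, the layer names "depth i" and "texture i", and the dependence of supplementary atlases on atlas 0 come from the text.

```python
def describe_single_track_atlases(n):
    """Build a description of n atlases carried in a single track.

    Atlas 0 is the basic atlas; every other atlas is supplementary and
    depends on atlas 0. Key names are illustrative, not protocol fields.
    """
    atlases = []
    for i in range(n):
        atlases.append({
            "atlas": i,
            "layers": [f"depth {i}", f"texture {i}"],
            "depends_on": [] if i == 0 else [0],
        })
    return {
        "media_type": "miv",
        "track_type": "single",
        "data_types": ["texture", "depth"],
        "atlas_count": n,
        "atlases": atlases,
    }


description = describe_single_track_atlases(3)
print(description["atlases"][2])
```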
  • Fig. 5-1 is a design diagram of a single track of ISOBMFF-based data transmission of the point cloud in the embodiment.
  • Fig. 5-2 is a schematic diagram of the data flow targeted under the single track of the point cloud in Fig. 5-1.
  • the outer information can be represented by any field.
  • the field ftyp is used to represent the outer information, where ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility.
  • the description indication information can be represented by any field.
  • the field moov is used to represent the description indication information.
  • moov is the data box of the media content description information in the file, which contains various related information describing the transmission media content.
  • the data content information can be represented by any field.
  • the field mdat is used to represent the data content information.
  • the mdat is the specific media data content information.
  • the content contained in the moov describes and indicates the specific media data content in the mdat. This application adds description information about the media data content contained in mdat to the moov structure.
  • the media data content format is shown in Figure 5-2.
  • point cloud data group 0 to point cloud data group n each contain two sets of textures (texture 01, texture 02), geometry, an occupancy map, and additional information.
  • the description "point cloud" of the point cloud media type is added to the moov structure, indicating that the current media data stream is a point cloud data stream (vpcc).
  • the track type is single track, and the data types present in the current media data stream are indicated as texture, geometry, occupancy map, and additional information.
  • a description of the data quantity is added, indicating that the current data stream contains "t" textures and that the numbers of geometry, occupancy map, and additional information items are all "n".
  • the restoration of texture 0 depends on the restoration of geometric structure 0; that is, texture information 0 depends on geometry 0, occupancy map 0, and additional information 0.
  • the same structure 0 can correspond to a single texture, which is a modification of the second embodiment above; in that case the numbers of texture, geometry, occupancy map, and additional information items are all n.
  • the same structure 0 can also correspond to different textures, as in the second embodiment; in that case the number of textures is t, and the numbers of geometry, occupancy map, and additional information items are all n.
  • Structure 0 can correspond to texture 00, texture 01, and texture 02.
  • a typical application scenario is point cloud character model skinning. It can be seen that different textures corresponding to the same geometric structure are mutually complementary.
  • each set of atlas contains one or more sets of texture data. Therefore, the number of texture data items t is at least the number n of data items of each of the other data types (geometry, occupancy map, and additional information).
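The relationship between the texture count t and the per-type count n can be checked with a small sketch. The function names and the dictionary layout are hypothetical; the labels follow the "texture 00, texture 01, texture 02" naming used above.

```python
def build_point_cloud_groups(textures_per_group):
    """Build one group per entry; each group has one geometry, one occupancy
    map, one additional-information item, and one or more texture sets."""
    groups = []
    for g, k in enumerate(textures_per_group):
        if k < 1:
            raise ValueError("each group needs at least one texture set")
        groups.append({
            "geometry": f"geometry {g}",
            "occupancy_map": f"occupancy map {g}",
            "additional_info": f"additional information {g}",
            "textures": [f"texture {g}{j}" for j in range(k)],
        })
    return groups

def counts(groups):
    """Return (t, n): total texture sets and number of groups."""
    n = len(groups)
    t = sum(len(g["textures"]) for g in groups)
    return t, n

groups = build_point_cloud_groups([3, 1, 2])
t, n = counts(groups)
```

Because every group carries at least one texture set, t can never fall below n in this model; the two are equal only when each structure has exactly one texture.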
  • Fig. 6-1 is an ISOBMFF-based data transmission multi-track design diagram of the atlas in the embodiment.
  • Fig. 6-2 is a schematic diagram of the data flow targeted under the multi-track of the atlas in Fig. 6-1.
  • the outer information can be represented by any field.
  • the field ftyp is used to represent the outer information, where ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility.
  • the description indication information can be represented by any field.
  • the field moov is used to represent the description indication information.
  • moov is the data box of the media content description information in the file, which contains various related descriptions of the transmission media content.
  • data content information can be represented by any field. Here the field mdat is used to represent data content information; mdat is the specific media data content information, and the content contained in moov describes and indicates the specific media data content in mdat. This application adds description information about the media data content contained in mdat to the moov structure.
  • Atlas data 0 to atlas data n are distributed on track 1 (Track-1) and track 2 (Track-2), and each atlas includes a geometry component (in this embodiment, depth) and a texture.
  • new information is added about the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information.
  • the description "miv" of the atlas media type is added to the moov structure, indicating that the current media data stream is an atlas data stream (miv).
  • the track type is indicated as multi-track, the data types present in the current media data stream are indicated as texture and depth, and a description of the data quantity information is added, indicating that the current data stream contains "n" atlases, each of which contains a depth layer and a texture layer.
  • Fig. 7-1 is an ISOBMFF-based data transmission multi-track design diagram of the point cloud in the embodiment.
  • Fig. 7-2 is a schematic diagram of the data flow under the multi-track point cloud in Fig. 7-1.
  • the outer information can be represented by any field.
  • the field ftyp is used to represent the outer information, where ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility.
  • the description indication information can be represented by any field.
  • the field moov is used to represent the description indication information.
  • moov is the data box of the media content description information in the file, which contains various related descriptions of the transmission media content.
  • data content information can be represented by any field. Here the field mdat is used to represent data content information; mdat is the specific media data content information, and the content contained in moov describes and indicates the specific media data content in mdat. This application adds description information about the media data content contained in mdat to the moov structure.
  • Point cloud data 0 to point cloud data n are distributed on track 1 to track 5 (Track-1 to Track-5).
  • the point cloud data contains t textures, and the numbers of geometry, occupancy map, and additional information items are all n. The first group of textures is distributed in Track-1, the second group of textures is distributed in Track-2, and the geometry, occupancy map, and additional information are distributed in Track-3 to Track-5, respectively.
  • new information is added about the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information. Specifically:
  • the description "point cloud" of the point cloud media type is added to the moov structure, indicating that the current media data stream is a point cloud data stream (vpcc).
  • a description of the data quantity is added, indicating that the current data stream contains "t" textures and that the numbers of geometry, occupancy map, and additional information items are all "n".
  • texture 0 is indicated as being in track 1, whose type is texture, together with its corresponding position.
  • texture 1 is indicated as being in track 1, whose track type is texture, together with its corresponding position.
  • geometry 0 is indicated as being in track 3, whose type is geometry, together with its corresponding position, and so on, until the indication of the four different types of data information is completed.
  • the restoration of texture 0 depends on the restoration of geometric structure 0; that is, texture information 0 depends on geometry 0, occupancy map 0, and additional information 0.
  • the same structure 0 can correspond to a single texture, which is a modification of the above-mentioned embodiment 4; in that case the numbers of texture, geometry, occupancy map, and additional information items are all n.
  • the same structure 0 can also correspond to different textures, as in the fourth embodiment; in that case the number of textures is t, and the numbers of geometry, occupancy map, and additional information items are all n.
  • Structure 0 can correspond to texture 00, texture 01, and texture 02.
  • a typical application scenario is point cloud character model skinning. It can be seen that different textures corresponding to the same geometric structure are mutually complementary.
  • each set of point clouds contains one or more sets of texture data. Therefore, the number of texture data items t is at least the number n of data items of each of the other data types (geometry, occupancy map, and additional information).
  • Fig. 8 is a flow chart of multi-degree-of-freedom media data analysis, which is used to illustrate the method of receiving multimedia data under multi-degree-of-freedom.
  • this application provides a multi-degree-of-freedom immersive media system, which includes a sender side and a server side.
  • the server includes a receiving module, a parsing module, and a data processing module.
  • after the sender finishes sending the encapsulated media file, the server receives the media file through the receiving module.
  • first, the encapsulated media file protocol is parsed, and then the media data content is processed according to the parsed content.
  • As shown in Figure 8:
  • S1: After the sender completes the modification of the corresponding content in the data encapsulation transmission protocol, the server receives the corresponding media file data through the receiving module and parses the related protocol to obtain the description information of the media content data.
  • S2: The data processing module processes the media content data according to the description information parsed in S1. First, the media content is judged on the basis of the parsed media type description information.
  • Fig. 9 is a flow chart of data analysis for specific media content. For the different media content types, namely dynamic point cloud (a in Fig. 9), static point cloud (b in Fig. 9), atlas video (c in Fig. 9), and light field (d in Fig. 9), the analysis includes the following steps:
  • the media type is judged according to the media type description information; the media type has been defined in the encapsulated content. If it is a traditional video media type, it is processed according to the existing immersive media processing flow. If it is one of the new multi-degree-of-freedom immersive media types (dynamic point cloud, static point cloud, atlas video, or light field), it is processed using the media content processing flow corresponding to the parsed media type.
  • the processing flow and processor corresponding to the media type are started, and at the same time the number of media content data groups, the media content types corresponding to the media content, and the track type used during transmission are further obtained.
  • the corresponding media content types include texture, geometry, occupancy map, and additional information.
  • the corresponding media content types include texture, geometry, and additional information.
  • the corresponding media content types include texture and depth.
  • the corresponding media content types currently include texture and angle.
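The media type judgment and the lookup of the corresponding content types can be sketched as a simple table-driven dispatch. The table pairs the four content-type lists above with the four media types in the order they are named in the flow (dynamic point cloud, static point cloud, atlas video, light field); that pairing, the string labels, and the function name `select_flow` are assumptions made for illustration.

```python
# Content types per multi-degree-of-freedom media type (labels are illustrative).
CONTENT_TYPES = {
    "dynamic point cloud": ("texture", "geometry", "occupancy map", "additional information"),
    "static point cloud": ("texture", "geometry", "additional information"),
    "atlas video": ("texture", "depth"),
    "light field": ("texture", "angle"),
}

def select_flow(media_type: str):
    """Return (flow_name, content_types) for a parsed media type description."""
    if media_type in CONTENT_TYPES:
        return "multi-dof", CONTENT_TYPES[media_type]
    # Anything else falls back to the existing (traditional video) flow.
    return "legacy", ()

flow, content_types = select_flow("atlas video")
```

A table keeps the branch for each media type in one place, so supporting a further media type would only require a new entry.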
  • the third step T3: after the data types under the corresponding media type have been obtained, the number of items of each media data type is parsed in combination with the number of media data groups.
  • the number of media data groups assists in obtaining the number of items of each media data type and avoids content loss, and the number of items of each media data type guides the data analysis terminal in completely parsing the different types of data, avoiding content loss that would degrade the restored media video.
  • the fourth step T4: after the number of data groups and the number of items of each data type have been obtained, the index information and the association relationships of the corresponding data types are parsed and, together with the earlier track type judgment, used to combine the data.
  • the data combination method is:
  • T4.1: As shown in branch a of Figure 9, for dynamic point clouds, according to the association relationship between data types, the geometry, occupancy map, and additional information of the same group of dynamic point cloud data depend on one another to restore the geometric shape of the dynamic point cloud.
  • the restoration of texture depends on the restoration of the geometric shape, and the same group of dynamic point cloud data can have multiple sets of corresponding texture information but only one set of geometry, occupancy map, and additional information.
  • when the track type is single track, first find the geometry, occupancy map, and additional information of the same group in the track according to the index information and complete the restoration of the point cloud geometry; then index the different texture data under the same group as needed, find all the required texture data, and complete the restoration of the texture information on the basis of the point cloud geometry, occupancy map, and additional information.
  • when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the geometry, occupancy map, additional information, and texture are located, and use the data type index to find the data of the corresponding type in the corresponding track. Then find the geometry, occupancy map, and additional information that belong to the same group in the tracks of the corresponding types and complete the restoration of the point cloud geometry. After that, index the different texture data belonging to the same group in the corresponding texture tracks as needed, find the required texture data, and complete the restoration of the texture information on the basis of the point cloud geometry, occupancy map, and additional information.
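The single-track and multi-track lookups for one dynamic point cloud group can be sketched as follows. The sample and track dictionaries, the `index` field used to tell texture sets apart, and all function names are hypothetical; the sketch only gathers the inputs in dependency order (shape inputs first, textures second) and does not perform any actual restoration.

```python
GEOMETRY_INPUTS = ("geometry", "occupancy map", "additional information")

def sample(kind, group, index=0):
    """Hypothetical sample record: data type, group number, texture-set index."""
    return {"type": kind, "group": group, "index": index}

def samples_of(tracks, track_type, kind):
    """Collect samples of one data type from a single track or from typed tracks."""
    if track_type == "single":
        candidates = tracks[0]["samples"]
    else:
        candidates = [s for t in tracks if t["type"] == kind for s in t["samples"]]
    return [s for s in candidates if s["type"] == kind]

def restore_group(tracks, track_type, group, wanted_textures):
    """Gather shape inputs first, then only the requested texture sets."""
    def pick(kind):
        return [s for s in samples_of(tracks, track_type, kind) if s["group"] == group]
    shape_inputs = {kind: pick(kind)[0] for kind in GEOMETRY_INPUTS}
    textures = [s for s in pick("texture") if s["index"] in wanted_textures]
    return {"group": group, "shape": shape_inputs, "textures": textures}

single = [{"type": "mixed", "samples": [
    sample("geometry", 0), sample("occupancy map", 0), sample("additional information", 0),
    sample("texture", 0, 0), sample("texture", 0, 1),
]}]
multi = [
    {"type": "texture", "samples": [sample("texture", 0, 0)]},
    {"type": "texture", "samples": [sample("texture", 0, 1)]},
    {"type": "geometry", "samples": [sample("geometry", 0)]},
    {"type": "occupancy map", "samples": [sample("occupancy map", 0)]},
    {"type": "additional information", "samples": [sample("additional information", 0)]},
]
from_single = restore_group(single, "single", 0, {1})
from_multi = restore_group(multi, "multi", 0, {1})
```

Both layouts yield the same group contents here; only the lookup path differs, which is the distinction the track type information is meant to convey.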
  • T4.2: As shown in branch b of Figure 9, for static point clouds, according to the association relationship between data types, the geometry and additional information of the same group of static point cloud data depend on each other to restore the geometric shape of the static point cloud, and the restoration of texture depends on the restoration of the geometric shape.
  • when the track type is single track, first find the geometry and additional information of the same group in the track according to the index information and complete the restoration of the point cloud geometry; then index the texture data under the same group as needed, find the required texture data, and complete the restoration of the texture information on the basis of the point cloud geometry and additional information.
  • when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the geometry, additional information, and texture are located, and use the data type index to find the data of the corresponding type in the corresponding track. Then find the geometry and additional information that belong to the same group in the tracks of the corresponding types and complete the restoration of the point cloud geometry. After that, index the texture data belonging to the same group in the corresponding texture track as needed, find the required texture data, and complete the restoration of the texture information on the basis of the point cloud geometry and additional information.
  • T4.3: As shown in branch c of Figure 9, for atlas videos, according to the association relationship between data types, the depth and texture of the same set of atlas data depend on each other and together restore the atlas video content.
  • when the track type is single track, find the texture and depth of the same group in the track according to the index information and combine them to complete the restoration of the image.
  • when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the texture and depth are located, and use the data type index to find the data of the corresponding type in the corresponding track. After that, the texture and depth of the same set of atlas data together restore the content of that atlas.
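For the multi-track atlas case, pairing the texture and depth of the same atlas across two tracks might look like the sketch below. The record layout (`atlas`, `payload`) and the function name `pair_atlases` are hypothetical, and the payloads are placeholder strings rather than decoded image data.

```python
def pair_atlases(texture_track, depth_track):
    """Match each texture sample with the depth sample of the same atlas."""
    depth_by_id = {s["atlas"]: s for s in depth_track}
    pairs = []
    for tex in texture_track:
        depth = depth_by_id.get(tex["atlas"])
        if depth is None:
            # Texture and depth depend on each other, so a missing layer is an error.
            raise KeyError(f"atlas {tex['atlas']} has no depth layer")
        pairs.append((tex["atlas"], tex["payload"], depth["payload"]))
    return sorted(pairs)

texture_track = [{"atlas": 1, "payload": "texture 1"}, {"atlas": 0, "payload": "texture 0"}]
depth_track = [{"atlas": 0, "payload": "depth 0"}, {"atlas": 1, "payload": "depth 1"}]
pairs = pair_atlases(texture_track, depth_track)
```

Sorting by atlas number puts atlas 0 first, which suits a receiver that restores the basic atlas before any supplementary one.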
  • T4.4: As shown in branch d of Figure 9, for the light field, according to the association relationship between data types, the angle, texture, and extended information of the same group of light field data depend on one another and together restore the content of the light field.
  • when the track type is single track, find the texture, angle, and extended information of the same group in the track according to the index information and combine them to complete the restoration of the image.
  • when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the texture, angle, and extended information are located, and use the data type index to find the data of the corresponding type in the corresponding track. After that, the texture, angle, and extended information of the same group of light field data together restore the content of that group.
  • the fifth step T5 according to the number of media data of the corresponding type and the number of media data types, complete the analysis and combination of all media data in sequence, and finally present the immersive media video content under the new multiple degrees of freedom.
  • the concept of this application, the described embodiments, and the scope of this application enable the immersive media system to provide system architecture support for the upcoming implementation of experiences and technical applications related to immersive media 3DoF+ and 6DoF.
  • this embodiment uses encapsulation protocols such as ISOBMFF, together with atlas and point cloud technologies, as examples to illustrate the proposed immersive media 3DoF+ and 6DoF metadata and its structure, parameter content, data, and encapsulation and transmission methods.
  • the immersive media data form and content under the new multiple degrees of freedom in this embodiment can also be encapsulated and transmitted in other formats, parameter expressions, and files, for example using MMT or SMT transmission, ISOBMFF encapsulation, or an extension based on OMAF (omnidirectional media application format, the application format for panoramic media), without affecting the expression of the core technology of this application.
  • this application provides a multi-degree-of-freedom immersive media system, which includes a sender side and a server side.
  • the server includes a receiving module, a parsing module, and a data processing module.
  • after the sender finishes sending the encapsulated media file, the server receives the media file through the receiving module. First, the encapsulated media file protocol is parsed, and then the media data content is processed according to the parsed content.
  • a processor and a memory coupled to the processor are provided.
  • when executing the computer-readable program in the memory, the processor may be configured to execute the method and system for receiving multimedia data in multiple degrees of freedom described in conjunction with FIGS. 1-9.
  • DSP digital signal processors
  • ASIC application-specific integrated circuits
  • FPGA field programmable gate arrays
  • a general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
  • the processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors cooperating with a DSP core, or any other such configuration.
  • the steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be directly embodied in hardware, in a software module executed by a processor, or in a combination of the two.
  • the software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from and write information to the storage medium.
  • the storage medium may be integrated into the processor.
  • the processor and the storage medium may reside in the ASIC.
  • the ASIC may reside in the user terminal.
  • the processor and the storage medium may reside as discrete components in the user terminal.
  • the described functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, each function can be stored as one or more instructions or codes on a computer-readable medium or transmitted through it.
  • Computer-readable media includes both computer storage media and communication media, including any medium that facilitates the transfer of a computer program from one place to another.
  • the storage medium may be any available medium that can be accessed by a computer.
  • such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • any connection is also properly called a computer-readable medium.
  • for example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium.
  • Disks and discs as used herein include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), floppy disks, and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included in the scope of computer-readable media.

Abstract

Methods for transmitting and receiving multimedia data with multiple degrees of freedom, a multimedia data system with multiple degrees of freedom, a media processor, and a player. The data types of different media types and the track media stream distribution information are determined by adding attribute descriptions of immersive multimedia. Association relationships between multiple pieces of data content in different media data are defined, and indices are provided. The system structure enables the packaging and transmission of new media content and forms with multiple degrees of freedom, thereby providing a compatible and expandable framework for implementing subsequent related techniques and designs, and better adapting to visual media consumption and applications with new degrees of freedom.

Description

Multimedia data transmission and reception methods, system, processor, and player
This application claims priority to Chinese patent application No. 202010301699.0, filed with the China Patent Office on April 16, 2020, the entire content of which is incorporated herein by reference.
Technical field
This application belongs to the field of immersive multimedia, and specifically relates to a method for sending and a method for receiving multimedia data with multiple degrees of freedom, a multimedia data system with multiple degrees of freedom, and a media processor and player.
Background
In recent years, with the development of virtual reality (VR) technology, media services have evolved from traditional flat two-dimensional television to panoramic immersive content experienced through a head-mounted display (HMD). Immersive media produced by a VR system represents a virtual space in which users can interact as naturally as in the real world. Virtual reality renders the visual and auditory sensory stimuli of the real world and presents them to the user. The user looks around starting from a display area in a three-dimensional space and receives the audio associated with the current viewport.
Meanwhile, the performance of visual media hardware has improved, particularly that of media acquisition, media processing, and computing devices. Traditional immersive media, such as three degrees of freedom (3DoF), have been applied and developed comprehensively and have matured. As user demand for immersive media keeps growing, 3DoF technology, which only supports a viewing mode in which the user rotates the head at a fixed point, can no longer fully meet users' needs, so 3DoF+ technology has entered a stage of rapid development. Research and design in the visual media field are likewise gradually addressing media content with more degrees of freedom. Building on 3DoF, the 3DoF+ and 6DoF forms of media experience have emerged. For these new forms of media experience, the visual media field has designed a variety of media content that can realize 3DoF+ and 6DoF, and has proposed and refined the corresponding implementation technologies.
Traditional immersive media system design mainly targets omnidirectional video transmission under 3DoF, where the degrees of freedom are those available to the content consumer during the media experience. For example, when a consumer experiences 3DoF media content, the only operations available are three free rotations of the head, namely rotations about the three coordinate axes of a three-dimensional Cartesian coordinate system whose origin is the consumer's head. The media used to realize this immersive experience rely on a set of technologies related to omnidirectional video, and the system is designed for the data those technologies transmit, namely 2D image frames in traditional video form. As a result, the media content that the system structure addresses is relatively limited.
The 3DoF+ form of media experience adds limited head displacement to the three rotational degrees of freedom of the head; that is, the consumer of immersive media content can obtain different media content by moving within a limited range. In other words, the parallax produced by the displacement can be sensed by the device, and the system can feed back in real time the different media content resulting from the parallax to match the consumer's actions. This requires adding, to the original media content, media information that supports parallax interaction, so that the consumer's visual system perceives a more realistic scene. 3DoF+ video is produced from content captured by multiple cameras deployed according to the predicted user displacement. The depth image scene presented by 3DoF+ media is synthesized from 2D images, each composed of a texture component and a corresponding depth component. Depth information can be captured directly by camera equipment or obtained indirectly by an algorithm; alternatively, a 3DoF+ view can be synthesized from a planar image of a background region and multiple (non-planar) foreground images.
As these requirements show, processing only 2D image frames as in traditional video cannot deliver the sense of parallax produced by limited displacement. New media information content and processing forms therefore need to be designed to match the new data forms.
Regarding current media content processing and data forms, 3DoF+ is mainly implemented with atlas-related technologies, which have already been implemented within MPEG of the International Organization for Standardization. As shown by the atlas data content for 3+ degrees of freedom (3DoF+ video) in Figure 1, this kind of scheme uses texture components and corresponding depth components to form an atlas for encapsulation and transmission. An atlas gathers rectangular blocks from one or more 2D images into an image pair, which contains a texture component image and a corresponding depth component image. At the encoding end, the images of different viewpoints captured by cameras at different angles are pruned to obtain a basic atlas containing the basic image blocks and additional atlases containing the supplementary image blocks. At the decoding end, according to the correspondence between the user's current viewpoint and the source cameras, the basic atlas and the supplementary atlas for the corresponding viewpoint are selected and combined to obtain different field-of-view images for different viewpoints. For example, in Figure 1, the basic atlas and additional atlas 1 are used to generate field-of-view image 1, and the basic atlas and additional atlas 2 are used to generate field-of-view image 2. Using atlases reduces, to some extent, the amount of data that needs to be transmitted while still realizing the corresponding media functions, and gives good reconstruction quality at the user end.
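The decoder-side selection of the basic atlas plus one supplementary atlas can be sketched as follows. The text only says the choice follows the correspondence between the user's viewpoint and the source cameras; picking the nearest camera by Euclidean distance is an assumption made here for illustration, as are the function names, camera positions, and atlas labels.

```python
import math

def nearest_view(user_pos, camera_positions):
    """Assumed rule: map the user's viewpoint to the closest source camera."""
    return min(camera_positions, key=lambda v: math.dist(user_pos, camera_positions[v]))

def atlases_for_viewpoint(user_pos, camera_positions, supplementary, basic="basic atlas"):
    """Return the basic atlas plus the supplementary atlas of the matched view."""
    view = nearest_view(user_pos, camera_positions)
    return [basic, supplementary[view]]

# Hypothetical camera layout and atlas labels.
cameras = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
supplementary = {1: "additional atlas 1", 2: "additional atlas 2"}
selected = atlases_for_viewpoint((0.9, 0.0, 0.0), cameras, supplementary)
```

The basic atlas is always part of the selection, so only the supplementary atlas changes as the viewpoint moves.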
In addition, 6Dof offers a richer immersive media experience built on 3Dof and 3Dof+. On top of the three rotational degrees of freedom of the head, it adds displacement along the three coordinate axes of a three-dimensional space whose origin is the viewer. Processing traditional video content alone can no longer deliver the parallax and view changes that should follow the head rotation and body displacement of the media consumer. The media content and technologies for 6Dof experiences are still being explored and mainly include point clouds and light fields. Figure 2 gives an example of point cloud content and shows how 6-degree-of-freedom (6Dof video) immersive media data is presented: surface information of an object obtained by scanning, including three-dimensional coordinate data, depth information, and color information, forms a geometric skeleton that is then rendered as a point cloud. Different compression algorithms exist for different kinds of point cloud data, such as static versus dynamic data and data intended for machine perception versus human visual perception.
For example, for dynamic point cloud data intended for human viewing, a typical compression algorithm converts the 3D point cloud data into 2D image data before further processing; one such algorithm is Video-based Point Cloud Compression (VPCC). This method first projects the 3D point cloud onto 2D planes to obtain occupancy map information, geometry information, attribute information, and auxiliary information, where the attribute information usually includes texture and color information. The compressed information is therefore usually transmitted as four kinds of data: geometry information, attribute information, occupancy map information, and auxiliary information. Decoding the geometry information depends on the occupancy map information and the auxiliary information, and decoding the attribute information depends on the geometry information, the occupancy map information, and the auxiliary information. Point cloud media must process these different types of data synchronously and integrate them before presenting media with rich spatial and texture characteristics to the user. As the related technologies mature, the system's support for 6Dof must be completed and updated accordingly.
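The decoding dependencies stated above can be written as a small dependency graph. The following is a minimal sketch for illustration only, not part of any VPCC specification; the component names and the `decode_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Decoding dependencies as stated in the text (component names are illustrative):
# geometry needs the occupancy map and the auxiliary information;
# attributes need geometry, the occupancy map and the auxiliary information.
DEPENDENCIES = {
    "occupancy": set(),
    "auxiliary": set(),
    "geometry": {"occupancy", "auxiliary"},
    "attribute": {"geometry", "occupancy", "auxiliary"},
}


def decode_order(deps=DEPENDENCIES):
    """Return one valid order in which the four components can be decoded."""
    return list(TopologicalSorter(deps).static_order())


order = decode_order()
print(order)
```

Any order produced this way places the occupancy map and auxiliary information before geometry, and geometry before the attributes.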
In summary, an immersive media experience with more degrees of freedom entails more diverse information and data types. Whether the media takes the form of an atlas, a point cloud, a light field, or something else, its information content is heterogeneous. An immersive media system framework originally designed for a single content structure cannot effectively support the storage and transmission of the new multi-degree-of-freedom media content, so the new information and structures that appear in such media require a new design.
How to overcome the limitations of the existing system architecture, and how to design system structures such as encapsulation and transmission for the new media content and forms under multiple degrees of freedom, so that they provide a compatible and extensible architecture for subsequent technologies and designs and better suit visual media consumption and applications under the new degrees of freedom, is a key problem that urgently needs to be solved.
Summary of the Invention
With regard to the technologies and implementations related to multi-degree-of-freedom immersive media content, this application proposes a method for sending multimedia data under multiple degrees of freedom, a method for receiving such data, a multimedia data system under multiple degrees of freedom, a media processor, and a player.
This application provides a method for sending multimedia data under multiple degrees of freedom, comprising: encapsulating the multimedia data according to an encapsulation and transmission protocol, and transmitting the encapsulated multimedia data. The encapsulation and transmission protocol comprises: determining attribute information of the multimedia data, which includes determining the data types for the different media types of the multimedia data, determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located, and determining the association relationships among multiple data contents in different media data; and determining a corresponding indexing method and index information for each item of the attribute information.
Further, the data form of the multimedia data includes a 3Dof+ form and/or a 6Dof form; the encapsulation and transmission is applicable to extensions of MPEG Media Transport (MMT), Smart Media Transport (SMT), the ISO Base Media File Format (ISOBMFF), or the Omnidirectional Media Format (OMAF).
Further, the media types of the multimedia data include any one or more of the following: traditional two-dimensional video, atlas video, dynamic point cloud, static point cloud, and light field.
Further, the data types of the multimedia data are determined as follows: when the media type is atlas video, the data types include texture data and depth data; when the media type is dynamic point cloud, the data types include texture, geometry, occupancy map, and additional information data; when the media type is static point cloud, the data types include texture, geometry, and additional information data; when the media type is light field, the data types include texture data and angle data.
Further, determining the data types of the multimedia data also includes: determining, for each data type, the number of data groups of that data type.
Further, the correspondence between the numbers of data groups of different data types includes: one structure corresponding to one texture; or one structure corresponding to several different textures that are substitutes for one another.
Further, determining and identifying the number and location information of the track media streams in which the multimedia data of a media type is located includes: defining a track type that indicates whether the multimedia data of each media type is carried in one track or in at least two tracks. In the single-track case, the media track number of the multimedia data is defined, together with the specific position of each data item within the track. In the case of at least two tracks, the media track number of each data item contained in the multimedia data is defined, together with the specific position of each data item within its track.
Further, in determining the association relationships among multiple data contents in different media data, the association relationships include: mutual dependency between data contents, and/or single dependency between data contents, and/or mutual replacement between data contents.
Further, the mutual-dependency relationships include: the texture data and depth data in an atlas depend on each other; and the geometry, occupancy map, and additional information in a point cloud depend on one another and jointly build the geometric skeleton of the point cloud. The single-dependency relationships include: the texture data in a point cloud depends on the geometric skeleton built jointly from the geometry, occupancy map, and additional information; and an additional atlas depends on the basic atlas. The mutual-replacement relationships include: different texture data provided for the same point cloud geometric skeleton, which can replace one another.
Further, the index information comprises the set of the attribute information described above; the attribute information is either described separately at different levels of the encapsulation and transmission protocol, or an index containing all attribute information of the media is defined.
Further, the data stream of the multimedia data includes any one or more of the following: outer-layer information, description and indication information, and data content information. The outer-layer information defines the file type and content compatibility of the multimedia data; the description and indication information describes and indicates the multimedia data; and the data content information carries the specific content of the multimedia data.
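In ISOBMFF, one of the containers named above, a file is a sequence of boxes, each beginning with a 32-bit big-endian size and a four-character type. The sketch below walks the top-level boxes of a byte string. Mapping the outer-layer, description, and data content information onto `ftyp`, `moov`, and `mdat` boxes is an assumption made here for illustration; `iter_boxes` and the toy payloads are hypothetical and do not form a valid media file.

```python
import struct


def iter_boxes(data: bytes):
    """Yield (box_type, payload) for each top-level box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("malformed box")
        yield box_type.decode("ascii"), data[offset + header:offset + size]
        offset += size


# Toy stream: assumed outer-layer (ftyp), description (moov), content (mdat).
sample = (
    struct.pack(">I4s", 12, b"ftyp") + b"isom"
    + struct.pack(">I4s", 8, b"moov")
    + struct.pack(">I4s", 10, b"mdat") + b"\x00\x01"
)
boxes = list(iter_boxes(sample))
print([box_type for box_type, _ in boxes])
```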
In addition, this application provides a method for receiving multimedia data under multiple degrees of freedom, comprising: receiving the encapsulated multimedia data, parsing it according to the encapsulation and transmission protocol in the reverse of the sending method, and processing the multimedia data according to the parsed content.
Further, the receiving method includes:
S1: receiving the media content data of the multimedia data and parsing it according to the encapsulation and transmission protocol to obtain the description and indication information of the multimedia data;
S2: identifying the media content data according to the description and indication information;
S3: according to the media content type, parsing and obtaining the data group number description information and/or the media data type description information and/or the track type description information for that media content type;
S4: obtaining the media data type description information, and parsing and obtaining the description information of the association relationships between different data types;
S5: based on the description information of the different media data types and the data group number description information, obtaining the complete number corresponding to each parsed data type;
S6: according to the number of data groups of each type of data, obtaining in full the index information corresponding to each data type in the parsed information, and obtaining the required media content from the track type description information obtained in S3, the description information of the association relationships between data types obtained in S4, and the index information description information of each data type obtained in S5.
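Steps S2 to S6 can be pictured as a walk over description information that S1 has already parsed. The following is a minimal sketch under that assumption; the dictionary layout, its keys, and the `resolve` helper are hypothetical and are not defined by the protocol.

```python
# Hypothetical description information, as it might look after S1 has parsed
# the encapsulated stream. All keys and values are illustrative.
description = {
    "media_type": "dynamic_point_cloud",
    "track_type": "multi",
    "group_counts": {"geometry": 1, "occupancy": 1, "additional": 1, "texture": 2},
    "relations": {"replacement": [("texture", 0), ("texture", 1)]},
    "index": {"geometry": [1], "occupancy": [2], "additional": [3], "texture": [4, 5]},
}


def resolve(desc):
    """Walk S2 to S6 over already-parsed description information."""
    media_type = desc["media_type"]      # S2: identify the media content
    track_type = desc["track_type"]      # S3: track type description
    relations = desc["relations"]        # S4: relations between data types
    counts = dict(desc["group_counts"])  # S3/S5: number of groups per data type
    located = {}
    for data_type, count in counts.items():  # S6: index entries per data type
        entries = desc["index"].get(data_type, [])
        if len(entries) != count:
            raise ValueError(f"{data_type}: expected {count} index entries")
        located[data_type] = entries
    return media_type, track_type, relations, located


media_type, track_type, relations, located = resolve(description)
print(media_type, track_type, located["texture"])
```

The check in the loop reflects the idea that the number of index entries for a data type should match its declared number of data groups.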
In addition, this application provides a media processor, comprising:
a storage module, a receiving module, a parsing module, and a data processing module, configured to receive multimedia data and to parse and process it according to an encapsulation and transmission protocol, the encapsulation and transmission protocol comprising:
determining attribute information of the multimedia data, which includes: determining the data types for the different media types of the multimedia data; determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in different media data; and determining a corresponding indexing method and index information for each item of the attribute information.
In addition, this application provides a player, comprising:
a storage module, a receiving module, a parsing module, and a data processing module, configured to receive multimedia data and to parse and process it according to an encapsulation and transmission protocol, the encapsulation and transmission protocol comprising:
determining attribute information of the multimedia data, which includes: determining the data types for the different media types of the multimedia data; determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in different media data; and determining a corresponding indexing method and index information for each item of the attribute information.
Functions and effects of this application:
The sending method, the receiving method, the multimedia data system under multiple degrees of freedom, the media processor, and the player provided by this application address the problem that existing protocols mainly target traditional media and do not support new media, in particular their new attributes. For the new features and attributes of multi-degree-of-freedom media, the application provides a newly designed immersive media system framework for encapsulation. By defining and describing the important features and attributes of the new media, it extends existing protocols so that they can accommodate the diversified media data types and the diversified association relationships between data units under multiple degrees of freedom, remain compatible with new multi-degree-of-freedom media content, and retain a degree of extensibility. It also provides a corresponding system framework design, thereby supporting the storage and transmission of new media, enabling devices and applications to support them, and enabling effective use of multi-degree-of-freedom media data streams.
Description of the Drawings
Figure 1 is a schematic comparison of traditional immersive media content and the atlas-based implementation;
Figure 2 is a block diagram of the content of a point cloud data stream;
Figure 3-1 is a framework diagram of the media system design in the traditional scheme;
Figure 3-2 is a framework diagram of the multi-degree-of-freedom immersive media system design of this application;
Figure 4-1 is a single-track design diagram for ISOBMFF-based data transmission of an atlas in the embodiment;
Figure 4-2 is a schematic diagram of the data stream targeted by the atlas single-track design of Figure 4-1;
Figure 5-1 is a single-track design diagram for ISOBMFF-based data transmission of a point cloud in the embodiment;
Figure 5-2 is a schematic diagram of the data stream targeted by the point cloud single-track design of Figure 5-1;
Figure 6-1 is a multi-track design diagram for ISOBMFF-based data transmission of an atlas in the embodiment;
Figure 6-2 is a schematic diagram of the data stream targeted by the atlas multi-track design of Figure 6-1;
Figure 7-1 is a multi-track design diagram for ISOBMFF-based data transmission of a point cloud in the embodiment;
Figure 7-2 is a schematic diagram of the data stream targeted by the point cloud multi-track design of Figure 7-1;
Figure 8 is a flowchart of multi-degree-of-freedom media data parsing;
Figure 9 is a flowchart of data parsing for specific media content;
Figure 10 is a schematic diagram of the functional module structure of the multi-degree-of-freedom immersive media system.
Detailed Description
This application is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand this application, but do not limit it in any form.
As multi-degree-of-freedom immersive media, the multimedia data targeted by this application has the following characteristics:
(1) Diverse data types.
The traditional video stream in Figure 3-1 consists of consecutive image frames, whereas immersive media under the new multiple degrees of freedom has several kinds of constituent elements. For example, the atlas shown in Figure 1, which newly appears in 3Dof+ immersive media content, contains texture and depth information; the point cloud shown in Figure 2, which constitutes 6Dof content, contains texture map information, geometry map information, occupancy map information, and additional information. As Figures 1 and 2 show, immersive media under multiple degrees of freedom can be presented correctly only through an effective combination of several types of data, and the existing encapsulation metadata cannot accurately describe the attributes of these different data types.
(2) Diverse association relationships between data units.
The traditional video units in Figure 3-1 are ordered along a timeline. By contrast, as the framework diagram of Figure 3-2 shows, the different types of data in immersive media under the new degrees of freedom can form many combinations and associations. Take the atlas of Figure 1, which constitutes 3Dof+ content: one set of texture and depth forms the basic atlas and another set forms a supplementary atlas, and combining content from both yields free-viewpoint video. Take the point cloud of Figure 2, which constitutes 6Dof content: the geometry map information, occupancy map information, and additional information suffice to recover the geometric structure of the point cloud, and in some cases only the geometric structure is recovered and no texture information is used. Combining a geometric structure with different texture map information yields point clouds with different textures on the same geometric structure, and these associations between content attributes can be used to implement functions such as changing the skin of a character model. The metadata of the existing encapsulation protocols therefore needs to be extended to describe such complex relationships.
Because of these characteristics, when designing an immersive media system framework for the new multiple degrees of freedom, the description structure of the media data to be encapsulated and transmitted must include a description of the content of the corresponding multi-degree-of-freedom immersive media data stream in order to support diverse media content.
To achieve the above objective, this application provides a method for sending multimedia data under multiple degrees of freedom, comprising: encapsulating the multimedia data according to an encapsulation and transmission protocol, and transmitting the encapsulated multimedia data. The encapsulation and transmission protocol comprises: determining attribute information of the multimedia data, which includes determining the data types for the different media types of the multimedia data, determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located, and determining the association relationships among multiple data contents in different media data; and determining a corresponding indexing method and index information for each item of the attribute information.
As Figure 3-2 shows, the immersive media system design framework adds the following descriptions for the multimedia data (also called the multimedia data stream): 1. the media type; 2. the number of media stream contents; 3. the content types of the media stream and the number of contents of each type; 4. the association relationships between media contents; and 5. the content indexing method and index information. These are explained below.
1. Define a description of the media types under the new multiple degrees of freedom.
That is, the new media types that arise to support multiple degrees of freedom are described so that protocols and devices can correctly identify and process them. Adding description information about the media type to the media data stream provides guidance for designing the media data stream information, the processing structures, and the processors.
Table 1 lists the media types of the multimedia data in this embodiment. For example, new video types are added to ISOBMFF (ISO Base Media File Format), such as traditional two-dimensional video, atlas, point cloud, and light field, plus a reserved value for defining future media types, and each video type is described. Point clouds are further divided into dynamic point clouds and static point clouds.
Table 1
No.   Video type
1     Two-dimensional video (traditional video)
2     Atlas video
3     Dynamic point cloud
4     Static point cloud
5     Light field
6     Reserved (for defining new media types)
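The media type codes of Table 1 can be represented as an enumeration. This is a minimal sketch; the names `MediaType` and `parse_media_type` are hypothetical, and treating every unlisted code as reserved is an assumption made here, not something the text specifies.

```python
from enum import IntEnum


class MediaType(IntEnum):
    """Media type codes as listed in Table 1."""
    TWO_D_VIDEO = 1
    ATLAS_VIDEO = 2
    DYNAMIC_POINT_CLOUD = 3
    STATIC_POINT_CLOUD = 4
    LIGHT_FIELD = 5
    RESERVED = 6


def parse_media_type(code: int) -> MediaType:
    """Map a numeric code to a media type.

    Assumption: codes outside Table 1 are treated as reserved.
    """
    try:
        return MediaType(code)
    except ValueError:
        return MediaType.RESERVED


print(parse_media_type(3).name)
```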
2. Define the data types in the media data stream under the new multiple degrees of freedom and the number of each type.
The types and numbers of the different kinds of data contained in each new media type are described so that protocols and devices can correctly identify and process such media.
Table 2 maps each media type in this embodiment to its data types and numbers of data groups. A correspondence table such as Table 2 defines the new video types and describes the data attributes and quantities that each contains.
For example, new video types such as atlas, point cloud, and light field are defined in ISOBMFF, and the data attributes and quantities contained in each video type are described:
As shown in Table 2: item 2, an atlas contains texture and depth data; item 3, a dynamic point cloud contains texture, geometry, occupancy map, and additional information data; item 4, a static point cloud contains texture, geometry, and additional information data; and item 5, in current technical solutions a light field contains texture and angle data, which may be extended as light field research progresses.
As a further extension, if a video type contains several groups of data, the number of data groups can also be defined. For example, an atlas video may contain multiple atlases, a point cloud may contain multiple groups of point cloud data, and a light field may contain multiple groups of texture and angle data.
It should be noted that, in this application, the immersive media data stream under the new degrees of freedom is not limited to a single form of data content. To support system structure design for streams with multiple kinds of content, the immersive media system framework must describe the content types in the media data stream and the number of contents of each type.
Table 2
Figure PCTCN2021087805-appb-000001
Figure PCTCN2021087805-appb-000002
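The mapping from media type to data types described for Table 2, together with the optional number of data groups, can be sketched as a lookup table. The identifiers below are hypothetical, and defaulting to one group per data type when no count is given is an assumption made for illustration.

```python
# Data types per media type, as described in the text for Table 2.
DATA_TYPES = {
    "atlas_video": ("texture", "depth"),
    "dynamic_point_cloud": ("texture", "geometry", "occupancy", "additional"),
    "static_point_cloud": ("texture", "geometry", "additional"),
    "light_field": ("texture", "angle"),
}


def describe(media_type, group_counts=None):
    """Pair each data type of `media_type` with its number of data groups.

    Assumption: a data type without an explicit count has one group.
    """
    expected = DATA_TYPES[media_type]
    group_counts = group_counts or {}
    unknown = set(group_counts) - set(expected)
    if unknown:
        raise ValueError(f"unexpected data types for {media_type}: {sorted(unknown)}")
    return {data_type: group_counts.get(data_type, 1) for data_type in expected}


# One geometric structure with two alternative textures.
summary = describe("dynamic_point_cloud", {"texture": 2})
print(summary)
```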
3. Determine and identify the number and location information of the track media streams in which the multimedia data of each media type is located.
Define whether each type of media is carried in one media stream or distributed across several, that is, whether all data of a new media item is stored and transmitted in a single media stream, and define the address or position of each data item.
Table 3 maps the track type of the track media streams carrying the multimedia data to the positions of the data in this embodiment. For example, a track type is defined in ISOBMFF to describe whether each video is carried in one track or in at least two tracks.
Table 3
Figure PCTCN2021087805-appb-000003
Figure PCTCN2021087805-appb-000004
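The track information described above amounts to recording, for each data item, the track that carries it and its position in that track, and deriving the track type from those records. The sketch below illustrates this; `DataLocation`, its fields, and `track_type` are hypothetical names, and using an integer position is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLocation:
    data_type: str  # e.g. "texture" or "depth"
    track_id: int   # media track number carrying this data item
    position: int   # position of the data item within its track (assumed integer)


def track_type(locations):
    """Return 'single' if all data items share one track, otherwise 'multi'."""
    track_ids = {loc.track_id for loc in locations}
    if not track_ids:
        raise ValueError("no data locations given")
    return "single" if len(track_ids) == 1 else "multi"


# An atlas whose texture and depth share one track, and one that splits them.
single = [DataLocation("texture", 1, 0), DataLocation("depth", 1, 1)]
multi = [DataLocation("texture", 1, 0), DataLocation("depth", 2, 0)]
print(track_type(single), track_type(multi))
```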
4. Define the association relationships among multiple data contents in different media data.
When a media data stream contains several forms of data content, there may be multiple data items of each data type, with complex association relationships among them. To support the media throughout the system, from encapsulation and transmission to decoding and presentation, the association relationships among the contents of the data stream must be described so that the stream can be used correctly and practically in detailed design, implementation, and application.
Table 4 lists the association relationships among multiple media contents in the multimedia data. The relationships among multiple data contents in different media data are: mutual dependency, single dependency, and mutual replacement.
For example, the association relationships among the different data contained in each video type are defined and described in ISOBMFF:
1. Mutual dependency: the data items depend on each other and none can be omitted. For example, for item 2 in Table 4, the texture and depth data in an atlas depend on each other; for item 3 in Table 4, the geometry, occupancy map, and additional information in a dynamic point cloud depend on one another and jointly build the geometric skeleton of the point cloud.
2. Single dependency: a data item depends on some other data and is meaningless without it. For example, for item 3 in Table 4, the texture data in a dynamic point cloud depends on the geometric skeleton built jointly from the geometry, occupancy map, and additional information; for item 2 in Table 4, an additional atlas depends on the basic atlas.
3. Replacement: data items can replace one another. For example, for item 3 in Table 4, the same geometric skeleton of a dynamic point cloud can be paired with different texture data so that different "skins" appear on one skeleton; the different texture data are then in a replacement relationship.
In summary, the mutual-dependency relationships include: the texture and depth data in an atlas depend on each other; and the geometry, occupancy map, and additional information in a point cloud depend on one another and jointly build the geometric skeleton of the point cloud. The single-dependency relationships include: the texture data in a point cloud depends on the geometric skeleton built jointly from the geometry, occupancy map, and additional information; and an additional atlas depends on the basic atlas. The mutual-replacement relationships include: different texture data provided for the same point cloud geometric skeleton, which can replace one another.
The analysis above does not go through every media content of each data type in Table 4 one by one; Table 4 only gives preferred examples and does not limit this application.
Table 4
Figure PCTCN2021087805-appb-000005
Figure PCTCN2021087805-appb-000006
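The three relationship kinds described in this part can be held in a small structure from which a receiver works out what a given data item needs. This is an illustrative sketch only; the `Relations` class, the `required_for` helper, and the component names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Relations:
    mutual: list = field(default_factory=list)       # groups whose members need each other
    single: dict = field(default_factory=dict)       # item -> items it depends on
    replacement: list = field(default_factory=list)  # groups of interchangeable items


def required_for(component, rel):
    """Return every item that must be available before `component` can be used."""
    needed, stack = set(), [component]
    while stack:
        current = stack.pop()
        deps = set(rel.single.get(current, ()))
        for group in rel.mutual:
            if current in group:
                deps |= set(group) - {current}
        for dep in deps:
            if dep != component and dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return needed


# Dynamic point cloud: one geometric skeleton with two alternative "skins".
skeleton = {"geometry", "occupancy", "additional"}
point_cloud = Relations(
    mutual=[skeleton],
    single={"texture_a": skeleton, "texture_b": skeleton},
    replacement=[{"texture_a", "texture_b"}],
)
print(sorted(required_for("texture_a", point_cloud)))
```

Replacement groups impose no requirement of their own: either texture can be chosen once the skeleton components are present.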
5. Define the indexing method and index information of the media data stream under the new multiple degrees of freedom.
As the description above shows, the new media data has complex types, quantities, and association relationships. For ease of description, index information for the media data can be defined.
Table 5 maps the different media types of the multimedia data to the indexing methods and index information determined for each.
For example, the indexing method and index information for the data contents contained in each video type are defined in ISOBMFF; that is, the data composition and index information of the media are given, helping a device quickly parse the media type, components, quantities, and access information so that content can be obtained and processed effectively.
Table 5
Figure PCTCN2021087805-appb-000007
Figure PCTCN2021087805-appb-000008
In Table 5, taking item 2, atlas video, as an example: when the media type is atlas video and the data is distributed on a single-track structure, the Sample Table Box in the protocol is extended to add index information, namely the sample type and the sample index. This helps a device quickly parse the media type, components, quantity, and access information, so that content can be obtained effectively and processed accordingly.
Continuing with item 2, atlas video: when the media type is atlas video and the data is distributed on a multi-track structure, the Track Reference Box in the protocol (the same applies below) is extended to add index information, namely the track type and the track ID. This likewise helps a device quickly parse the media type, components, quantity, and access information, so that content can be obtained effectively and processed accordingly.
The indexing methods and index information for the single-track and multi-track structures of the other media types in Table 5 can be inferred by analogy and are not described again here.
As a further extension, the index information can serve as a collection of the newly defined attributes described above. These attributes can be described separately at different levels of the protocol file, or a single index containing all relevant information of the media can be defined, so that a device can read and parse it quickly.
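The two indexing variants described for Table 5 can be pictured as follows. This is a sketch under stated assumptions, not actual ISOBMFF box syntax: the class names, field names, track IDs, and the interleaved sample layout are hypothetical. Only the pairing of sample type with sample index (single-track) and of track type with track ID (multi-track) comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleIndexEntry:
    """Single-track case: index information added via the Sample Table Box."""
    sample_type: str   # e.g. "texture" or "depth"
    sample_index: int  # position of the sample within the single track


@dataclass(frozen=True)
class TrackIndexEntry:
    """Multi-track case: index information added via the Track Reference Box."""
    track_type: str  # e.g. "texture" or "depth"
    track_id: int    # ID of the track carrying that data type


def single_track_index(num_atlases: int) -> list:
    """Assumed layout: depth and texture samples interleaved per atlas."""
    entries = []
    for atlas in range(num_atlases):
        entries.append(SampleIndexEntry("depth", 2 * atlas))
        entries.append(SampleIndexEntry("texture", 2 * atlas + 1))
    return entries


def multi_track_index() -> list:
    """Assumed layout: one track per data type."""
    return [TrackIndexEntry("depth", 1), TrackIndexEntry("texture", 2)]
```

Either list could equally be stored as one consolidated index or split across levels of the file, which is the choice the preceding paragraph leaves open.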
In summary, when new multi-degree-of-freedom immersive media needs to be supported, the immersive media system framework given in this application adds a description of the multimedia data stream to the protocol and performs the corresponding processing. Embodiments One to Four, with reference to Figs. 4-1 to 7-2, describe the method for sending multimedia data under multiple degrees of freedom, the method for receiving it, the multimedia data system under multiple degrees of freedom, and the media processor and player, so that the media content consumer ultimately obtains the new multi-degree-of-freedom immersive media experience.
The four ISOBMFF-based embodiments listed below (atlas single-track, point cloud single-track, atlas multi-track, and point cloud multi-track) are optional solutions and do not limit the scope of this application.
[Embodiment One]
Fig. 4-1 is a single-track design diagram for ISOBMFF-based data transmission of an atlas in the embodiment. Fig. 4-2 is a schematic diagram of the data stream addressed by the atlas single-track design of Fig. 4-1.
For the single-track design of the atlas, as shown in Fig. 4-1, the outer-layer information may be represented by any field; here the field ftyp is used. ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility. The description indication information may be represented by any field; here the field moov is used. moov is the data box for the media content description information in the file and contains the various information describing the transmitted media content. The data content information may be represented by any field; here the field mdat is used. mdat holds the actual media data content, and the content of moov describes and indicates the media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
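As background for the ftyp, moov, and mdat layout described above, the following sketch walks the top-level boxes of an ISOBMFF byte string. It relies only on the standard box header (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to end of file"). The helper names `make_box` and `iter_top_level_boxes` are hypothetical, and the payloads in the usage note are toy values rather than valid box contents.

```python
import struct


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a plain 8-byte header (32-bit size + 4-char type)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_top_level_boxes(data: bytes):
    """Yield (type, offset, size) for each top-level box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit size follows the type field
            if offset + 16 > len(data):
                raise ValueError("truncated 64-bit box header")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header:
            raise ValueError("invalid box size")
        yield box_type.decode("ascii"), offset, size
        offset += size
```

For instance, concatenating `make_box(b"ftyp", b"isom")`, `make_box(b"moov", b"")`, and `make_box(b"mdat", bytes(4))` and passing the result to `iter_top_level_boxes` lists the three boxes in file order, which is where a reader would look for the descriptions this application adds to moov.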
The form of the media data content is shown in Fig. 4-2, which indicates that the current data stream contains "n" atlases. Based on this form, the moov data box shown in Fig. 4-1 is extended with the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information.
Specifically, a description "miv" of the atlas media type is added in moov, indicating that the current media data stream is an atlas data stream (miv). The track type is indicated as single-track. The data types present in the current media data stream are indicated as texture and depth. A description of the data quantity is added, indicating that the current data stream contains "n" atlases, each containing one depth layer and one texture layer. The position of the corresponding data in each atlas is indicated: the position in the track of the depth layer "depth 0" of the first atlas, and the position in the track of the texture layer "texture 0" of the first atlas. The texture and depth positions of every other atlas are indicated in the same way. Information on the association relationships between data in the media data stream is added: for example, atlas 0, which contains the basic view blocks, is essential data, while the atlases containing the other supplementary view blocks are supplementary content that depends on atlas 0 and, together with atlas 0, restores the miv image of the corresponding viewpoint.
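The description that Embodiment One adds to moov can be sketched as a plain data structure. The key names, the use of consecutive sample positions, and the function `describe_miv_single_track` are assumptions made for illustration; the application does not fix a concrete syntax.

```python
def describe_miv_single_track(num_atlases: int) -> dict:
    """Build a hypothetical moov-side description for `num_atlases` atlases."""
    positions = {}
    depends_on = {}
    for i in range(num_atlases):
        # Assumed layout: depth i and texture i are stored back to back.
        positions[f"depth {i}"] = 2 * i
        positions[f"texture {i}"] = 2 * i + 1
        # Atlas 0 holds the basic view blocks; every other atlas depends on it.
        depends_on[f"atlas {i}"] = [] if i == 0 else ["atlas 0"]
    return {
        "media_type": "miv",
        "track_type": "single",
        "data_types": ["texture", "depth"],
        "atlas_count": num_atlases,
        "positions": positions,
        "depends_on": depends_on,
    }
```

A receiver holding such a description would know, before touching mdat, how many atlases to expect, where each layer sits, and that atlas 0 must be available before any supplementary atlas is used.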
[Embodiment Two]
Fig. 5-1 is a single-track design diagram for ISOBMFF-based data transmission of a point cloud in the embodiment. Fig. 5-2 is a schematic diagram of the data stream addressed by the point cloud single-track design of Fig. 5-1.
For the single-track design of the point cloud, as shown in Fig. 5-1, the outer-layer information may be represented by any field; here the field ftyp is used. ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility. The description indication information may be represented by any field; here the field moov is used. moov is the data box for the media content description information in the file and contains the various information describing the transmitted media content. The data content information may be represented by any field; here the field mdat is used. mdat holds the actual media data content, and the content of moov describes and indicates the media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
The form of the media data content is shown in Fig. 5-2. In mdat, point cloud data group 0 to point cloud data group n each contain two sets of textures (texture 01 and texture 02), geometry, an occupancy map, and additional information.
Based on this form of data content, the moov data box shown in Fig. 5-1 is extended with the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information.
Specifically, for the single-track design of the point cloud, as shown in Fig. 5-1, a description "point cloud" of the point cloud media type is added in the moov structure, indicating that the current media data stream is a point cloud data stream (vpcc). The track type is indicated as single-track. The data types present in the current media data stream are indicated as four types: texture, geometry, occupancy map, and additional information. A description of the data quantity is added, indicating that the current data stream contains "t" textures and "n" each of geometry, occupancy map, and additional information. The positions are indicated: the position of texture 1 in the track, the position of texture 2 in the track, the position of geometry 1 in the track, and so on, until all four types of data have been indicated. Information on the association relationships between data in the media data stream is added: for example, geometry 0, occupancy map 0, and additional information 0 of the same point cloud frame 0 depend on one another and jointly restore geometric structure 0 of that frame, while the restoration of texture 0 depends on the restoration of geometric structure 0; that is, texture 0 depends on geometry 0, occupancy map 0, and additional information 0.
It is worth noting that, in the conventional usage scenario, the same structure 0 corresponds to a single texture; this is a variant of Embodiment Two in which the quantities of texture, geometry, occupancy map, and additional information are all n. In other, extended usage scenarios, the same structure 0 can correspond to different textures; this is the case in Embodiment Two above, where there are t textures and n each of geometry, occupancy map, and additional information. Structure 0 can correspond to texture 00, texture 01, and texture 02. A typical application is changing the skin of a point cloud character model, and the different textures corresponding to the same geometric structure are alternatives to one another. In Fig. 5-2, each point cloud group contains one or more sets of texture data, so the number of texture data items t is greater than the number n of the other data types (geometry, occupancy map, and additional information).
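The re-skinning scenario above amounts to choosing one of several alternative textures for a shared geometric structure. A minimal sketch follows, with hypothetical names throughout (`PointCloudGroup`, `select`, `texture_count`, and the string labels):

```python
from dataclasses import dataclass, field


@dataclass
class PointCloudGroup:
    geometry: str
    occupancy: str
    additional: str
    # Alternative textures for the same structure; one is used at a time.
    textures: list = field(default_factory=list)

    def select(self, texture_name: str) -> tuple:
        """Return the components needed to render with the chosen texture."""
        if texture_name not in self.textures:
            raise KeyError(f"unknown texture: {texture_name}")
        return (self.geometry, self.occupancy, self.additional, texture_name)


def texture_count(groups) -> int:
    """Total number of textures t across all groups."""
    return sum(len(g.textures) for g in groups)
```

With one group carrying three textures and another carrying one, `texture_count` gives t = 4 against n = 2 groups, which illustrates why t can exceed n once any structure has more than one texture.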
[Embodiment Three]
Fig. 6-1 is a multi-track design diagram for ISOBMFF-based data transmission of an atlas in the embodiment. Fig. 6-2 is a schematic diagram of the data stream addressed by the atlas multi-track design of Fig. 6-1.
For the multi-track design of the atlas, as shown in Fig. 6-1, the outer-layer information may be represented by any field; here the field ftyp is used. ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility. The description indication information may be represented by any field; here the field moov is used. moov is the data box for the media content description information in the file and contains the various information describing the transmitted media content. The data content information may be represented by any field; here the field mdat is used. mdat holds the actual media data content, and the content of moov describes and indicates the media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
The form of the media data content is shown in Fig. 6-2. Atlas data 0 to atlas data n are distributed over track 1 (Track-1) and track 2 (Track-2), and each atlas contains one geometry component (depth, in this embodiment) and one texture. Based on this form of data content, the moov data box shown in Fig. 6-1 is extended with the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information.
Specifically, as shown in Fig. 6-1, a description "miv" of the atlas media type is added in the moov structure, indicating that the current media data stream is an atlas data stream (miv). The track type is indicated as multi-track. The data types present in the current media data stream are indicated as texture and depth. A description of the data quantity is added, indicating that the current data stream contains "n" atlases, each containing one depth layer and one texture layer. For each atlas, the track carrying each data type and the position within that track are indicated: the depth layer "depth 0" of the first atlas is in the track of type depth, at an indicated position in that track, and the texture layer "texture 0" of the first atlas is in the track of type texture, at an indicated position in that track. The texture and depth positions of every other atlas are indicated in the same way. Information on the association relationships between data in the media data stream is added: for example, atlas 0, which contains the basic view blocks, is essential data, while the atlases containing the other supplementary view blocks are supplementary content that depends on atlas 0 and, together with atlas 0, restores the miv image of the corresponding viewpoint.
[Embodiment Four]
Fig. 7-1 is a multi-track design diagram for ISOBMFF-based data transmission of a point cloud in the embodiment. Fig. 7-2 is a schematic diagram of the data stream addressed by the point cloud multi-track design of Fig. 7-1.
For the multi-track design of the point cloud, as shown in Fig. 7-1, the outer-layer information may be represented by any field; here the field ftyp is used. ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility. The description indication information may be represented by any field; here the field moov is used. moov is the data box for the media content description information in the file and contains the various information describing the transmitted media content. The data content information may be represented by any field; here the field mdat is used. mdat holds the actual media data content, and the content of moov describes and indicates the media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
The form of the media data content is shown in Fig. 7-2. Point cloud data 0 to point cloud data n are distributed over track 1 to track 5 (Track-1 to Track-5). The point cloud data contains t textures and n each of geometry, occupancy map, and additional information: the first group of textures is carried in Track-1, the second group of textures in Track-2, and the geometry, occupancy map, and additional information in Track-3 to Track-5 respectively. Based on this form of data content, the moov data box shown in Fig. 7-1 is extended with the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information. Specifically:
As shown in Fig. 7-1, a description "point cloud" of the point cloud media type is added in the moov structure, indicating that the current media data stream is a point cloud data stream (vpcc). The track type is indicated as multi-track. The data types present in the current media data stream are indicated as four types: texture, geometry, occupancy map, and additional information. A description of the data quantity is added, indicating that the current data stream contains "t" textures and "n" each of geometry, occupancy map, and additional information. The type of the track in which each item is located and its position in that track are indicated: texture 0 is in track 1, whose type is texture, at an indicated position; texture 1 is in track 1, whose type is texture, at an indicated position; geometry 0 is in track 3, whose type is geometry, at an indicated position; and so on, until all four types of data have been indicated. Information on the association relationships between data in the media data stream is added: for example, geometry 0, occupancy map 0, and additional information 0 of the same point cloud frame 0 depend on one another and jointly restore geometric structure 0 of that frame, while the restoration of texture 0 depends on the restoration of geometric structure 0; that is, texture 0 depends on geometry 0, occupancy map 0, and additional information 0.
As in Embodiment Two, in the conventional usage scenario the same structure 0 corresponds to a single texture; this is a variant of Embodiment Four in which the quantities of texture, geometry, occupancy map, and additional information are all n. In other, extended usage scenarios, the same structure 0 can correspond to different textures; this is the case in Embodiment Four above, where there are t textures and n each of geometry, occupancy map, and additional information. Structure 0 can correspond to texture 00, texture 01, and texture 02. A typical application is changing the skin of a point cloud character model, and the different textures corresponding to the same geometric structure are alternatives to one another.
In Fig. 7-2, each point cloud group contains one or more sets of texture data, so the number of texture data items t is greater than the number n of the other data types (geometry, occupancy map, and additional information).
Fig. 8 is a flow chart of multi-degree-of-freedom media data parsing, illustrating the method for receiving multimedia data under multiple degrees of freedom. As shown in Fig. 10, this application provides a multi-degree-of-freedom immersive media system comprising a sending side and a server side. The server side includes a receiving module, a parsing module, and a data processing module. After the sending side has sent the encapsulated media file, the server side receives it through the receiving module, first parses the protocol of the encapsulated media file, and then processes the media data content according to the parsed content. Specifically, as shown in Fig. 8:
S1: After the sending side has modified the corresponding content in the data encapsulation and transmission protocol, the server side receives the media file data through the receiving module and parses the relevant protocol to obtain the description information of the media content data.
S2: The data processing module processes the media content data according to the description information parsed in S1. It first determines the media content, based on the parsed media type description information.
S3: According to the new multi-degree-of-freedom media content type determined in S2, the parsed data group quantity description information, media data type description information, and track type description information for that content are obtained.
S4: Once the data type description information has been obtained in S3, the description information of the association relationships between the different data types is obtained from the parsed information.
S5: Guided by the description information of the different data types and of the number of data groups, the quantity corresponding to each parsed data type is obtained in full.
S6: Based on the numbers of data groups of the different data types obtained in S5, the index information corresponding to each data type is obtained in full from the parsed information. Then, using together the track type description information obtained in S3, the association relationship description information between data types obtained in S4, and the index information of each data type obtained in S5, the required media content is restored at the data processing end.
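Steps S2 to S6 can be read as a fixed order of lookups on the parsed description. The sketch below assumes the description has already been parsed in S1 into a dictionary; every key name, the set `NEW_MEDIA_TYPES`, and the function `plan_restoration` are hypothetical and serve only to show the ordering and the completeness check.

```python
NEW_MEDIA_TYPES = {
    "dynamic point cloud", "static point cloud", "atlas video", "light field",
}


def plan_restoration(description: dict) -> dict:
    """Collect, in S2-S6 order, what is needed to restore the media content."""
    # S2: determine the media content from the media type description.
    media_type = description["media_type"]
    if media_type not in NEW_MEDIA_TYPES:
        raise ValueError(f"not a multi-degree-of-freedom media type: {media_type}")
    # S3: number of data groups, data types, and track type.
    group_count = description["group_count"]
    data_types = description["data_types"]
    track_type = description["track_type"]
    # S4: association relationships between the data types.
    relations = description["relations"]
    # S5: quantity of each data type.
    counts = {t: description["counts"][t] for t in data_types}
    # S6: index information for every item of every data type.
    index = {t: description["index"][t][: counts[t]] for t in data_types}
    missing = [t for t in data_types if len(index[t]) != counts[t]]
    if missing:
        raise ValueError(f"incomplete index information for: {missing}")
    return {
        "media_type": media_type,
        "group_count": group_count,
        "track_type": track_type,
        "relations": relations,
        "index": index,
    }
```

The check at the end reflects the purpose the text gives for the quantity information: a mismatch between the declared quantity and the available index entries is detected before restoration starts.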
Fig. 9 is a flow chart of data parsing for specific media content. For dynamic point cloud (a in Fig. 9), static point cloud (b in Fig. 9), atlas video (c in Fig. 9), and light field (d in Fig. 9), the flow includes the following steps:
Step T1: The media type is determined from the media type description information, according to the media types already defined in the encapsulated content. If it is a traditional video media type, it is processed according to the existing immersive media processing flow. If it is one of the new multi-degree-of-freedom immersive media types (dynamic point cloud, static point cloud, atlas video, or light field), the media content processing flow corresponding to the parsed media type is used.
Step T2: Once the media type has been determined, the processing flow and processor for that media type are started, and the number of media content data groups, the media content types corresponding to the media content, and the track type used in transmission are obtained. For a dynamic point cloud, as shown in Fig. 9(a), there are four media content types: texture, geometry, occupancy map, and additional information. For a static point cloud, as shown in Fig. 9(b), there are three: texture, geometry, and additional information. For atlas video, as shown in Fig. 9(c), there are two: texture and depth. For a light field, as shown in Fig. 9(d), there are currently two: texture and angle.
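Steps T1 and T2 reduce to a lookup from media type to the data types the receiver should expect. A minimal sketch follows; the table contents come from the text, while the names `COMPONENT_TYPES` and `select_pipeline` and the labels `"multi-dof"` and `"legacy"` are hypothetical.

```python
# Expected data types per media type, as listed for Fig. 9(a) to 9(d).
COMPONENT_TYPES = {
    "dynamic point cloud": ("texture", "geometry", "occupancy map", "additional information"),
    "static point cloud": ("texture", "geometry", "additional information"),
    "atlas video": ("texture", "depth"),
    "light field": ("texture", "angle"),
}


def select_pipeline(media_type: str) -> tuple:
    """T1: pick the processing flow; T2: list the expected data types."""
    if media_type in COMPONENT_TYPES:
        return "multi-dof", COMPONENT_TYPES[media_type]
    # Any other type falls back to the existing processing flow.
    return "legacy", ()
```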
Step T3: After the data types for the media type have been obtained, the quantity of each media data type is parsed in combination with the number of media data groups. The number of media data groups assists in obtaining the quantity of each media data type and prevents missing content, while the quantity of each media data type guides the data parsing end in parsing the different types of data completely, preventing content loss that would degrade the restored media video.
Step T4: After the number of data groups and the quantity of each data type have been obtained, the index information and association relationships of the corresponding data types are parsed and, together with the earlier track type determination, used to combine the data. The data is combined as follows:
T4.1: As shown in branch a of Fig. 9, for a dynamic point cloud, according to the association relationships between data types, the geometry, occupancy map, and additional information of the same group of dynamic point cloud data depend on one another to restore the geometric shape of the dynamic point cloud, and the restoration of the texture depends on the restoration of the geometric shape. The same group of dynamic point cloud data can have multiple sets of corresponding texture information but only one set of geometry, occupancy map, and additional information. When the track type is single-track, the geometry, occupancy map, and additional information of the same group are first located in the track according to the index information and used to restore the point cloud geometric shape; the different texture data under the same group is then indexed as needed to find the required texture data, and the texture information is restored on the basis of the point cloud geometry, occupancy map, and additional information. When the track type is multi-track, the tracks carrying the geometry, occupancy map, additional information, and texture are first located through the track type index in the index information, and the data of each type is found in its track through the data type index. The geometry, occupancy map, and additional information belonging to the same group are first located in the tracks of the corresponding types and used to restore the point cloud geometric shape; the different texture data belonging to the same group is then indexed in the corresponding texture tracks as needed to find the required texture data, and the texture information is restored on the basis of the point cloud geometry, occupancy map, and additional information.
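The combination order for a dynamic point cloud in T4.1 can be sketched as below. The index layout, the key names, and `locate_components` are hypothetical. The sketch does no decoding; it only returns where each component would be read from, with the geometric skeleton first and the chosen texture last.

```python
SKELETON = ("geometry", "occupancy map", "additional information")


def locate_components(index: dict, track_type: str, group: int, texture: str) -> list:
    """Return (component, track_id, position) tuples in restoration order.

    `index` maps a component name to {"track": id, "positions": {group: pos}}.
    For a single-track stream all components are expected to share one track.
    """
    order = list(SKELETON) + [texture]  # skeleton first, then the chosen texture
    located = []
    for name in order:
        entry = index[name]
        located.append((name, entry["track"], entry["positions"][group]))
    if track_type == "single" and len({track for _, track, _ in located}) != 1:
        raise ValueError("single-track stream spans more than one track")
    return located
```

Because only the requested texture is looked up, the other textures of the same group are never read, which matches the alternative relationship between textures.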
T4.2: As shown in branch b of Fig. 9, for a static point cloud, according to the association relationships between data types, the geometry and additional information of the same group of static point cloud data depend on each other to restore the geometric shape of the static point cloud, and the restoration of the texture depends on the restoration of the geometric shape. When the track type is single-track, the geometry and additional information of the same group are first located in the track according to the index information and used to restore the point cloud geometric shape; the texture data under the same group is then indexed as needed to find the required texture data, and the texture information is restored on the basis of the point cloud geometry and additional information. When the track type is multi-track, the tracks carrying the geometry, additional information, and texture are first located through the track type index in the index information, and the data of each type is found in its track through the data type index. The geometry and additional information belonging to the same group are first located in the tracks of the corresponding types and used to restore the point cloud geometric shape; the texture data belonging to the same group is then indexed in the corresponding texture track as needed to find the required texture data, and the texture information is restored on the basis of the point cloud geometry and additional information.
T4.3: As shown in branch c of Figure 9, for an atlas video, according to the association relationships between the data types, the depth and texture of the same group of atlas data depend on each other and together restore the atlas video content. When the track type is single-track, the texture and depth of the same group are found in the track according to the index information and are combined to restore the image. When the track type is multi-track, the tracks containing the texture and depth are first found through the track type index according to the index information, and the data of the corresponding type is found in each track through the data type index. The texture and depth of the same group of atlas data then together restore the content of that group of atlases.
T4.4: As shown in branch d of Figure 9, for a light field, according to the association relationships between the data types, the angle, texture, and extended information of the same group of light field data depend on one another and together restore the content of the light field. When the track type is single-track, the texture, angle, and extended information of the same group are found in the track according to the index information and are combined to restore the image. When the track type is multi-track, the tracks containing the texture, angle, and extended information are first found through the track type index according to the index information, and the data of the corresponding type is found in each track through the data type index. The texture, angle, and extended information of the same group of light field data then together restore the content of that group of light field data.
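The lookup and recovery order described in step T4 can be illustrated with a minimal Python sketch. It is not the encapsulation format of this application: the names `REQUIRED_TYPES`, `Sample`, `locate_group`, and `recovery_stages`, and the string labels for media and data types, are hypothetical and chosen only for illustration. The sketch assumes the index information has already been parsed into a mapping from data type to track number; in the single-track case every data type maps to the same track.

```python
from dataclasses import dataclass

# Data types per media type, following the branches of step T4.
# The string labels are illustrative assumptions, not protocol fields.
REQUIRED_TYPES = {
    "dynamic_point_cloud": ["geometry", "occupancy_map", "additional_info", "texture"],
    "static_point_cloud": ["geometry", "additional_info", "texture"],
    "atlas_video": ["texture", "depth"],
    "light_field": ["texture", "angle", "extended_info"],
}


@dataclass(frozen=True)
class Sample:
    """One piece of media data as located by the (hypothetical) index."""
    track_id: int
    data_type: str
    group_id: int
    payload: str


def locate_group(samples, track_index, media_type, group_id):
    """Find every data type of one group.

    track_index maps data_type -> track_id. For a single track all
    values are equal; for multiple tracks they differ.
    """
    found = {}
    for data_type in REQUIRED_TYPES[media_type]:
        track_id = track_index[data_type]
        match = next(
            (s for s in samples
             if s.track_id == track_id
             and s.data_type == data_type
             and s.group_id == group_id),
            None,
        )
        if match is None:
            raise KeyError(f"{data_type} of group {group_id} not in track {track_id}")
        found[data_type] = match
    return found


def recovery_stages(media_type):
    """Return the data types grouped by recovery stage.

    Point clouds: the shape-related data is recovered first, texture second.
    Atlas video and light field: all data types are combined in one stage.
    """
    types = REQUIRED_TYPES[media_type]
    if media_type in ("dynamic_point_cloud", "static_point_cloud"):
        return [[t for t in types if t != "texture"], ["texture"]]
    return [list(types)]
```

Keeping the track lookup behind a single `track_index` mapping lets the same code path serve both the single-track and multi-track cases.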
Step 5 (T5): According to the number of media data items of each type and the number of media data types, the parsing and combination of all media data is completed in sequence, and the immersive media video content under the new multiple degrees of freedom is finally presented.
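As a rough illustration of step T5, the following Python sketch enumerates every (media type, group, data type) combination that has to be parsed and combined. The function name `parsing_plan` and the dictionary layout of `description` are assumptions made for this example; the actual description information is carried by the encapsulation protocol.

```python
def parsing_plan(description):
    """List every (media_type, group_id, data_type) to be parsed, in order.

    description is a hypothetical parsed form of the description
    information: {media_type: {"group_count": int, "data_types": [str, ...]}}
    """
    plan = []
    for media_type, info in description.items():
        for group_id in range(info["group_count"]):
            for data_type in info["data_types"]:
                plan.append((media_type, group_id, data_type))
    return plan
```

The length of the returned plan is the sum, over the media types, of the group count multiplied by the number of data types, which is the quantity step T5 iterates over.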
The inventive concept of this application, the described embodiments, and the scope of this application enable the immersive media system to provide system-architecture support for the implementation of upcoming immersive media 3DoF+ and 6DoF experiences and for the application of the related technologies.
It should be noted that although this embodiment uses encapsulation protocols such as ISOBMFF, together with atlas-based and point-cloud-based technologies, as examples to illustrate the proposed immersive media 3DoF+ and 6DoF metadata and its structure, parameter content, data, and the encapsulation and transmission thereof, the form and content of the immersive media data under the new multiple degrees of freedom in this embodiment may also be encapsulated and transmitted using other formats, parameter expressions, and files, for example using MMT or SMT transmission, using ISOBMFF encapsulation, or using an extension based on OMAF (omnidirectional media application format, the application format for panoramic media), without affecting the expression of the core technology of this application.
As shown in Figure 10, this application provides a multi-degree-of-freedom immersive media system comprising a sender side and a server side. The server comprises a receiving module, a parsing module, and a data processing module. After the sender has finished sending the encapsulated media file, the server receives the media file through its receiving end, first parses the encapsulated media file protocol, and then processes the media data content according to the parsed content.
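The division of the server into receiving, parsing, and data processing modules can be sketched as follows. This is an assumption-laden illustration only: JSON stands in for the encapsulated media file, and the class and function names (`send`, `ReceivingModule`, `ParsingModule`, `DataProcessingModule`, `Server`) are hypothetical rather than taken from this application.

```python
import json


def send(media: dict) -> bytes:
    """Sender side: JSON is a stand-in for the real encapsulation."""
    return json.dumps(media).encode("utf-8")


class ReceivingModule:
    def receive(self, packet: bytes) -> bytes:
        # A real receiver would read from a network or file source.
        return packet


class ParsingModule:
    def parse(self, packet: bytes) -> dict:
        # Recover the description information from the encapsulated file.
        return json.loads(packet.decode("utf-8"))


class DataProcessingModule:
    def process(self, parsed: dict) -> str:
        # Process the media data according to the parsed content.
        return f"{parsed['media_type']}: {len(parsed['groups'])} group(s)"


class Server:
    """Chains the three modules in the order receive, parse, process."""

    def __init__(self):
        self.receiver = ReceivingModule()
        self.parser = ParsingModule()
        self.processor = DataProcessingModule()

    def handle(self, packet: bytes) -> str:
        raw = self.receiver.receive(packet)
        parsed = self.parser.parse(raw)
        return self.processor.process(parsed)
```

For example, `Server().handle(send({"media_type": "atlas_video", "groups": [{"texture": "t", "depth": "d"}]}))` passes one atlas group through all three modules.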
As shown in Figure 10, a processor and a memory coupled to the processor are provided. When executing the computer-readable program in the memory, the processor may be configured to perform the method and system for receiving multimedia data with multiple degrees of freedom described in conjunction with Figures 1 to 9.
Those skilled in the art know that, in addition to implementing the system provided by this application and its devices, modules, and units purely as computer-readable program code, the method steps can be logically programmed so that the system and its devices, modules, and units implement the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, the system provided by this application and its devices, modules, and units may be regarded as a hardware component, and the devices, modules, and units included in it for implementing various functions may be regarded as structures within the hardware component; the devices, modules, and units for implementing various functions may also be regarded both as software modules implementing the method and as structures within the hardware component.
Those skilled in the art will further appreciate that the various illustrative logic blocks, modules, circuits, and algorithm steps described in conjunction with the embodiments disclosed herein can be implemented as electronic hardware, computer software, or a combination of the two. To clearly illustrate this interchangeability of hardware and software, the various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the specific application and the design constraints imposed on the overall system. Skilled persons may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of this application.
The various illustrative logic modules and circuits described in conjunction with the embodiments disclosed herein can be implemented or executed with a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors cooperating with a DSP core, or any other such configuration.
The steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integrated into the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in the user terminal.
In one or more exemplary embodiments, the described functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates the transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example and not limitation, such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is also properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Claims (15)

  1. A method for sending multimedia data with multiple degrees of freedom, comprising:
    encapsulating the multimedia data according to an encapsulation and transmission protocol, the encapsulation and transmission protocol comprising:
    determining attribute information of the multimedia data, including: determining data types for the different media types of the multimedia data; determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in the different media data; and
    determining a corresponding indexing mode and index information for each item of the attribute information; and
    transmitting the encapsulated multimedia data.
  2. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein the data form of the multimedia data includes the 3DoF+ mode and/or the 6DoF mode; and the encapsulation and transmission is applicable to the MPEG Media Transport (MMT) mode, the Smart Media Transport (SMT) mode, the ISO base media file format (ISOBMFF), or an extension of the omnidirectional media application format (OMAF).
  3. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein the different media types of the multimedia data include any one or more of the following: traditional two-dimensional video, atlas video, dynamic point cloud, static point cloud, and light field.
  4. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein determining the data type of the multimedia data comprises:
    when the media type is atlas video, the data types include texture data and depth data;
    when the media type is dynamic point cloud, the data types include texture, geometry, occupancy map, and additional information data;
    when the media type is static point cloud, the data types include texture, geometry, and additional information data;
    when the media type is light field, the data types include texture data and angle data.
  5. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein determining the data type of the multimedia data further comprises:
    determining, for each data type, the number of data groups of that data type.
  6. The method for sending multimedia data with multiple degrees of freedom according to claim 5, wherein the correspondence between the numbers of data groups of different data types comprises:
    one structure corresponding to one and the same texture; or
    one structure corresponding to different textures that are alternatives to one another.
  7. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located comprises:
    defining a track type indicating whether the multimedia data of each media type is in one track or in at least two tracks, wherein
    for a single track: defining the media track number in which the multimedia data is located, and defining the specific position in the track of each data item in the multimedia data;
    for at least two tracks: defining the media track number in which each data item contained in the multimedia data is located, and defining the specific position in the track of each data item in the multimedia data.
  8. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein the association relationships among multiple data contents in the different media data comprise:
    mutual dependence between data contents, and/or single dependence between data contents, and/or mutual replacement between data contents.
  9. The method for sending multimedia data with multiple degrees of freedom according to claim 8, wherein:
    the mutually dependent association relationships include: the texture and depth data in an atlas depend on each other; and the geometry, occupancy map, and additional information in a point cloud depend on one another to jointly construct the geometric skeleton of the point cloud;
    the singly dependent association relationships include: the texture data in a point cloud depends on the geometric skeleton jointly constructed from the geometry, occupancy map, and additional information; and an additional atlas depends on a basic atlas; and
    the mutually replaceable association relationships include: for the same point cloud geometric skeleton, different texture data are provided for replacement.
  10. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein the index information includes the set of the above attribute information, and the attribute information is either described separately at different levels of the encapsulation and transmission protocol, or described by defining an index that contains all the attribute information of the media.
  11. The method for sending multimedia data with multiple degrees of freedom according to claim 1, wherein the data stream of the multimedia data concerned includes any one or more of the following: outer layer information, description indication information, and data content information;
    wherein the outer layer information is used to define the file type and content compatibility of the multimedia data; the description indication information is used to describe and indicate the multimedia data; and the data content information is used to carry the specific content of the multimedia data.
  12. A method for receiving multimedia data with multiple degrees of freedom, comprising:
    receiving encapsulated multimedia data, parsing it according to the inverse of the encapsulation and transmission protocol of claim 1, and processing the multimedia data according to the parsed content.
  13. The method for receiving multimedia data with multiple degrees of freedom according to claim 12, comprising:
    S1: receiving the media content data of the multimedia data and parsing it according to the encapsulation and transmission protocol to obtain the description indication information of the multimedia data;
    S2: determining the media content data according to the description indication information;
    S3: according to the media content type, parsing to obtain the data group quantity description information and/or the media data type description information and/or the track type description information under the corresponding media content type;
    S4: obtaining the media data type description information, and parsing to obtain the description information of the association relationships between the different data types;
    S5: based on the description information of the different media data types and the data group quantity description information, obtaining in full the parsed quantity corresponding to each data type;
    S6: according to the number of data groups of the different types of data, obtaining in full the index information corresponding to each data type in the parsed information, and obtaining the required media content according to the track type description information obtained in S3, the description information of the association relationships between the data types obtained in S4, and the index information description information of each data type obtained in S5.
  14. A media processor, comprising:
    a storage module, a receiving module, a parsing module, and a data processing module, configured to receive multimedia data and to parse and process it according to an encapsulation and transmission protocol, the encapsulation and transmission protocol comprising:
    determining attribute information of the multimedia data, including: determining data types for the different media types of the multimedia data; determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in the different media data; and
    determining a corresponding indexing mode and index information for each item of the attribute information.
  15. A player, comprising:
    a storage module, a receiving module, a parsing module, and a data processing module, configured to receive multimedia data and to parse and process it according to an encapsulation and transmission protocol, the encapsulation and transmission protocol comprising:
    determining attribute information of the multimedia data, including: determining data types for the different media types of the multimedia data; determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in the different media data; and
    determining a corresponding indexing mode and index information for each item of the attribute information.
PCT/CN2021/087805 2020-04-16 2021-04-16 Multimedia data transmission and reception methods, system, processor, and player WO2021209044A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010301699.0A CN113542907B (en) 2020-04-16 2020-04-16 Multimedia data transceiving method, system, processor and player
CN202010301699.0 2020-04-16

Publications (1)

Publication Number Publication Date
WO2021209044A1 true WO2021209044A1 (en) 2021-10-21

Family

ID=78084686

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/087805 WO2021209044A1 (en) 2020-04-16 2021-04-16 Multimedia data transmission and reception methods, system, processor, and player

Country Status (2)

Country Link
CN (1) CN113542907B (en)
WO (1) WO2021209044A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023169003A1 (en) * 2022-03-11 2023-09-14 腾讯科技(深圳)有限公司 Point cloud media decoding method and apparatus and point cloud media coding method and apparatus

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116800987A (en) * 2022-03-14 2023-09-22 中兴通讯股份有限公司 Data processing method, apparatus, device, storage medium, and program product

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080250047A1 (en) * 2007-04-03 2008-10-09 Nokia Corporation System and method for using multiple meta boxes in the iso base media file format
CN101978699A (en) * 2008-01-25 2011-02-16 电子部品研究院 Stereoscopic video file format and computer readable recording medium in which stereoscopic video file is recorded according thereto
CN102005231A (en) * 2010-09-08 2011-04-06 东莞电子科技大学电子信息工程研究院 Storage method of rich-media scene flows
CN108271068A (en) * 2016-12-30 2018-07-10 华为技术有限公司 A kind of processing method and processing device of the video data based on stream media technology
CN109155874A (en) * 2016-05-23 2019-01-04 佳能株式会社 The method, apparatus and computer program of the self adaptation stream transmission of virtual reality media content

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7402743B2 (en) * 2005-06-30 2008-07-22 Body Harp Interactive Corporation Free-space human interface for interactive music, full-body musical instrument, and immersive media controller
US9922680B2 (en) * 2015-02-10 2018-03-20 Nokia Technologies Oy Method, an apparatus and a computer program product for processing image sequence tracks
CN105536268A (en) * 2015-12-15 2016-05-04 广州中国科学院先进技术研究所 Six-degree-of-freedom virtual reality dynamic seat and seat platform
US10389999B2 (en) * 2016-02-17 2019-08-20 Qualcomm Incorporated Storage of virtual reality video in media files
CN106178551B (en) * 2016-06-27 2018-01-30 山东大学 A kind of real-time rendering interactive movie theatre system and method based on multi-modal interaction
GB2563865A (en) * 2017-06-27 2019-01-02 Canon Kk Method, device, and computer program for transmitting media content
US20190104326A1 (en) * 2017-10-03 2019-04-04 Qualcomm Incorporated Content source description for immersive media data
US10559126B2 (en) * 2017-10-13 2020-02-11 Samsung Electronics Co., Ltd. 6DoF media consumption architecture using 2D video decoder
CN113178019B (en) * 2018-07-09 2023-01-03 上海交通大学 Indication information identification method, system and storage medium based on video content
CN110944222B (en) * 2018-09-21 2021-02-12 上海交通大学 Method and system for immersive media content as user moves
CN110971906B (en) * 2018-09-29 2021-11-30 上海交通大学 Hierarchical point cloud code stream packaging method and system

Also Published As

Publication number Publication date
CN113542907A (en) 2021-10-22
CN113542907B (en) 2022-09-23

Similar Documents

Publication Publication Date Title
JP5022443B2 (en) Method of decoding metadata used for playback of stereoscopic video content
CN107682688B (en) Video real-time recording method and recording equipment based on augmented reality
US9224246B2 (en) Method and apparatus for processing media file for augmented reality service
US20120288257A1 (en) Image processing device, information recording medium, image processing method, and program
JP2020099045A (en) Method and device for overlaying 3d graphic over 3d video
WO2021209044A1 (en) Multimedia data transmission and reception methods, system, processor, and player
CN106713988A (en) Beautifying method and system for virtual scene live
EP2242262A2 (en) Data structure, recording medium, playback apparatus and method, and program
US10965928B2 (en) Method for 360 video processing based on multiple viewpoints and apparatus therefor
CN106303289A (en) A kind of real object and virtual scene are merged the method for display, Apparatus and system
KR20100002048A (en) Image processing method, image outputting method, and apparatuses thereof
CN114697668B (en) Encoding and decoding method of point cloud media and related products
KR20200017534A (en) Method and apparatus for transmitting and receiving media data
JP2012244622A (en) Content converter, content conversion method and storage medium thereof
CN110971906A (en) Hierarchical point cloud code stream packaging method and system
US20230048715A1 (en) Point cloud data encapsulation method and point cloud data transmission method
US20230224533A1 (en) Mapping architecture of immersive technologies media format (itmf) specification with rendering engines
WO2023207119A1 (en) Immersive media processing method and apparatus, device, and storage medium
CN113891117A (en) Immersion medium data processing method, device, equipment and readable storage medium
CN102474650B (en) Reproduction apparatus of stereovision video, integrated circuit, and reproduction method
US20140369422A1 (en) Remultiplexing Bitstreams of Encoded Video for Video Playback
US11937070B2 (en) Layered description of space of interest
JP2012513148A (en) Method for transmitting data relating to stereoscopic video, method for reproducing stereoscopic video, and method for generating file of stereoscopic video data
US11122252B2 (en) Image processing device, display device, information recording medium, image processing method, and program for virtual reality content
WO2023169003A1 (en) Point cloud media decoding method and apparatus and point cloud media coding method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21789486

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21789486

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/05/2023)
