WO2021209044A1 - Multimedia data transmission and reception methods, system, processor, and player - Google Patents
- Publication number
- WO2021209044A1 (application PCT/CN2021/087805)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g. 3D video
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
Definitions
- This application belongs to the field of immersive multimedia, and specifically relates to a method for sending and receiving multimedia data with multiple degrees of freedom, a multimedia data system with multiple degrees of freedom, and a media processor and player.
- VR: virtual reality
- HMD: head-mounted display
- The immersive media produced by a VR system represents a virtual space in which users can interact as naturally as in the real world.
- Virtual reality renders the visual and auditory stimuli of the real world and presents them to users. The user looks around from a viewing region (viewport) in a three-dimensional space and, at the same time, receives the audio associated with that viewport.
- Traditional immersive media system design mainly targets omnidirectional video transmission under 3DoF, where the degree of freedom (DoF) refers to the freedom that the content consumer has during the media experience.
- When experiencing 3DoF media content, the consumer has exactly three degrees of freedom: head rotations about the three coordinate axes of a three-dimensional rectangular coordinate system whose origin is the consumer's head.
- The media technologies used to realize this experience are those related to omnidirectional video, and the system is designed around the data being transmitted, namely 2D image frames in the traditional video form. As a result, the media content that the system structure supports is relatively uniform.
- The 3DoF+ media experience adds limited head displacement to the three rotational degrees of freedom; that is, consumers of immersive media content can obtain different media content by moving within a certain range.
- The device can sense the parallax produced by the displacement, and the system can feed back, in real time, the different media content that results from the parallax so that it matches the consumer's movement.
- 3DoF+ video is produced from content captured by multiple cameras deployed in anticipation of user displacement.
- The depth scene presented by 3DoF+ media is obtained by synthesizing 2D images, where each 2D image consists of a texture component and a corresponding depth component. Depth information can be captured directly by camera equipment or derived indirectly by algorithms. Alternatively, a 3DoF+ view can be synthesized from a planar image of a background area and multiple (non-planar) foreground images.
- For 3DoF+, current media content processing and data representation mainly rely on atlas technology.
- The international standards group MPEG has already adopted atlas-related technology.
- This type of solution forms an atlas from texture components and the corresponding depth components for encapsulation and transmission.
- An atlas aggregates rectangular blocks from one or more 2D images into an image pair, which contains a texture component image and a corresponding depth component image.
- The images of different viewpoints captured by cameras at different angles are trimmed to obtain a basic atlas containing basic image blocks and additional atlases containing supplementary image blocks.
- The basic atlas and additional atlas 1 are used to generate field-of-view image 1.
- The basic atlas and additional atlas 2 are used to generate field-of-view image 2.
- While still providing the corresponding media functions, the atlas approach reduces the amount of data that needs to be transmitted to some extent and gives a better reconstruction result on the user side.
- 6DoF is a richer immersive media experience that builds on 3DoF and 3DoF+.
- It adds displacement along the three coordinate axes of a three-dimensional space whose origin is the consumer.
- Traditional video media content processing can no longer meet these requirements.
- The media content and technologies for realizing 6DoF media experiences, mainly point clouds and light fields, are still at the exploration stage.
- Figure 2 shows point cloud data content as an example of 6DoF immersive media data content (6DoF video).
- What is presented is the surface information of an object obtained by scanning, including three-dimensional coordinate data, depth information, color information, and so on; this information forms a geometric skeleton from which the point cloud is then presented.
- There are point cloud compression algorithms for static and for dynamic point cloud data, as well as for different categories of point cloud data, such as data intended for machine perception and data intended for human visual perception.
- A typical point cloud compression approach converts 3D point cloud data into 2D image data and then processes that data; one such algorithm is Video-based Point Cloud Compression (VPCC).
- This compression method first projects the 3D point cloud onto a 2D plane to obtain occupancy map information, geometric information, attribute information, and auxiliary information.
- The attribute information usually includes texture information and color information. The compressed information is therefore usually divided into four categories of data for transmission: geometric information, attribute information, occupancy map information, and auxiliary information.
- The decoding of geometric information relies on occupancy map information and auxiliary information.
- The decoding of attribute information relies on geometric information, occupancy map information, and auxiliary information.
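The decoding dependencies stated above imply an ordering constraint on the four categories of data. The following is a minimal sketch of that constraint only, not of any actual VPCC decoder; the component names are illustrative labels chosen here, and Python's standard `graphlib` module is used to derive one valid order.

```python
from graphlib import TopologicalSorter

# Decoding dependencies as stated in the text: each key can be decoded only
# after every component in its value set is available.
# The string labels are illustrative names, not identifiers from any standard.
DECODE_DEPENDENCIES = {
    "occupancy_map": set(),
    "auxiliary": set(),
    "geometry": {"occupancy_map", "auxiliary"},
    "attribute": {"geometry", "occupancy_map", "auxiliary"},
}


def decode_order(dependencies):
    """Return one valid decoding order, with dependencies listed first."""
    return list(TopologicalSorter(dependencies).static_order())


order = decode_order(DECODE_DEPENDENCIES)
print(order)
```

Whatever order is produced, geometry comes after the occupancy map and auxiliary information, and attribute data comes last.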
- Point cloud media must process different types of data simultaneously and, after integrating them, present media with rich spatial and texture characteristics to the user. As related technologies are explored further, the system also needs to add and update the corresponding content for 6DoF.
- A higher degree of freedom in the immersive media experience means more diverse information and data types. Whether the media is an atlas, a point cloud, or another form such as a light field, its information content is diverse.
- An immersive media system framework that was originally designed to support only a single content structure cannot effectively support the storage and transmission of the new multi-degree-of-freedom media content. Realizing the new immersive media experience under multiple degrees of freedom therefore requires new information and structures designed for the new media.
- To this end, this application proposes a multi-degree-of-freedom multimedia data sending method, a receiving method, a multi-degree-of-freedom multimedia data system, and a media processor and player.
- This application provides a method for sending multimedia data with multiple degrees of freedom, including: encapsulating the multimedia data according to an encapsulation transmission protocol.
- The encapsulation transmission protocol includes determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams of the track in which the multimedia data of the media type is located; determining the association relationship between multiple data contents in different media data; and determining the corresponding index method and index information for each item of attribute information.
- The encapsulated multimedia data is then transmitted.
- The data format of the multimedia data includes a 3DoF+ mode and/or a 6DoF mode. The encapsulation transmission is applicable to extensions of MPEG Media Transport (MMT), Smart Media Transport (SMT), the ISO Base Media File Format (ISOBMFF), or the Omnidirectional Media Format (OMAF).
- The different media types of the multimedia data include any one or more of the following: traditional two-dimensional video, atlas video, dynamic point cloud, static point cloud, and light field.
- Determining the data type of the multimedia data: when the media type is atlas video, the data types include texture data and depth data; when the media type is dynamic point cloud, the data types include texture, geometry, occupancy map, and additional information data; when the media type is static point cloud, the data types include texture, geometry, and additional information data; when the media type is light field, the data types include texture data and angle data.
- Determining the data type of the multimedia data further includes determining, for each data type, the number of data groups of that data type.
- The correspondence between the numbers of data groups of different data types includes: the same structure corresponds to the same texture; or the same structure corresponds to different textures that complement one another.
- Determining and identifying the number and location information of the media streams of the track in which the multimedia data of the media type is located includes: defining the track type, which indicates whether the multimedia data of each media type is in one track or in at least two tracks. When there is one track: define the number of the media track in which the multimedia data is located, and define the specific position in the track of each data item in the multimedia data. When there are at least two tracks: define the media track number of each data item contained in the multimedia data, and define the specific position in its track of each data item in the multimedia data.
- The association relationship includes: mutual dependence between data contents, and/or single dependence between data contents, and/or mutual replacement between data contents.
- The mutual dependence relationship includes: the texture and depth data in an atlas depend on each other; the geometry, occupancy map, and additional information in a point cloud depend on one another to jointly construct the point cloud's geometric skeleton.
- The single dependence relationship includes: the texture data in a point cloud depends on the geometric skeleton jointly constructed from the geometry, occupancy map, and additional information; an additional atlas depends on the basic atlas. The mutual replacement relationship includes: for the same point cloud geometric skeleton, different texture data can be swapped in for one another.
- The index information includes a collection of the attribute information above. The attribute information is either described separately at different levels of the encapsulation transmission protocol, or a single index is defined that contains all attribute information of the media.
- The data stream of the multimedia data includes any one or more of the following: outer layer information, description indication information, and data content information.
- The outer layer information defines the file type and content compatibility of the multimedia data; the description indication information describes and indicates the multimedia data; the data content information carries the specific content of the multimedia data.
- This application also provides a method for receiving multimedia data with multiple degrees of freedom, including: receiving the encapsulated multimedia data, parsing it according to the encapsulation transmission protocol (the inverse of the sending method), and processing the data according to the parsed content.
- The receiving method includes:
- S1 Receive the media content data of the multimedia data, parse it according to the encapsulation transmission protocol, and obtain the description indication information of the multimedia data;
- S2 Determine the media content data according to the description indication information;
- S3 According to the media content type, parse and obtain the description information of the number of data groups and/or the description information of the media data type and/or the description information of the track type;
- S4 Obtain the description information of the media data type, then parse and obtain the description information of the association relationship between different data types;
- S5 Based on different media data types Descriptive information and the number of data groups. Descriptive information.
- S6 Completely obtain the index information corresponding to each data type in the analysis information according to the number of data groups of different types of data, and obtain according to S3
- This application also provides a media processor, including:
- a storage module, a receiving module, an analysis module, and a data processing module, which are used to receive multimedia data and to analyze and process it according to the encapsulation transmission protocol.
- The encapsulation transmission protocol includes:
- Determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams of the track in which the multimedia data of the media type is located; determining the association relationship between multiple data contents in different media data; and determining the corresponding index method and index information for each item of attribute information.
- This application also provides a player, including:
- a storage module, a receiving module, an analysis module, and a data processing module, which are used to receive multimedia data and to analyze and process it according to the encapsulation transmission protocol.
- The encapsulation transmission protocol includes:
- Determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams of the track in which the multimedia data of the media type is located; determining the association relationship between multiple data contents in different media data; and determining the corresponding index method and index information for each item of attribute information.
- Existing protocols mainly target traditional media and do not support new media, in particular their new attributes. To address this problem, this application provides a newly encapsulated and designed immersive media system framework for the new features and attributes of new media with multiple degrees of freedom. It extends the existing protocol by defining and describing the important features and attributes of the new media, so that the protocol can accommodate the diversified media data types and the diversified relationships between data units under the new degrees of freedom, is better compatible with the new multi-degree-of-freedom media content, offers a degree of scalability, and provides a corresponding system framework design.
- The solution supports the storage and transmission of new media, enables devices and applications to support new media, and makes effective use of multi-degree-of-freedom media data streams.
- Figure 1 is a schematic diagram comparing traditional immersive media content with the atlas technology;
- Figure 2 is a block diagram of the data flow of point cloud technology;
- Figure 3-1 is a framework diagram of the media system design in the traditional scheme;
- Figure 3-2 is a framework diagram of the multi-degree-of-freedom immersive media system design in this application;
- Figure 4-1 is an ISOBMFF-based data transmission single-track design diagram of the atlas in the embodiment;
- Figure 4-2 is a schematic diagram of the data flow under the single track of the atlas in Figure 4-1;
- Figure 5-1 is an ISOBMFF-based data transmission single-track design diagram of the point cloud in the embodiment;
- Figure 5-2 is a schematic diagram of the data flow under the single track of the point cloud in Figure 5-1;
- Figure 6-1 is an ISOBMFF-based data transmission multi-track design diagram of the atlas in the embodiment;
- Figure 6-2 is a schematic diagram of the data flow under the multi-track of the atlas in Figure 6-1;
- Figure 7-1 is an ISOBMFF-based data transmission multi-track design diagram of the point cloud in the embodiment;
- Figure 7-2 is a schematic diagram of the data flow under the multi-track of the point cloud in Figure 7-1;
- Figure 8 is a flow chart of multi-degree-of-freedom media data analysis;
- Figure 9 is a flow chart of data analysis corresponding to specific media content;
- Figure 10 is a schematic diagram of the functional module structure of the multi-degree-of-freedom immersive media system.
- The multimedia data targeted by this application has the following characteristics:
- The traditional video stream in Figure 3-1 is composed of consecutive image frames.
- By contrast, for immersive media under the new multiple degrees of freedom, the atlas content that newly appears in the 3DoF+ immersive media content shown in Figure 1 contains texture and depth information, and the 6DoF point cloud shown in Figure 2 contains texture map information, geometric map information, occupancy map information, and additional information.
- As Figures 1 and 2 show, the multi-degree-of-freedom immersive media targeted by this application requires multiple types of data to be combined effectively before it can be presented correctly, and the metadata of the original data encapsulation cannot accurately describe the attributes of these different types of data.
- Figure 3-2 is a framework diagram of the multi-degree-of-freedom immersive media system design in this application. It shows that the different types of data of immersive media under the new degrees of freedom can form a variety of combined association relationships.
- In the 3DoF+ atlas of Figure 1, one set of textures and depths forms the basic atlas, and another set of textures and depths forms a supplementary atlas.
- The content of the basic atlas and the supplementary atlas can be combined to form a free-viewpoint video.
- The 6DoF point cloud in Figure 2 can be restored using geometric map information, occupancy map information, and additional information.
- Without texture information, only the geometric structure information can be restored.
- Geometric structure information can be combined with different texture map information to obtain point clouds with different textures under the same geometric structure.
- Functions such as re-skinning a character model can be realized by using the association relationships between the attributes of related content. The metadata of the original encapsulation protocol therefore needs to be extended to support the description of such complex relationships.
- This application provides a method for sending multimedia data with multiple degrees of freedom, including: encapsulating the multimedia data according to an encapsulation transmission protocol.
- The encapsulation transmission protocol includes determining the attribute information of the multimedia data, which includes: determining the data type for the different media types of the multimedia data; determining and identifying the number and location information of the media streams of the track in which the multimedia data of the media type is located; determining the association relationship between multiple data contents in different media data; and determining the corresponding index method and index information for each item of attribute information.
- The encapsulated multimedia data is then transmitted.
- As shown in Figure 3-2, the immersive media system design framework needs to add a new description of the multimedia data (also called the multimedia data stream): 1. the media type; 2. the quantity of content in the media stream; 3. the types of content in the media stream and the quantity of each; 4. the association relationships between the media contents; and 5. the content indexing method and index information. Specifically, the description includes the following:
- Table 1 is the media type table of the multimedia data in this embodiment.
- In ISOBMFF (the ISO Base Media File Format), new video types are added: traditional two-dimensional video, atlas, point cloud, and light field, plus a reserved value for defining future new media.
- Table 1. Video type:
- 1: Two-dimensional video (traditional video)
- 2: Atlas video
- 3: Dynamic point cloud
- 4: Static point cloud
- 5: Light field
- 6: Reserved (used to define new media types)
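As an illustration of how a parser might represent the video type codes of Table 1, here is a minimal sketch. The numeric values follow Table 1; the member names and the helper `parse_video_type` are hypothetical, and treating every undefined value (not only 6) as reserved is an assumption made here.

```python
from enum import IntEnum


class VideoType(IntEnum):
    """Video type codes following Table 1; member names are illustrative."""
    TWO_D_VIDEO = 1
    ATLAS_VIDEO = 2
    DYNAMIC_POINT_CLOUD = 3
    STATIC_POINT_CLOUD = 4
    LIGHT_FIELD = 5


def parse_video_type(code):
    """Map a numeric code to a known video type.

    Returns None for the reserved value 6 and, as an assumption of this
    sketch, for any other value that Table 1 does not define.
    """
    try:
        return VideoType(code)
    except ValueError:
        return None


print(parse_video_type(2))
print(parse_video_type(6))
```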
- Table 2 is the table of data types and numbers of data groups determined for the different media types in this embodiment.
- Table 2 describes each new video type, together with the types and quantity of data that each video type contains:
- 2. the atlas contains texture and depth data;
- 3. the dynamic point cloud contains texture, geometry, occupancy map, and additional information data;
- 4. the static point cloud video contains texture, geometry, and additional information data;
- 5. in current technical solutions, the light field contains texture and angle data, which may be extended in the future as light field research progresses.
- Because each video type may contain several groups of data, the number of data groups can also be defined.
- For example, an atlas video can contain multiple atlases, a point cloud can contain multiple sets of point cloud data, and a light field can contain multiple sets of texture and angle data.
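The per-media-type data types and group counts described above can be sketched as a small lookup plus a validation step. This is only an illustration under assumptions: the string labels, the `MediaDescription` class, and its `validate` method are hypothetical names introduced here, not part of the protocol.

```python
from dataclasses import dataclass

# Data types per media type, as listed in the text. Labels are illustrative.
DATA_TYPES_BY_MEDIA = {
    "atlas_video": ("texture", "depth"),
    "dynamic_point_cloud": ("texture", "geometry", "occupancy_map", "additional_info"),
    "static_point_cloud": ("texture", "geometry", "additional_info"),
    "light_field": ("texture", "angle"),
}


@dataclass
class MediaDescription:
    media_type: str
    group_counts: dict  # data type -> number of data groups of that type

    def validate(self):
        """Check that only data types known for this media type are described
        and that every stated group count is a positive integer."""
        allowed = set(DATA_TYPES_BY_MEDIA[self.media_type])
        unknown = set(self.group_counts) - allowed
        if unknown:
            raise ValueError(f"unexpected data types: {sorted(unknown)}")
        for name, count in self.group_counts.items():
            if not isinstance(count, int) or count < 1:
                raise ValueError(f"invalid group count for {name}: {count}")
        return True


# An atlas video with three atlases, each having one texture and one depth layer.
atlas = MediaDescription("atlas_video", {"texture": 3, "depth": 3})
print(atlas.validate())
```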
- The immersive media data stream under the new degrees of freedom is not limited to a single form of data content.
- The immersive media system framework designed for the new degrees of freedom therefore describes the types of content in the media data stream and the quantity of each.
- Each type of media may be carried in one media stream or distributed across multiple media streams. The framework distinguishes all the data of each new type of media within the media stream(s) for storage and transmission, and records the address or location of each data item.
- Table 3 is the table, in this embodiment, of the track type of the media stream of the track in which the multimedia data is located and of the location of the data.
- The track type is defined in ISOBMFF to describe whether each video is in one track or in at least two tracks.
- Table 4 is the table of association relationships between multiple media contents in the multimedia data. It determines the association relationships between multiple data contents in different media data: mutual dependence, single dependence, and mutual replacement.
- The mutual dependence relationship includes: the texture and depth data in an atlas depend on each other; the geometry, occupancy map, and additional information in a point cloud depend on one another to construct the point cloud's geometric skeleton.
- The single dependence relationship includes: the texture data in a point cloud depends on the geometric skeleton jointly constructed from the geometry, occupancy map, and additional information; an additional atlas depends on the basic atlas. The mutual replacement relationship includes: for the same point cloud geometric skeleton, different texture data can be swapped in for one another.
- Table 4 is only a preferred example and does not limit the application.
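The three kinds of association relationship can be modeled as tagged records that a receiver queries to find what a given data item needs. The sketch below is one possible in-memory representation under assumptions; the `Relation` and `Association` types, the item labels, and the helper `required_for` are hypothetical and are not defined by the application.

```python
from enum import Enum
from typing import NamedTuple


class Relation(Enum):
    MUTUAL_DEPENDENCE = "mutual dependence"
    SINGLE_DEPENDENCE = "single dependence"
    MUTUAL_REPLACEMENT = "mutual replacement"


class Association(NamedTuple):
    relation: Relation
    members: tuple          # items that share the relation
    depends_on: tuple = ()  # used for single dependence only


SKELETON = ("pc.geometry", "pc.occupancy_map", "pc.additional_info")

# Examples taken from the relationships listed in the text.
ASSOCIATIONS = [
    Association(Relation.MUTUAL_DEPENDENCE, ("atlas.texture", "atlas.depth")),
    Association(Relation.MUTUAL_DEPENDENCE, SKELETON),
    Association(Relation.SINGLE_DEPENDENCE, ("pc.texture",), SKELETON),
    Association(Relation.SINGLE_DEPENDENCE, ("atlas.additional",), ("atlas.basic",)),
    Association(Relation.MUTUAL_REPLACEMENT, ("pc.texture_a", "pc.texture_b")),
]


def required_for(item, associations=ASSOCIATIONS):
    """Return the set of items that `item` directly depends on."""
    needed = set()
    for assoc in associations:
        if item not in assoc.members:
            continue
        if assoc.relation is Relation.MUTUAL_DEPENDENCE:
            needed.update(m for m in assoc.members if m != item)
        elif assoc.relation is Relation.SINGLE_DEPENDENCE:
            needed.update(assoc.depends_on)
    return needed


print(sorted(required_for("pc.texture")))
```

Mutual replacement adds no requirement in this sketch: it only records that the listed items are interchangeable for the same geometric skeleton.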
- The new types of media data have complex types, quantities, and association relationships.
- Index information can therefore be defined for the media data.
- Table 5 is the table of the indexing methods and index information determined for the different media types of the multimedia data.
- ISOBMFF defines the indexing method between the data contents of each video type and the media index information; that is, it gives the data composition and index information of the media, helping the device quickly parse the media type, composition, quantity, and access information so that content can be obtained and processed effectively.
- Take atlas video as an example.
- When the media type is atlas video distributed over a multi-track structure, the protocol's Track Reference Box is extended (the same applies below).
- Index information, namely the track type and track ID, is added to help the device quickly parse the media type, composition, quantity, and access information so that content can be obtained and processed effectively.
- The index information can serve as a collection of the newly defined attributes above. This attribute information can be described at different levels of the protocol file, or a single index can be defined that contains all relevant information about the media, which makes it convenient for the device to read and parse quickly.
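To make the multi-track indexing idea concrete, the sketch below serializes a Track Reference Box in the general ISOBMFF layout: a `tref` container holding one reference-type box whose payload is a list of 32-bit track IDs. The four-character code `atxd` is a made-up placeholder, and the helper names are hypothetical; the application's actual extension fields are not reproduced here.

```python
import struct


def make_box(box_type, payload):
    """Serialize a basic box: 32-bit big-endian size, 4-byte type, payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly four bytes")
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def make_track_reference(reference_type, track_ids):
    """Build a 'tref' box containing one reference-type box of track IDs."""
    ids = b"".join(struct.pack(">I", track_id) for track_id in track_ids)
    return make_box(b"tref", make_box(reference_type, ids))


# b"atxd" is a placeholder reference type invented for this example.
box = make_track_reference(b"atxd", [2, 3])
print(len(box), box[4:8], box[12:16])
```

A reader of such a box learns which other tracks (here tracks 2 and 3) the current track refers to, which is the kind of track-ID index information the text describes.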
- The immersive media system framework given in this application adds a description of the multimedia data stream to the protocol and performs the corresponding processing. Embodiments 1 to 4, described with reference to Figures 4-1 to 7-2, illustrate the sending method and receiving method of multimedia data under multiple degrees of freedom, the multimedia data system under multiple degrees of freedom, and the media processor and player, so that the consumer side can ultimately obtain the media content and enjoy the immersive media experience under the new multiple degrees of freedom.
- Figure 4-1 is an ISOBMFF-based data transmission single-track design diagram of the atlas in the embodiment.
- Fig. 4-2 is a schematic diagram of the data flow targeted under the single track of the atlas in Fig. 4-1.
- the outer information can be represented by any field.
- the field ftyp is used to represent the outer information, where ftyp is the outermost data box of the encapsulated file to define the file Type and content compatibility.
- the description indication information can be represented by any field.
- the field moov is used to represent the description indication information.
- moov is the data box of the media content description information in the file, which contains various related information describing the transmission media content.
- the data content information can be represented by any field.
- the field mdat is used to represent the data content information.
- the mdat is the specific media data content information.
- the content contained in moov describes and indicates the specific media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
- the media data content format is shown in Figure 4-2, indicating that the number of atlases contained in the current data stream is "n".
- in the moov data box shown in Figure 4-1, new information is added about the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the association relationships between different data types, and the index information.
- the description "miv" of the atlas media type is added, indicating that the current media data stream is an atlas data stream (miv).
- the track type is indicated as single track; the data types present in the current media data stream are indicated as texture and depth; a description of the amount of data is added, indicating that the number of atlases contained in the current data stream is "n" and that each atlas contains a depth layer and a texture layer.
- the position of the corresponding data in each atlas is indicated, for example the position in the track of the depth layer "depth 0" of the first atlas and the position in the track of the texture layer "texture 0" of the first atlas, and so on until the positions of the texture and depth in every atlas have been indicated.
- information about the relationships between the data in the media data stream is added. For example, atlas 0, which contains the basic view blocks, is necessary data; the atlases containing the supplementary view blocks are supplementary content that depends on atlas 0 and, together with atlas 0, is used to restore the miv image of the corresponding viewpoint.
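The single-track atlas description can be illustrated with a minimal sketch. It is not the box syntax of this application: the class `AtlasEntry`, the function `build_single_track_description`, the dictionary keys, and the use of byte offsets as "positions" are all hypothetical, and the layout (depth layer followed by texture layer for each atlas) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtlasEntry:
    """Index entry for one atlas in a single track (hypothetical structure)."""
    atlas_id: int
    depth_offset: int      # assumed: byte offset of the depth layer in the track
    texture_offset: int    # assumed: byte offset of the texture layer in the track
    depends_on: List[int] = field(default_factory=list)


def build_single_track_description(layer_sizes):
    """Build a moov-like description from (depth_size, texture_size) pairs.

    Atlas 0 is treated as the necessary basic atlas; every other atlas is
    supplementary and depends on atlas 0, as in the embodiment above.
    """
    entries, offset = [], 0
    for atlas_id, (depth_size, texture_size) in enumerate(layer_sizes):
        depends_on = [] if atlas_id == 0 else [0]
        entries.append(
            AtlasEntry(atlas_id, offset, offset + depth_size, depends_on)
        )
        offset += depth_size + texture_size
    return {
        "media_type": "miv",
        "track_type": "single",
        "data_types": ["texture", "depth"],
        "atlas_count": len(entries),
        "entries": entries,
    }


description = build_single_track_description([(100, 300), (50, 150)])
```

A receiver holding such a description could read the depth and texture of any atlas directly from the recorded positions and would know to restore atlas 0 before any supplementary atlas.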
- Fig. 5-1 is a design diagram of a single track of ISOBMFF-based data transmission of the point cloud in the embodiment.
- Fig. 5-2 is a schematic diagram of the data flow targeted under the single track of the point cloud in Fig. 5-1.
- the outer information can be represented by any field.
- the field ftyp is used to represent the outer information, where ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility.
- the description indication information can be represented by any field.
- the field moov is used to represent the description indication information.
- moov is the data box of the media content description information in the file, which contains various related information describing the transmission media content.
- the data content information can be represented by any field.
- the field mdat is used to represent the data content information.
- the mdat is the specific media data content information.
- the content contained in moov describes and indicates the specific media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
- the media data content format is shown in Figure 5-2.
- point cloud data group 0 to point cloud data group n each contain 2 sets of textures (texture 01, texture 02), a geometry, an occupancy map, and additional information.
- the description "point cloud" of the point cloud media type is added to the moov structure, indicating that the current media data stream is a point cloud data stream (vpcc).
- the track type is indicated as single track, and the data types present in the current media data stream are indicated as texture, geometry, occupancy map and additional information.
- a description of the amount of data is added, indicating that the number of textures contained in the current data stream is "t", and the numbers of geometries, occupancy maps and additional information are all "n".
- the restoration of texture depends on the restoration of geometric structure 0; that is, texture information 0 depends on geometry 0, occupancy map 0 and additional information 0.
- the same structure 0 can correspond to the same texture; in this modification of the second embodiment, the numbers of textures, geometries, occupancy maps and additional information are all n.
- the same structure 0 can also correspond to different textures; in that case, as in the second embodiment, the number of textures is t, and the numbers of geometries, occupancy maps and additional information are all n.
- Structure 0 can correspond to texture 00, texture 01, and texture 02.
- a typical application scenario is point cloud character model skinning, where the different textures corresponding to the same geometric structure are mutually complementary.
- each point cloud data group contains one or more sets of texture data, so the number of texture data items t can exceed the number n of the other data types (geometry, occupancy map, and additional information).
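The relationship between the texture count t and the count n of the other data types can be sketched as follows. The function names and the dictionary layout of a point cloud group are hypothetical and only illustrate the counting and dependency rules stated above.

```python
def count_point_cloud_data(groups):
    """Count data items per type for a list of point cloud data groups.

    Each group is assumed (for illustration) to be a dict whose "textures"
    entry lists one or more texture sets; every group has exactly one
    geometry, one occupancy map and one additional-information item.
    """
    n = len(groups)
    t = sum(len(group["textures"]) for group in groups)
    return {
        "texture": t,
        "geometry": n,
        "occupancy_map": n,
        "additional_info": n,
    }


def texture_dependencies(group_id):
    """Data that must be restored before any texture of the given group."""
    return [
        ("geometry", group_id),
        ("occupancy_map", group_id),
        ("additional_info", group_id),
    ]


groups = [
    {"textures": ["texture 01", "texture 02"]},
    {"textures": ["texture 01", "texture 02"]},
]
counts = count_point_cloud_data(groups)
```

With two texture sets per group, the texture count is twice the count of every other data type; with one texture set per group, all four counts equal n.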
- Fig. 6-1 is an ISOBMFF-based data transmission multi-track design diagram of the atlas in the embodiment.
- Fig. 6-2 is a schematic diagram of the data flow targeted under the multi-track of the atlas in Fig. 6-1.
- the outer information can be represented by any field.
- the field ftyp is used to represent the outer information, where ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility.
- the description indication information can be represented by any field.
- the field moov is used to represent the description indication information.
- moov is the data box of the media content description information in the file, which contains various related descriptions of the transmission media content.
- the data content information can be represented by any field; here the field mdat is used to represent it. mdat is the specific media data content information, and the content contained in moov describes and indicates the specific media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
- Atlas data 0 to atlas data n are distributed on track 1 (Track-1) and track 2 (Track-2), and each atlas includes a geometry (in this embodiment, depth) and a texture.
- new information is added about the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the relationships between different data types, and the index information.
- the description "miv" of the atlas media type is added to the moov structure, indicating that the current media data stream is an atlas data stream (miv).
- the track type is indicated as multi-track; the data types present in the current media data stream are indicated as texture and depth; a description of the data quantity is added, indicating that the number of atlases contained in the current data stream is "n" and that each atlas contains a depth layer and a texture layer.
- Fig. 7-1 is an ISOBMFF-based data transmission multi-track design diagram of the point cloud in the embodiment.
- Fig. 7-2 is a schematic diagram of the data flow under the multi-track point cloud in Fig. 7-1.
- the outer information can be represented by any field.
- the field ftyp is used to represent the outer information, where ftyp is the outermost data box of the encapsulated file and defines the file type and content compatibility.
- the description indication information can be represented by any field.
- the field moov is used to represent the description indication information.
- moov is the data box of the media content description information in the file, which contains various related descriptions of the transmission media content.
- the data content information can be represented by any field; here the field mdat is used to represent it. mdat is the specific media data content information, and the content contained in moov describes and indicates the specific media data content in mdat. This application adds, in the moov structure, description information about the media data content contained in mdat.
- Point cloud data 0 to point cloud data n are distributed on track 1 to track 5 (Track-1 to Track-5).
- the point cloud data contains t textures, and the numbers of geometries, occupancy maps and additional information are all n. The first group of textures is distributed in Track-1, the second group of textures in Track-2, and the geometry, occupancy map and additional information in Track-3 to Track-5, respectively.
- new information is added about the media content type, the media track type, the number of media data groups, the media data types and their corresponding quantities, the relationships between different data types, and the index information. Specifically:
- the description "point cloud" of the point cloud media type is added to the moov structure, indicating that the current media data stream is a point cloud data stream (vpcc).
- a description of the amount of data is added, indicating that the number of textures contained in the current data stream is "t", and the numbers of geometries, occupancy maps and additional information are all "n".
- the position of each data item is indicated: texture 0 is in track 1, whose type is texture, at its corresponding position; texture 1 is in track 1, whose track type is texture, at its corresponding position; geometry 0 is in track 3, whose type is geometry, at its corresponding position; and so on, until the indications for the four different types of data are complete.
- the restoration of texture depends on the restoration of geometric structure 0; that is, texture information 0 depends on geometry 0, occupancy map 0 and additional information 0.
- the same structure 0 can correspond to the same texture; in this modification of the fourth embodiment, the numbers of textures, geometries, occupancy maps and additional information are all n.
- the same structure 0 can also correspond to different textures; in that case, as in the fourth embodiment, the number of textures is t, and the numbers of geometries, occupancy maps and additional information are all n.
- Structure 0 can correspond to texture 00, texture 01, and texture 02.
- a typical application scenario is point cloud character model skinning, where the different textures corresponding to the same geometric structure are mutually complementary.
- each set of point clouds contains one or more sets of texture data, so the number of texture data items t can exceed the number n of the other data types (geometry, occupancy map, and additional information).
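For the multi-track point cloud case, a track type index could look roughly like the sketch below. The table `TRACK_TYPE_INDEX` follows the track assignment stated above (two texture tracks, then geometry, occupancy map and additional information); the function names, the `texture_set` parameter, and the use of a group number as the position inside a track are hypothetical simplifications.

```python
# Track ID -> data type, following the Track-1 to Track-5 layout described above.
TRACK_TYPE_INDEX = {
    1: "texture",          # first group of textures
    2: "texture",          # second group of textures
    3: "geometry",
    4: "occupancy_map",
    5: "additional_info",
}


def tracks_for(data_type):
    """Return the IDs of all tracks that carry the given data type."""
    return sorted(
        track_id
        for track_id, kind in TRACK_TYPE_INDEX.items()
        if kind == data_type
    )


def locate(data_type, group_id, texture_set=0):
    """Return (track_id, group_id) for one data item.

    texture_set selects between the texture tracks and is ignored for the
    other data types, which each live in exactly one track in this sketch.
    """
    tracks = tracks_for(data_type)
    if not tracks:
        raise KeyError(f"no track carries data type {data_type!r}")
    track_id = tracks[texture_set] if data_type == "texture" else tracks[0]
    return track_id, group_id
```

A parser would first resolve the track through the track type index and then use the per-type index to reach the item of the wanted group inside that track.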
- Fig. 8 is a flow chart of multi-degree-of-freedom media data analysis, which is used to illustrate the method of receiving multimedia data under multi-degree-of-freedom.
- this application provides a multi-degree-of-freedom immersive media system, which includes a sender side and a server side.
- the server includes a receiving module, a parsing module, and a data processing module.
- after the sender finishes sending the encapsulated media file, the server receives the media file through the receiver. The encapsulated media file protocol is parsed first, and the media data content is then processed according to the parsed content.
- As shown in Figure 8:
- S1: after the sender completes the modification of the corresponding content in the data encapsulation transmission protocol, the server receives the corresponding media file data through the receiver and completes the parsing of the related protocol to obtain the description information of the media content data.
- S2: the data processing module processes the media content data according to the description information parsed in S1. First, the media content is judged, and the judgment is based on the parsed media type description information.
- Fig. 9 is a flow chart of data analysis for specific media content: dynamic point cloud (branch a in Fig. 9), static point cloud (branch b in Fig. 9), atlas video (branch c in Fig. 9) and light field (branch d in Fig. 9). It includes the following steps:
- the first step T1: the media type is judged according to the media type description information; the media type has been defined in the encapsulated content. If it is a traditional video media type, it is processed according to the old immersive media processing flow. If it is an immersive media type under the new multiple degrees of freedom (dynamic point cloud, static point cloud, atlas video, light field), the media content processing flow corresponding to the parsed media type is used.
- the second step T2: the processing flow and processor corresponding to the media type are started, and the number of media content data groups, the media content types corresponding to the media content, and the track type used during transmission are further obtained.
- for dynamic point clouds, the corresponding media content types include texture, geometry, occupancy map, and additional information.
- for static point clouds, the corresponding media content types include texture, geometry, and additional information.
- for atlas video, the corresponding media content types include texture and depth.
- for light fields, the corresponding media content types currently include texture and angle.
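The media type check and the per-type data types listed above amount to a lookup followed by a dispatch, which can be sketched as below. The string keys, the "legacy" and "immersive" labels, and the function name `select_flow` are hypothetical; only the mapping from media type to data types is taken from the text.

```python
# Data types expected for each immersive media type, as listed above.
DATA_TYPES_BY_MEDIA_TYPE = {
    "dynamic_point_cloud": ("texture", "geometry", "occupancy_map", "additional_info"),
    "static_point_cloud": ("texture", "geometry", "additional_info"),
    "atlas_video": ("texture", "depth"),
    "light_field": ("texture", "angle"),
}


def select_flow(media_type):
    """Pick a processing flow and the data types to expect for a media type."""
    if media_type == "traditional_video":
        # Traditional video keeps the existing processing flow.
        return "legacy", ()
    if media_type in DATA_TYPES_BY_MEDIA_TYPE:
        return "immersive", DATA_TYPES_BY_MEDIA_TYPE[media_type]
    raise ValueError(f"unknown media type: {media_type!r}")


flow, expected_types = select_flow("static_point_cloud")
```

Rejecting unknown media types explicitly is a design choice of this sketch; a real parser might instead skip or defer content it does not recognize.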
- the third step T3: after the data types under the corresponding media type have been obtained, the number of each media data type is parsed in combination with the number of media data groups.
- the number of media data groups assists in obtaining the number of each media data type and avoids content loss; the number of media data types guides the data analysis terminal in completely parsing the different types of data, which avoids content loss that would degrade the media video recovery.
- the fourth step T4: after the number of data groups and the number of data types have been obtained, the index information and the association relationships of the corresponding data types are parsed and, combined with the earlier track type judgment, the data are combined.
- the data combination method is as follows:
- T4.1: as shown in branch a of Figure 9, for dynamic point clouds, according to the association relationships between data types, the geometry, occupancy map and additional information of the same group of dynamic point cloud data depend on each other to restore the geometric shape of the dynamic point cloud, and the restoration of texture depends on the restoration of the geometric shape. The same group of dynamic point cloud data can have multiple sets of corresponding texture information but only one set of geometry, occupancy map and additional information.
- when the track type is single track, first find the geometry, occupancy map and additional information of the same group in the track according to the index information and complete the restoration of the point cloud geometry; then index the different texture data under the same group as needed, find all the required texture data, and restore the texture information on the basis of the point cloud geometry, occupancy map and additional information.
- when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the geometry, occupancy map, additional information and texture are located, and use the data type index to find the data of the corresponding type in the corresponding track. Find the geometry, occupancy map and additional information that belong to the same group in the tracks of the corresponding types and complete the restoration of the point cloud geometry; then index the different texture data belonging to the same group in the corresponding texture track as needed, find the required texture data, and complete the restoration of the texture information on the basis of the point cloud geometry, occupancy map and additional information.
- T4.2: as shown in branch b of Figure 9, for static point clouds, according to the association relationships between data types, the geometry and additional information of the same group of static point cloud data depend on each other to restore the geometric shape of the static point cloud, and the restoration of texture depends on the restoration of the geometric shape.
- when the track type is single track, first find the geometry and additional information of the same group in the track according to the index information and complete the restoration of the point cloud geometry; then index the texture data under the same group as needed, find the required texture data, and complete the restoration of the texture information on the basis of the point cloud geometry and additional information.
- when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the geometry, additional information and texture are located, and use the data type index to find the data of the corresponding type in the corresponding track. Find the geometry and additional information that belong to the same group in the tracks of the corresponding types and complete the restoration of the point cloud geometry; then index the texture data belonging to the same group in the corresponding texture track as needed, find the required texture data, and complete the restoration of the texture information on the basis of the point cloud geometry and additional information.
- T4.3: as shown in branch c of Figure 9, for atlas video, according to the association relationships between data types, the depth and texture of the same group of atlas data depend on each other and together restore the atlas video content.
- when the track type is single track, find the texture and depth of the same group in the track according to the index information and combine them to complete the restoration of the image.
- when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the texture and depth are located, and use the data type index to find the data of the corresponding type in the corresponding track. The texture and depth of the same group of atlas data then together restore the content of that atlas.
- T4.4: as shown in branch d of Figure 9, for the light field, according to the association relationships between data types, the angle, texture and extended information of the same group of light field data depend on each other and together restore the content of the light field.
- when the track type is single track, find the texture, angle and extended information of the same group in the track according to the index information and combine them to complete the restoration of the image.
- when the track type is multi-track, first use the index information to find, in the track type index, the tracks in which the texture, angle and extended information are located, and use the data type index to find the data of the corresponding type in the corresponding track. The texture, angle and extended information of the same group of light field data then together restore the content of that group.
- the fifth step T5: according to the number of media data of each type and the number of media data types, complete the parsing and combination of all media data in sequence, and finally present the immersive media video content under the new multiple degrees of freedom.
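The combination rules of step T4 can be summarized as an ordered restoration plan per media type. This is a minimal sketch: the dictionaries, the key strings and the function `restore_plan` are hypothetical, and only the ordering (geometry-related data before texture for point clouds, joint restoration for atlas video and light field) is taken from the description above.

```python
# Point clouds: a geometry stage must finish before textures are applied.
GEOMETRY_STAGE = {
    "dynamic_point_cloud": ["geometry", "occupancy_map", "additional_info"],
    "static_point_cloud": ["geometry", "additional_info"],
}

# Atlas video and light field: all data types of a group are restored together.
JOINT_STAGE = {
    "atlas_video": ["texture", "depth"],
    "light_field": ["texture", "angle", "extended_info"],
}


def restore_plan(media_type):
    """Return the ordered stages needed to restore one data group.

    Each stage is a list of data types that depend on each other and are
    combined together; later stages depend on earlier ones.
    """
    if media_type in GEOMETRY_STAGE:
        return [list(GEOMETRY_STAGE[media_type]), ["texture"]]
    if media_type in JOINT_STAGE:
        return [list(JOINT_STAGE[media_type])]
    raise ValueError(f"no restoration plan for media type {media_type!r}")


plan = restore_plan("dynamic_point_cloud")
```

The plan is independent of the track type: single-track and multi-track files differ only in how each data item of a stage is located, not in the order in which the stages run.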
- the concept, the described embodiments, and the scope of this application enable the immersive media system to provide system architecture support for upcoming immersive media 3Dof+ and 6Dof experiences and technical applications.
- this embodiment uses encapsulation protocols such as ISOBMFF, together with atlas and point cloud technologies, as examples to illustrate the proposed immersive media 3Dof+ and 6Dof metadata and its structure, parameter content and data, as well as their encapsulation and transmission methods.
- the immersive media data form and content under the new multiple degrees of freedom in this embodiment can also be encapsulated and transmitted in other formats, parameter expressions and files, for example using MMT or SMT transmission, ISOBMFF encapsulation, or an extension of OMAF (omnidirectional media application format, the application format for panoramic media); this does not affect the expression of the core technology of this application.
- this application provides a multi-degree-of-freedom immersive media system, which includes a sender side and a server side.
- the server includes a receiving module, a parsing module, and a data processing module.
- after the sender finishes sending the encapsulated media file, the server receives the media file through the receiver. The encapsulated media file protocol is parsed first, and the media data content is then processed according to the parsed content.
- a processor and a memory coupled to the processor are provided.
- when executing the computer-readable program in the memory, the processor may be configured to execute the method and system for receiving multimedia data under multiple degrees of freedom described in conjunction with FIGS. 1-9.
- the processor may be implemented with, for example, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
- a general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
- the processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors cooperating with a DSP core, or any other such configuration.
- the steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be directly embodied in hardware, in a software module executed by a processor, or in a combination of the two.
- the software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
- An exemplary storage medium is coupled to the processor such that the processor can read information from and write information to the storage medium.
- the storage medium may be integrated into the processor.
- the processor and the storage medium may reside in the ASIC.
- the ASIC may reside in the user terminal.
- the processor and the storage medium may reside as discrete components in the user terminal.
- the described functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, each function can be stored as one or more instructions or codes on a computer-readable medium or transmitted through it.
- Computer-readable media includes both computer storage media and communication media, including any medium that facilitates the transfer of a computer program from one place to another.
- the storage medium may be any available medium that can be accessed by a computer.
- such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store the desired program code in the form of instructions or data structures and that can be accessed by a computer.
- any connection is also properly called a computer-readable medium.
- for example, if the software is transmitted from a web site, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium.
- Disks and discs as used herein include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included in the scope of computer-readable media.
Abstract
Description
Serial number | Video type |
1 | Two-dimensional video (traditional video) |
2 | Atlas video |
3 | Dynamic point cloud |
4 | Static point cloud |
5 | Light field |
6 | Reserved (used to define new media types) |
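The video type table above maps naturally onto an enumeration. The sketch below reads the serial numbers as 1 to 6; the class name `VideoType`, the member names, and the choice to map unrecognized codes to the reserved entry are assumptions made for illustration and are not specified by the application.

```python
from enum import IntEnum


class VideoType(IntEnum):
    """Video type codes, following the serial numbers in the table above."""
    TWO_DIMENSIONAL = 1      # traditional video
    ATLAS = 2
    DYNAMIC_POINT_CLOUD = 3
    STATIC_POINT_CLOUD = 4
    LIGHT_FIELD = 5
    RESERVED = 6             # kept for defining new media types


def parse_video_type(code):
    """Map a numeric code to a VideoType.

    Assumption of this sketch: codes outside the table are treated as
    RESERVED rather than rejected.
    """
    try:
        return VideoType(code)
    except ValueError:
        return VideoType.RESERVED


video_type = parse_video_type(3)
```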
Claims (15)
- A method for sending multimedia data under multiple degrees of freedom, comprising:
  encapsulating the multimedia data according to an encapsulation transmission protocol, the encapsulation transmission protocol including:
  determining attribute information of the multimedia data, including: determining the data types for the different media types of the multimedia data; determining and identifying the number and position information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in different media data; and
  determining a corresponding index mode and index information for each item of the attribute information; and
  transmitting the encapsulated multimedia data.
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein the data form of the multimedia data includes a 3Dof+ mode and/or a 6Dof mode; and the encapsulation and transmission is applicable to MPEG Media Transport (MMT), Smart Media Transport (SMT), the ISO base media file format (ISOBMFF), or an extension of the omnidirectional media application format (OMAF).
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein the different media types of the multimedia data include any one or more of the following: traditional two-dimensional video, atlas video, dynamic point cloud, static point cloud, and light field.
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein determining the data type of the multimedia data includes:
  when the media type is atlas video, the data types include texture data and depth data;
  when the media type is dynamic point cloud, the data types include texture, geometry, occupancy map and additional information data;
  when the media type is static point cloud, the data types include texture, geometry and additional information data;
  when the media type is light field, the data types include texture data and angle data.
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein determining the data type of the multimedia data further includes:
  determining, for each data type, the number of data groups of that data type.
- The method for sending multimedia data under multiple degrees of freedom according to claim 5, wherein the correspondence between the numbers of data groups of different data types includes:
  the same structure corresponds to the same texture; or
  the same structure corresponds to different textures that are substitutes for one another.
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein determining and identifying the number and position information of the track media streams in which the multimedia data of a media type is located includes:
  defining a track type indicating that the multimedia data of each media type is in one track or in at least two tracks, wherein:
  for a single track: defining the media track number of the track in which the multimedia data is located, and defining the specific position in the track of each data item in the multimedia data;
  for at least two tracks: defining the media track number of each data item contained in the multimedia data, and defining the specific position in its track of each data item in the multimedia data.
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein the association relationships among multiple data contents in different media data include:
  mutual dependence between data contents, and/or single dependence between data contents, and/or mutual replacement between data contents.
- The method for sending multimedia data under multiple degrees of freedom according to claim 8, wherein:
  the mutually dependent association relationships include: the texture and depth data in an atlas depend on each other; the geometry, occupancy map and additional information in a point cloud depend on each other to jointly construct the point cloud geometric skeleton;
  the single-dependence association relationships include: the texture data in a point cloud depends on the geometric skeleton jointly constructed from the geometry, occupancy map and additional information; an additional atlas depends on the basic atlas; and
  the mutual-replacement association relationships include: for the same point cloud geometric skeleton, different texture data are provided for replacement.
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein the index information includes a collection of the above attribute information, and the attribute information is either described at different levels of the encapsulation transmission protocol, or an index containing all the attribute information of the media is defined.
- The method for sending multimedia data under multiple degrees of freedom according to claim 1, wherein the data stream of the multimedia data includes any one or more of the following: outer information, description indication information, and data content information;
  wherein the outer information defines the file type and content compatibility of the multimedia data; the description indication information describes and indicates the multimedia data; and the data content information carries the specific content of the multimedia data.
- 一种多自由度下多媒体数据的接收方法,包括:A method for receiving multimedia data under multiple degrees of freedom, including:对封装的多媒体数据进行接收,按照与权利要求1相逆的封装传输协议进行解析,根据解析内容对该多媒体数据进行相应的处理。The encapsulated multimedia data is received, analyzed according to the encapsulated transmission protocol that is inverse to claim 1, and the multimedia data is processed correspondingly according to the parsed content.
- The method for receiving multimedia data under multiple degrees of freedom according to claim 12, comprising:
  S1: receiving the media content data of the multimedia data and parsing it according to the encapsulation transmission protocol to obtain the description indication information of the multimedia data;
  S2: determining the media content data according to the description indication information;
  S3: according to the media content type, parsing to obtain the data group quantity description information and/or the media data type description information and/or the track type description information under the corresponding media content type;
  S4: obtaining the media data type description information, and parsing to obtain the description information on the association relationships between different data types;
  S5: based on the description information of the different media data types and the data group quantity description information, obtaining in full the parsed quantity corresponding to each data type;
  S6: according to the number of data groups for each type of data, obtaining in full the index information corresponding to each data type in the parsed information, and obtaining the required media content according to the track type description information obtained in S3, the description information on the association relationships between data types obtained in S4, and the index information description information of each data type obtained in S5.
- A media processor, comprising: a storage module, a receiving module, a parsing module, and a data processing module, configured to receive multimedia data and to parse and process it according to an encapsulation transmission protocol, the encapsulation transmission protocol comprising: determining attribute information of the multimedia data, including: determining the data type for each of the different media types of the multimedia data; determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in different media data; and determining a corresponding index mode and index information for each item of the attribute information.
- A player, comprising: a storage module, a receiving module, a parsing module, and a data processing module, configured to receive multimedia data and to parse and process it according to an encapsulation transmission protocol, the encapsulation transmission protocol comprising: determining attribute information of the multimedia data, including: determining the data type for each of the different media types of the multimedia data; determining and identifying the number and location information of the track media streams in which the multimedia data of each media type is located; and determining the association relationships among multiple data contents in different media data; and determining a corresponding index mode and index information for each item of the attribute information.
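
The three association relationships and the final receiving step (using parsed track and relationship descriptions to collect the required media content) can be illustrated with a minimal Python sketch. It is not the claimed encapsulation format: the names `Relation`, `DataGroup`, `Description`, `required_types`, and `resolve_tracks`, the data-type strings, and the track numbers are all hypothetical and chosen only for illustration. The sketch assumes the description indication information has already been parsed into plain Python objects, and it interprets "interdependent" and "single dependency" as edges to follow when gathering data, while "mutual replacement" edges are never followed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set, Tuple


class Relation(Enum):
    """Association relationships between data types (hypothetical encoding)."""
    INTERDEPENDENT = 1      # both sides need each other, e.g. geometry and occupancy map
    SINGLE_DEPENDENCY = 2   # first item needs the second, e.g. texture needs geometry
    MUTUAL_REPLACEMENT = 3  # alternatives; only one of them is presented at a time


@dataclass
class DataGroup:
    data_type: str          # media data type description (assumed string labels)
    track_ids: List[int]    # index information: tracks that carry this data type


@dataclass
class Description:
    media_content_type: str  # e.g. "point_cloud"
    track_type: str          # e.g. "multi_track"
    groups: List[DataGroup] = field(default_factory=list)
    relations: List[Tuple[str, str, Relation]] = field(default_factory=list)


def required_types(desc: Description, wanted: str) -> Set[str]:
    """Collect every data type needed to present `wanted`, following dependencies."""
    needed: Set[str] = set()
    stack = [wanted]
    while stack:
        current = stack.pop()
        if current in needed:
            continue
        needed.add(current)
        for src, dst, rel in desc.relations:
            if rel is Relation.SINGLE_DEPENDENCY and src == current:
                stack.append(dst)
            elif rel is Relation.INTERDEPENDENT and current in (src, dst):
                stack.append(dst if src == current else src)
            # MUTUAL_REPLACEMENT edges are alternatives, so they are not followed.
    return needed


def resolve_tracks(desc: Description, wanted: str) -> List[int]:
    """Return the sorted track ids that must be fetched to present `wanted`."""
    needed = required_types(desc, wanted)
    return sorted(t for g in desc.groups if g.data_type in needed for t in g.track_ids)


# Hypothetical point cloud with one geometric skeleton and two alternative textures.
desc = Description(
    media_content_type="point_cloud",
    track_type="multi_track",
    groups=[
        DataGroup("geometry", [1]),
        DataGroup("occupancy", [2]),
        DataGroup("additional_info", [3]),
        DataGroup("texture", [4]),
        DataGroup("texture_alt", [5]),
    ],
    relations=[
        ("geometry", "occupancy", Relation.INTERDEPENDENT),
        ("occupancy", "additional_info", Relation.INTERDEPENDENT),
        ("texture", "geometry", Relation.SINGLE_DEPENDENCY),
        ("texture_alt", "geometry", Relation.SINGLE_DEPENDENCY),
        ("texture", "texture_alt", Relation.MUTUAL_REPLACEMENT),
    ],
)

print(resolve_tracks(desc, "texture"))      # [1, 2, 3, 4]
print(resolve_tracks(desc, "texture_alt"))  # [1, 2, 3, 5]
print(resolve_tracks(desc, "geometry"))     # [1, 2, 3]
```

Selecting either texture pulls in the full geometric skeleton (geometry, occupancy map, additional information) but never the alternative texture, which is one plausible reading of how the replacement relationship differs from the two dependency relationships.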
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010301699.0A CN113542907B (en) | 2020-04-16 | 2020-04-16 | Multimedia data transceiving method, system, processor and player |
CN202010301699.0 | 2020-04-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021209044A1 (en) | 2021-10-21 |
Family
ID=78084686
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/087805 WO2021209044A1 (en) | 2020-04-16 | 2021-04-16 | Multimedia data transmission and reception methods, system, processor, and player |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113542907B (en) |
WO (1) | WO2021209044A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023169003A1 (en) * | 2022-03-11 | 2023-09-14 | 腾讯科技(深圳)有限公司 | Point cloud media decoding method and apparatus and point cloud media coding method and apparatus |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116800987A (en) * | 2022-03-14 | 2023-09-22 | 中兴通讯股份有限公司 | Data processing method, apparatus, device, storage medium, and program product |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080250047A1 (en) * | 2007-04-03 | 2008-10-09 | Nokia Corporation | System and method for using multiple meta boxes in the iso base media file format |
CN101978699A (en) * | 2008-01-25 | 2011-02-16 | 电子部品研究院 | Stereoscopic video file format and computer readable recording medium in which stereoscopic video file is recorded according thereto |
CN102005231A (en) * | 2010-09-08 | 2011-04-06 | 东莞电子科技大学电子信息工程研究院 | Storage method of rich-media scene flows |
CN108271068A (en) * | 2016-12-30 | 2018-07-10 | 华为技术有限公司 | A kind of processing method and processing device of the video data based on stream media technology |
CN109155874A (en) * | 2016-05-23 | 2019-01-04 | 佳能株式会社 | The method, apparatus and computer program of the self adaptation stream transmission of virtual reality media content |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7402743B2 (en) * | 2005-06-30 | 2008-07-22 | Body Harp Interactive Corporation | Free-space human interface for interactive music, full-body musical instrument, and immersive media controller |
US9922680B2 (en) * | 2015-02-10 | 2018-03-20 | Nokia Technologies Oy | Method, an apparatus and a computer program product for processing image sequence tracks |
CN105536268A (en) * | 2015-12-15 | 2016-05-04 | 广州中国科学院先进技术研究所 | Six-degree-of-freedom virtual reality dynamic seat and seat platform |
US10389999B2 (en) * | 2016-02-17 | 2019-08-20 | Qualcomm Incorporated | Storage of virtual reality video in media files |
CN106178551B (en) * | 2016-06-27 | 2018-01-30 | 山东大学 | A kind of real-time rendering interactive movie theatre system and method based on multi-modal interaction |
GB2563865A (en) * | 2017-06-27 | 2019-01-02 | Canon Kk | Method, device, and computer program for transmitting media content |
US20190104326A1 (en) * | 2017-10-03 | 2019-04-04 | Qualcomm Incorporated | Content source description for immersive media data |
US10559126B2 (en) * | 2017-10-13 | 2020-02-11 | Samsung Electronics Co., Ltd. | 6DoF media consumption architecture using 2D video decoder |
CN113178019B (en) * | 2018-07-09 | 2023-01-03 | 上海交通大学 | Indication information identification method, system and storage medium based on video content |
CN110944222B (en) * | 2018-09-21 | 2021-02-12 | 上海交通大学 | Method and system for immersive media content as user moves |
CN110971906B (en) * | 2018-09-29 | 2021-11-30 | 上海交通大学 | Hierarchical point cloud code stream packaging method and system |
- 2020-04-16: CN application CN202010301699.0A, patent CN113542907B (en), status Active
- 2021-04-16: WO application PCT/CN2021/087805, publication WO2021209044A1 (en), status Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN113542907A (en) | 2021-10-22 |
CN113542907B (en) | 2022-09-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP5022443B2 (en) | Method of decoding metadata used for playback of stereoscopic video content | |
CN107682688B (en) | Video real-time recording method and recording equipment based on augmented reality | |
US9224246B2 (en) | Method and apparatus for processing media file for augmented reality service | |
US20120288257A1 (en) | Image processing device, information recording medium, image processing method, and program | |
JP2020099045A (en) | Method and device for overlaying 3d graphic over 3d video | |
WO2021209044A1 (en) | Multimedia data transmission and reception methods, system, processor, and player | |
CN106713988A (en) | Beautifying method and system for virtual scene live | |
EP2242262A2 (en) | Data structure, recording medium, playback apparatus and method, and program | |
US10965928B2 (en) | Method for 360 video processing based on multiple viewpoints and apparatus therefor | |
CN106303289A (en) | A kind of real object and virtual scene are merged the method for display, Apparatus and system | |
KR20100002048A (en) | Image processing method, image outputting method, and apparatuses thereof | |
CN114697668B (en) | Encoding and decoding method of point cloud media and related products | |
KR20200017534A (en) | Method and apparatus for transmitting and receiving media data | |
JP2012244622A (en) | Content converter, content conversion method and storage medium thereof | |
CN110971906A (en) | Hierarchical point cloud code stream packaging method and system | |
US20230048715A1 (en) | Point cloud data encapsulation method and point cloud data transmission method | |
US20230224533A1 (en) | Mapping architecture of immersive technologies media format (itmf) specification with rendering engines | |
WO2023207119A1 (en) | Immersive media processing method and apparatus, device, and storage medium | |
CN113891117A (en) | Immersion medium data processing method, device, equipment and readable storage medium | |
CN102474650B (en) | Reproduction apparatus of stereovision video, integrated circuit, and reproduction method | |
US20140369422A1 (en) | Remultiplexing Bitstreams of Encoded Video for Video Playback | |
US11937070B2 (en) | Layered description of space of interest | |
JP2012513148A (en) | Method for transmitting data relating to stereoscopic video, method for reproducing stereoscopic video, and method for generating file of stereoscopic video data | |
US11122252B2 (en) | Image processing device, display device, information recording medium, image processing method, and program for virtual reality content | |
WO2023169003A1 (en) | Point cloud media decoding method and apparatus and point cloud media coding method and apparatus |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21789486; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 21789486; Country of ref document: EP; Kind code of ref document: A1 |
122 | Ep: pct application non-entry in european phase | Ref document number: 21789486; Country of ref document: EP; Kind code of ref document: A1 |
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/05/2023) |
122 | Ep: pct application non-entry in european phase | Ref document number: 21789486; Country of ref document: EP; Kind code of ref document: A1 |