WO2019007096A1 - Method and apparatus for processing media information - Google Patents

Method and apparatus for processing media information

Info

Publication number: WO2019007096A1
Authority: WO (WIPO/PCT)
Prior art keywords: information, metadata, media data, media, user
Application number: PCT/CN2018/078540
Other languages: English (en), French (fr)
Inventors: 邸佩云 (Di Peiyun), 谢清鹏 (Xie Qingpeng)
Original assignee: Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd.
Priority/family applications: JP2020500115A (JP2020526969A), CA3069031A1, SG11201913532YA, AU2018297439A1, BR112020000093-0 (BR112020000093A2), RU2020104035A, KR1020207002474A (KR20200020913A), EP18829059.7 (EP3637722A4), PH12020500015A1, US16/734,682 (US20200145716A1)
Publication of WO2019007096A1


Classifications

    • H04N21/816 — Monomedia components involving special video data, e.g. 3D video
    • H04N21/2353 — Processing of content descriptors, e.g. coding, compressing or processing of metadata
    • H04N21/21805 — Source of audio or video content enabling multiple viewpoints, e.g. using a plurality of cameras
    • H04N21/2343 — Processing of video elementary streams involving reformatting operations
    • H04N21/2362 — Generation or processing of Service Information [SI]
    • H04N21/4345 — Extraction or processing of SI, e.g. extracting service information from an MPEG stream
    • H04N21/4668 — Learning process for recommending content, e.g. movies
    • H04N21/6587 — Control parameters, e.g. trick play commands, viewpoint selection
    • H04N21/84 — Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/85406 — Content authoring involving a specific file format, e.g. MP4 format
    • H04N21/8586 — Linking data to content by using a URL
    • H04L65/40 — Support for services or applications
    • H04L65/612 — Network streaming of media packets for one-way streaming services, for unicast
    • H04L65/613 — Network streaming of media packets for one-way streaming services, for control of the source by the destination
    • H04L65/65 — Network streaming protocols, e.g. RTP or RTCP
    • H04L65/70 — Media network packetisation
    • H04L65/75 — Media network packet handling
    • H04L65/756 — Media network packet handling adapting media to device capabilities
    • H04L65/762 — Media network packet handling at the source
    • H04L67/02 — Protocols based on web technology, e.g. HTTP
    • H04L67/55 — Push-based network services
    • H04L67/56 — Provisioning of proxy services
    • H04L67/561 — Adding application-functional data or data for application control, e.g. adding metadata

Definitions

  • the present invention relates to the field of streaming media transmission technologies, and in particular, to a method and an apparatus for processing media information.
  • the ISO/IEC 23090-2 standard specification is also known as the OMAF (Omnidirectional Media Format) standard specification, which defines a media application format that enables the presentation of omnidirectional media in applications. Omnidirectional media refers to omnidirectional video (360° video) and the associated audio.
  • the OMAF specification first specifies a list of projection methods that can be used to convert spherical video into two-dimensional video; second, it specifies how to use the ISO base media file format (ISOBMFF) to store omnidirectional media data and the metadata associated with the media; and third, it specifies how to encapsulate omnidirectional media data and transmit it in a streaming media system, for example through Dynamic Adaptive Streaming over HTTP (DASH), which is based on the HyperText Transfer Protocol (HTTP) and specified in the ISO/IEC 23009-1 standard.
  • a file in the ISO base media file format is composed of a series of boxes, and a box can contain other boxes. The boxes include a metadata box and a media data box: the metadata box (moov box) contains metadata, and the media data box (mdat box) contains media data.
  • the metadata box and the media data box may be in the same file or in separate files. If timed metadata (metadata with a time attribute) is encapsulated in the ISO base media file format, the metadata box contains metadata describing the timed metadata, and the media data box contains the timed metadata itself.
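  • By way of illustration only (this script is not part of the embodiment, and the file name is assumed), the box structure described above can be walked as follows:

        import struct

        def iter_boxes(data, offset=0, end=None):
            """Yield (box_type, payload) for each ISOBMFF box in a byte range."""
            end = len(data) if end is None else end
            while offset + 8 <= end:
                size, = struct.unpack_from(">I", data, offset)
                box_type = data[offset + 4:offset + 8].decode("ascii", "replace")
                header = 8
                if size == 1:  # 64-bit largesize follows the type field
                    size, = struct.unpack_from(">Q", data, offset + 8)
                    header = 16
                elif size == 0:  # box extends to the end of the enclosing range
                    size = end - offset
                if size < header:  # malformed box; stop rather than loop forever
                    break
                yield box_type, data[offset + header:offset + size]
                offset += size

        with open("media.mp4", "rb") as f:  # assumed example file
            for box_type, payload in iter_boxes(f.read()):
                print(box_type, len(payload))  # e.g. ftyp, moov, mdat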
  • because the client cannot accurately identify the source of the metadata, the client cannot fully satisfy the user's needs when selecting media data according to that metadata, and the user experience is poor.
  • the embodiment of the invention provides a method and a device for processing media information, which can enable a client to select different processing modes according to the source of the metadata.
  • A first aspect of the present invention provides a method for processing media information, comprising: obtaining metadata information of media data, where the metadata information includes source information of the metadata, the source information is used to represent a recommender of the media data, and the media data is omnidirectional media data; and processing the media data according to the source information of the metadata.
  • the omnidirectional media data of the embodiment of the present invention may be video data or audio data.
  • for related examples of omnidirectional media, refer to the relevant provisions of the ISO/IEC 23090-2 standard specification.
  • the metadata refers to attribute information of the video data, such as the duration, the bit rate, the frame rate, or the position in a spherical coordinate system of the corresponding video data.
  • the area of the omnidirectional video refers to an area in the video space corresponding to the omnidirectional video.
  • the source information of the metadata may indicate that the video data corresponding to the metadata is recommended by the author of the omnidirectional video, recommended by a certain user of the omnidirectional video, or recommended based on statistics over the viewing results of multiple users of the omnidirectional video.
  • the client can refer to the information of the recommender of the media data when performing data processing, thereby enriching the user's selection and enhancing the user experience.
  • the obtaining metadata information of the media data includes:
  • the address information of the metadata track can be obtained through the media presentation description file; an information acquisition request is then sent to that address, and the metadata track of the media data is received.
  • alternatively, the address information of the metadata track can be obtained through a separate file; an information acquisition request is then sent to that address, and the metadata track of the media data is received.
  • alternatively, the server sends the metadata track of the media data to the client.
  • a track refers to a series of samples with time attributes in accordance with the ISO base media file format (ISOBMFF) encapsulation method.
  • a video track is obtained by encapsulating, according to the ISOBMFF specification, the code stream that a video encoder generates after encoding each frame.
  • for the specific definition of a track, refer to the relevant description in ISO/IEC 14496-12.
  • for the relevant attributes and data structures of the media presentation description file, refer to the relevant description in ISO/IEC 23009-1.
  • the source information of the metadata may be stored in a newly added box in the metadata track, and the source information of the metadata is obtained by parsing the data of the box.
  • the source information of the metadata may be an attribute added in an existing box in the metadata track, and the source information of the metadata is obtained by parsing the attribute.
  • the client can obtain the source information of the metadata when obtaining the metadata track, so that the client can comprehensively consider the other attributes of the metadata together with the source information of the metadata in the subsequent processing of the related media data.
  • the obtaining metadata information of the media data includes:
  • the client can obtain the media presentation description file by sending an HTTP request to the server, or the server can directly push the media presentation description file to the client.
  • the client can also obtain the media presentation description file by other possible means, for example through interaction with another client.
  • for the related attributes and data structures of the media presentation description file, refer to the relevant description in ISO/IEC 23009-1.
  • the source information of the metadata may be information indicated in a descriptor, or may be an attribute.
  • the source information of the metadata may be in an adaptation set hierarchy in the media presentation description file or in a representation hierarchy.
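  • As an illustration of the descriptor case just described (the scheme URI and the use of a SupplementalProperty descriptor below are assumptions for the sketch, not values fixed by the embodiment), a client might look up the source information at the adaptation set level like this:

        import xml.etree.ElementTree as ET

        NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}
        SOURCE_SCHEME = "urn:example:metadata:source"  # hypothetical scheme URI

        def find_source_info(mpd_text):
            """Return the source value of each adaptation set carrying the descriptor."""
            root = ET.fromstring(mpd_text)
            results = []
            for period in root.findall("dash:Period", NS):
                for aset in period.findall("dash:AdaptationSet", NS):
                    for desc in aset.findall("dash:SupplementalProperty", NS):
                        if desc.get("schemeIdUri") == SOURCE_SCHEME:
                            results.append(desc.get("value"))  # e.g. "author"
            return results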
  • the obtaining metadata information of the media data includes: obtaining a code stream that includes the media data, where the code stream further includes supplemental enhancement information (SEI), and the SEI includes the source information of the metadata.
  • the client may send a media data acquisition request to the server, and then receive the media data sent by the server.
  • for example, the client can construct a Uniform Resource Locator (URL) from the related attributes and address information in the media presentation description file, send an HTTP request to that URL, and then receive the corresponding media data.
  • the client may also receive a media data stream pushed by the server.
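  • As a minimal sketch of the URL construction just mentioned (the template substitution mirrors DASH SegmentTemplate handling in simplified form; the base URL and parameter values are assumed):

        import urllib.request

        def fetch_segment(base_url, media_template, representation_id, number):
            """Build a segment URL from MPD-style attributes and fetch it over HTTP."""
            path = (media_template
                    .replace("$RepresentationID$", representation_id)
                    .replace("$Number$", str(number)))
            url = base_url.rstrip("/") + "/" + path
            with urllib.request.urlopen(url) as resp:
                return resp.read()

        # Hypothetical values, for illustration only:
        # data = fetch_segment("https://example.com/vr",
        #                      "seg-$RepresentationID$-$Number$.m4s", "tile1", 42)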
  • the source information of the metadata is a source type identifier.
  • different values of the source type identifier, or different source type identifiers, indicate the corresponding source types.
  • a one-bit flag can be used to indicate the source type, or a field of more bits can be used to identify the source type.
  • the client side stores a file of the correspondence between the source type identifier and the source type, and the client may determine the corresponding source type according to different values of the source type identifier or different source type identifiers.
  • one source type corresponds to one recommender; for example, the source may be a recommendation by the author of the video, a recommendation by a certain user, or a recommendation derived from statistics over the viewing results of multiple users.
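  • A possible mapping is sketched below; the identifier values are invented for illustration, since the embodiment only requires that client and server agree on the correspondence:

        from enum import IntEnum

        class SourceType(IntEnum):
            """Hypothetical identifier values; not fixed by the embodiment."""
            AUTHOR_RECOMMENDED = 0      # recommended by the content author/director
            USER_RECOMMENDED = 1        # recommended by a particular user
            STATISTICS_RECOMMENDED = 2  # derived from viewing statistics of many users

        def describe_source(source_type_id: int) -> str:
            try:
                return SourceType(source_type_id).name
            except ValueError:
                return "UNKNOWN"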
  • the source information of the metadata includes a semantic representation of a recommender of the media data.
  • codewords in ISO 639-2/T can be used to represent the various semantics.
  • processing the media data corresponding to the metadata according to the source information of the metadata may include the following implementation manners:
  • the client side may request corresponding media data from the server side or other terminal side according to the user's selection of the source information.
  • the client side may present or transmit the media data according to the user's selection of the source information.
  • the server may push the media data to the client according to the source information of the metadata.
  • the server may determine the media data to be pushed according to the source information of multiple received pieces of metadata, for example by selecting from a plurality of recommendations according to a certain criterion and pushing the media data according to the selection result, or by combining a plurality of recommendations through a certain calculation and pushing the media data according to the calculated result.
  • a second aspect of the present invention provides a device for processing media information.
  • the device includes:
  • an information obtaining module, configured to obtain metadata information of the media data, where the metadata information includes source information of the metadata, the source information is used to represent a recommender of the media data, and the media data is omnidirectional media data; and
  • a processing module configured to process the media data according to source information of the metadata.
  • the client can refer to the information of the recommender of the media data when performing data processing, thereby enriching the user's selection and enhancing the user experience.
  • the information acquiring module is specifically configured to: obtain a metadata track of the media data, where the metadata track includes source information of the metadata.
  • the information obtaining module is specifically configured to: obtain a media presentation description file of the media data, where the media presentation description file includes source information of the metadata.
  • the information acquiring module is specifically configured to obtain a code stream that includes the media data, where the code stream further includes supplemental enhancement information (SEI), and the SEI includes the source information of the metadata.
  • the source information of the metadata is a source type identifier.
  • the source information of the metadata includes a semantic representation of a recommender of the media data.
  • a third aspect of the present invention discloses a method for processing media information, where the method includes: receiving information of user viewing angles sent by a plurality of clients, where the information is used to indicate the viewing angle at which a user views omnidirectional media data; determining a target viewing angle according to the information of all the user viewing angles; and sending the media data corresponding to the target viewing angle.
  • in this way, statistical analysis can be performed on the viewing angles of multiple users watching the same video, providing an effective viewing-angle recommendation for subsequent viewers of the video and enhancing the user experience.
  • the method is performed on the server side, for example by a content preparation server, a content distribution network (CDN) node, or a proxy server.
  • the information of the user viewing angle sent by a client may be sent in a separate file, or may be included in other data files sent by the client.
  • the target viewing angle is determined according to the information of all the user viewing angles; it may be selected according to a predetermined statistical criterion, or it may be calculated in a certain manner from the data of a plurality of viewing angles.
  • the media data corresponding to the target viewing angle may be directly pushed to the client, may be pushed to a distribution server, or may be fed back to the client after a request from the client for the omnidirectional media data is received.
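  • As a minimal sketch of one such statistical rule (the quantise-and-count rule here is an assumption, not something the embodiment mandates), a server could pick the most frequently watched direction:

        from collections import Counter

        def target_view(views, bin_deg=10):
            """views: list of (yaw, pitch) in degrees reported by clients.

            Quantise each reported direction into bin_deg x bin_deg cells and
            return the centre of the most popular cell as the target view.
            """
            def quantise(angle):
                return round(angle / bin_deg) * bin_deg
            cells = Counter((quantise(y), quantise(p)) for y, p in views)
            (yaw, pitch), _ = cells.most_common(1)[0]
            return yaw, pitch

        # e.g. target_view([(31, -4), (28, 1), (95, 10)]) returns (30, 0)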
  • a fourth aspect of the present invention discloses a method for processing media information, where the method includes: receiving information of user viewing angles sent by a plurality of clients; determining a target viewing angle according to the information of all the user viewing angles; and generating metadata information of the media data according to the target viewing angle.
  • as in the third aspect, statistical analysis can be performed on the viewing angles of multiple users watching the same video, providing an effective viewing-angle recommendation for subsequent viewers and enhancing the user experience. The method is likewise performed on the server side, for example by a content preparation server, a CDN node, or a proxy server; the viewing-angle information may be sent in a separate file or included in other data files sent by the client; and the target viewing angle may be selected according to a predetermined statistical criterion or calculated in a certain manner from the data of a plurality of viewing angles.
  • the fifth aspect of the present invention discloses a device for processing media information, where the device includes:
  • a receiver, configured to receive information of user viewing angles sent by a plurality of clients, where the user viewing-angle information is used to indicate the viewing angle at which a user views the omnidirectional media data; a processor, configured to determine a target viewing angle according to the information of all the user viewing angles; and a transmitter, configured to send the media data corresponding to the target viewing angle.
  • a sixth aspect of the present invention discloses a device for processing media information, where the device includes:
  • a receiver, configured to receive information of user viewing angles sent by a plurality of clients, where the user viewing-angle information is used to indicate the viewing angle at which a user views the omnidirectional media data; and a processor, configured to determine a target viewing angle according to the information of all the user viewing angles and to generate metadata information of the media data according to the target viewing angle.
  • a seventh aspect of the present invention discloses a device for processing media information, where the device includes: one or more processors, and a memory.
  • the memory is coupled to the one or more processors and is configured to store computer program code, the computer program code comprising instructions; when the one or more processors execute the instructions, the processing device performs the method for processing media information according to the first aspect, the third aspect, the fourth aspect, or any possible implementation manner of the foregoing aspects.
  • an embodiment of the eighth aspect of the present invention discloses a computer readable storage medium that stores instructions; when the instructions are run on a device, the device is caused to perform the method for processing media information according to the first aspect or any of the foregoing aspects.
  • FIG. 1 is a diagram showing an example of a view change in an omnidirectional video according to an embodiment of the present invention.
  • FIG. 2 is a diagram showing an example of dividing a space corresponding to an omnidirectional video into a spatial object according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of relative positions of spatial objects in a space corresponding to omnidirectional video according to an embodiment of the present invention.
  • FIG. 4 is an illustration of a coordinate system describing a spatial object in accordance with an embodiment of the present invention.
  • FIG. 5 is another example of a coordinate system describing a spatial object according to an embodiment of the present invention.
  • FIG. 6 is another example of a coordinate system describing a spatial object in accordance with an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a system architecture according to an embodiment of the present invention.
  • FIG. 8 is a schematic flowchart of a method for processing media information according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a device for processing media information according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of specific hardware of a device for processing media information according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of a mapping relationship between a spatial object and video data according to an embodiment of the present invention.
  • a track is a series of samples with time attributes encapsulated in accordance with the ISO base media file format (ISOBMFF).
  • track is defined in the standard ISO/IEC 14496-12 as a "timed sequence of related samples (q.v.) in an ISO base media file. NOTE: For media data, a track corresponds to a sequence of images or sampled audio; for hint tracks, a track corresponds to a streaming channel."
  • that is, for media data, a track is a sequence of images or audio samples; for a hint track, a track corresponds to a streaming channel.
  • An ISOBMFF file is composed of a plurality of boxes, wherein one box can include other boxes.
  • supplemental enhancement information (SEI) is a type of network abstraction layer unit (NALU) defined in the video coding standards H.264 and H.265 issued by the International Telecommunication Union (ITU).
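  • For orientation only, a simplified sketch of how SEI NALUs could be located in an H.264 Annex-B byte stream follows (a real parser must also handle emulation-prevention bytes and trailing zeros more carefully):

        import re

        def iter_nalus(bitstream: bytes):
            """Split an H.264 Annex-B byte stream on 00 00 01 start codes."""
            positions = [m.end() for m in re.finditer(b"\x00\x00\x01", bitstream)]
            for start, nxt in zip(positions, positions[1:] + [len(bitstream)]):
                nalu = bitstream[start:nxt].rstrip(b"\x00")  # drop next start code's leading zeros
                if nalu:
                    yield nalu

        def sei_nalus(bitstream: bytes):
            """Yield the payload of each SEI NALU (H.264 nal_unit_type == 6)."""
            for nalu in iter_nalus(bitstream):
                if nalu[0] & 0x1F == 6:
                    yield nalu[1:]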
  • the media presentation description (MPD) is a document specified in the standard ISO/IEC 23009-1 that contains the metadata used by the client to construct HTTP URLs. The MPD contains one or more period elements; each period element contains one or more adaptation sets; each adaptation set contains one or more representations; and each representation contains one or more segments. The client selects a representation based on the information in the MPD and builds the HTTP URL of a segment.
  • the spatial region of a VR video (the spatial region may also be called a spatial object) is a 360-degree panoramic space (or omnidirectional space, or panoramic spatial object), which exceeds the normal visual range of the human eye; therefore, the user changes the viewing angle (that is, the field of view, FOV) at any time while watching the video.
  • FIG. 1 is a schematic diagram of the viewing angles corresponding to a change in viewing angle.
  • box 1 and box 2 are two different viewing angles of the user.
  • the video image viewed when the user's viewing angle is box 1 is the video image presented at that moment by the one or more spatial objects corresponding to that viewing angle.
  • when the user's viewing angle switches to box 2, the video image viewed by the user should also switch to the video image presented at that moment by the spatial object corresponding to box 2.
  • the server may divide the panoramic space (or panoramic spatial object) within the range of viewing angles corresponding to the omnidirectional video to obtain a plurality of spatial objects. Each spatial object can correspond to one sub-viewing-angle of the user; stitching the sub-viewing-angles together forms a complete human-eye viewing angle, and each spatial object corresponds to a sub-region of the panoramic space.
  • the human-eye viewing angle (hereinafter referred to as the viewing angle) may correspond to one or more of the divided spatial objects; the spatial objects corresponding to the viewing angle are all the spatial objects covering the content within the range of the human-eye viewing angle.
  • the human-eye viewing angle can change dynamically, but typically its range may be 120 degrees * 120 degrees. The spatial objects covering the content within this 120 degrees * 120 degrees range may include one or more of the divided spatial objects, for example viewing angle 1 corresponding to box 1 in FIG. 1 and viewing angle 2 corresponding to box 2.
  • through the MPD, the client can obtain the spatial information of the video code stream that the server has prepared for each spatial object, and can then request, according to the current viewing-angle requirement, the video code stream segments corresponding to one or more spatial objects in a certain period of time, and output the corresponding spatial objects according to the viewing-angle requirement.
  • if the client outputs, for the same time period, the video code stream segments corresponding to all the spatial objects within the 360-degree range, it can display the complete video image of the entire 360-degree panoramic space.
  • the server may first map the spherical surface onto a plane and divide the spatial objects on the plane. Specifically, the server may map the sphere into a latitude-longitude plan view by latitude-longitude mapping.
  • FIG. 2 is a schematic diagram of spatial objects according to an embodiment of the present invention. The server can map the sphere into a latitude-longitude plan view and divide it into a plurality of spatial objects such as A to I.
  • the server may also map the sphere onto a cube and unfold the faces of the cube to obtain a plan view, or map the sphere onto another polyhedron and unfold its faces to obtain a plan view, and so on.
  • the server can also map the sphere onto a plane by further mapping methods, determined by the requirements of the actual application scenario; no limitation is imposed here. The following description uses the latitude-longitude mapping shown in FIG. 2. After the server divides the spherical panoramic space into a plurality of spatial objects such as A to I, it can prepare a set of video code streams for each spatial object.
  • each set of video code streams corresponds to one spatial object. When the client user switches the viewing angle during video viewing, the client can obtain the code stream corresponding to the new spatial object according to the new viewing angle selected by the user, and present the video content of that code stream within the new viewing angle.
  • when the video creator (hereinafter referred to as the author) produces a video, he can design a main plot route for the video playback according to the needs of the storyline.
  • the user only needs to watch the video images corresponding to the main plot route to understand the storyline; other video images are optional.
  • the client can thus selectively play the video images corresponding to the storyline and skip presenting the others, which saves transmission and storage space resources for the video data and improves the processing efficiency of the video data.
  • the video image to be presented to the user at each playing time during video playback can be set according to the above-mentioned main plot route, and stringing the video images of the playing times together in time series yields the main plot route. The video image to be presented at each playing time is the video image presented on the spatial object corresponding to that playing time, that is, the video image that the spatial object is to present during that period.
  • the angle of view corresponding to the video image to be presented at each of the playing times may be set as the author's perspective
  • the spatial object that presents the video image in the perspective of the author may be set as the author space object.
  • the code stream corresponding to the author view object can be set as the author view code stream.
  • the author's viewing-angle code stream contains the video stream data of a plurality of video frames (the encoded data of a plurality of video frames), and each video frame can be presented as one image; that is, the author's viewing-angle code stream corresponds to a plurality of images.
  • the image presented by the author's perspective is only part of the panoramic image (or VR image or omnidirectional image) that the entire video is to present.
  • the spatial information of the spatial objects associated with the image corresponding to the author's perspective may be different or the same.
  • the region information corresponding to the perspective can be encapsulated into a metadata track.
  • the client can request from the server the video code stream corresponding to the region information carried in the metadata track, decode it, and then present to the user the story scene picture corresponding to the author's viewing angle.
  • the server does not need to transmit the code streams of viewing angles other than the author's viewing angle (referred to as non-author viewing angles, that is, static viewing-angle streams) to the client, which saves resources such as the transmission bandwidth of the video data.
  • the author's viewing angle presents images of spatial objects preset by the author according to the video storyline.
  • the author spatial objects at different playing times may be different or the same, so the author's viewing angle is a viewing angle that changes with the playing time.
  • the author spatial object is a dynamic spatial object whose position changes, that is, the position in the panoramic space of the author spatial object corresponding to each playing time differs.
  • each of the spatial objects shown in FIG. 2 is divided according to a preset rule and has a fixed relative position in the panoramic space, whereas the author spatial object corresponding to a given playing time is not necessarily one of the fixed spatial objects shown in FIG. 2.
  • the spatial information may include location information of the center point of the spatial object, or location information of the upper-left point of the spatial object, and the spatial information may further include the width and the height of the spatial object.
  • when the coordinate system corresponding to the spatial information is an angular coordinate system, the spatial information may be described by angles, for example by the pitch angle θ (pitch), the yaw angle ψ (yaw), and the roll angle Φ (roll), together with the width and the height of the angular range. When the coordinate system is a pixel coordinate system, the spatial information may be described by a position in the latitude-longitude plan view, or by another geometric solid figure; no limitation is imposed here.
  • FIG. 3 is a schematic diagram of the relative position of the center point of a spatial object in the panoramic space. In FIG. 3, the point O is the center of the sphere of the 360-degree VR panoramic video spherical image, which can be considered as the position of the human eye when viewing the VR panoramic image.
  • point A is the center point of the target spatial object; C and F are boundary points of the target spatial object, passing through point A, along its horizontal coordinate axis; E and D are boundary points of the target spatial object, passing through point A, along its vertical coordinate axis; B is the projection of point A, along the spherical meridian, onto the equator; and I is the starting coordinate point of the horizontal direction on the equator.
  • pitch angle: the deflection angle, in the vertical direction, of the point to which the center position of the image of the target spatial object is mapped on the panoramic spherical (that is, global space) image, such as ∠AOB in FIG. 3;
  • yaw angle: the deflection angle, in the horizontal direction, of the point to which the center position of the image of the target spatial object is mapped on the panoramic spherical image, such as ∠IOB in FIG. 3;
  • roll angle: the rotation angle, about the line connecting the sphere center and the mapped center point, of the image of the target spatial object on the panoramic spherical image, such as ∠DOB in FIG. 3;
  • height of the angular range (the height of the target spatial object in the angular coordinate system): the maximum vertical field-of-view angle of the image of the target spatial object on the panoramic spherical image, such as ∠DOE in FIG. 3;
  • width of the angular range (the width of the target spatial object in the angular coordinate system): the maximum horizontal field-of-view angle of the image of the target spatial object on the panoramic spherical image, such as ∠COF in FIG. 3.
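  • As a purely numeric sketch of these quantities (the half-extent arithmetic below is an assumed simplification that ignores the roll angle and pole wrap-around):

        def region_bounds(yaw_c, pitch_c, width, height):
            """Compute the yaw/pitch bounds of a spherical region, in degrees,
            from its center (yaw_c, pitch_c) and its angular width and height."""
            return {
                "yaw_min": yaw_c - width / 2,
                "yaw_max": yaw_c + width / 2,
                "pitch_min": pitch_c - height / 2,
                "pitch_max": pitch_c + height / 2,
            }

        # e.g. a 120 x 120 degree region centered at yaw=30, pitch=0:
        # region_bounds(30, 0, 120, 120)
        # -> {'yaw_min': -30.0, 'yaw_max': 90.0, 'pitch_min': -60.0, 'pitch_max': 60.0}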
  • the spatial information may include location information of an upper left point of the spatial object, and location information of a lower right point of the spatial object.
  • when the spatial object is not a rectangle, the spatial information may include at least one of a shape type, a radius, and a perimeter of the spatial object.
  • the spatial information can include spatial rotation information for the spatial object.
  • the spatial information may be encapsulated in spatial information data or in a spatial information track. The spatial information data may be the code stream of the video data, metadata of the video data, or a file independent of the video data; the spatial information track may be a track independent of the video data.
  • the spatial information may also be encapsulated in spatially related metadata of the video, for example in an existing box, such as a covi box (coverage information box).
  • the coordinate system for describing the width and height of the target spatial object is shown in FIG. 4: the shaded portion of the sphere is the target spatial object, and the vertices of its four corners are B, E, G, and I.
  • in FIG. 4, O is the center of the sphere of the 360-degree VR panoramic video spherical image. The vertices B, E, G, and I are the points where circles passing through the sphere center and the z-axis (each circle is centered on the sphere center O, with radius equal to that of the sphere of the 360-degree VR panoramic video spherical image; there are two such circles, one passing through points B, A, I, O and one passing through points E, F, G, O) intersect, on the spherical surface, circles parallel to the x-axis and the y-axis (these circles are not centered on the sphere center O; there are two of them, parallel to each other, one passing through points B, D, E and one passing through points I, H, G). C is the center point of the target spatial object, the angle corresponding to side DH represents the height of the target spatial object, the angle corresponding to side AF represents the width of the target spatial object, and sides DH and AF both pass through the point C.
  • the target spatial object may also be obtained by intersecting two great circles through the sphere center with two parallel circles; or by intersecting two yaw-angle circles with two pitch-angle circles, where all the points on a yaw-angle circle have the same yaw angle and all the points on a pitch-angle circle have the same pitch angle; or by intersecting two longitude circles with two latitude circles.
  • the coordinate system for describing the width and height of the target spatial object is shown in FIG. 5: the shaded portion of the sphere is the target spatial object, and the vertices of its four corners are B, E, G, and I.
  • in FIG. 5, O is the center of the sphere of the 360-degree VR panoramic video spherical image. The vertices B, E, G, and I are the points where circles passing through the z-axis (each circle is centered on the sphere center O, with radius equal to that of the sphere of the 360-degree VR panoramic video spherical image; there are two such circles, one passing through points B, A, I and one passing through points E, F, G) intersect, on the spherical surface, circles passing through the y-axis (each likewise centered on O with the same radius; there are two, one passing through points B, D, E and one passing through points I, H, G).
  • C is the center point of the target spatial object, the angle corresponding to side DH represents the height of the target spatial object, the angle corresponding to side AF represents the width of the target spatial object, and sides DH and AF both pass through the point C; the vertex of the angle corresponding to side AF is the point O, and the vertex of the angle corresponding to side DH is also the point O.
  • the vertex of the angle corresponding to side BI is the point L, where L is the intersection of the z-axis with the circle that passes through the two points B and I and is parallel to the x-axis and the y-axis; the vertex of the angle corresponding to side EG is, similarly, the intersection of the z-axis with the circle that passes through the two points E and G and is parallel to the x-axis and the y-axis.
  • the target spatial object may also be obtained by intersecting two circles passing through the x-axis with two circles passing through the z-axis; or by intersecting two circles passing through the x-axis with two circles passing through the y-axis; or by intersecting four circles passing through the sphere center.
  • the coordinate system for describing the width and height of the target spatial object is shown in FIG. 6.
  • the shaded portion of the sphere is the target spatial object, and the vertices of its four corners are B, E, G, and I.
  • in FIG. 6, O is the center of the sphere of the 360-degree VR panoramic video spherical image. The vertices B, E, G, and I are the points where circles parallel to the x-axis and the z-axis (these circles are not centered on the sphere center O; there are two of them, parallel to each other, one passing through points B, A, I and one passing through points E, F, G) intersect, on the spherical surface, circles parallel to the x-axis and the y-axis (likewise not centered on O; there are two of them, parallel to each other, one passing through points B, D, E and one passing through points I, H, G).
  • C is the center point of the target spatial object, the angle corresponding to side DH represents the height of the target spatial object, the angle corresponding to side AF represents the width of the target spatial object, and sides DH and AF both pass through the point C. Sides BI, EG, and DH correspond to the same angle, and sides BE, IG, and AF correspond to the same angle; the vertex of the angle corresponding to sides BE, IG, and AF is the point O, and the vertex of the angle corresponding to sides BI, EG, and DH is also the point O.
  • the target spatial object may also be obtained by intersecting two circles that are parallel to the y-axis and the z-axis but do not pass through the sphere center with two circles that are parallel to the y-axis and the x-axis but do not pass through the sphere center.
  • the target spatial object may also be obtained by intersecting two circles that are parallel to the y-axis and the z-axis but do not pass through the sphere center with two circles that are parallel to the z-axis and the x-axis but do not pass through the sphere center.
  • the points J and L in FIG. 5 play the same roles as the point J in FIG. 4: the vertex of the angle corresponding to side BE is the point J, and the vertex of the angle corresponding to side BI is the point L; in FIG. 6, the vertices of the angles corresponding to sides BE and BI are both the point O.
  • FIG. 11 is a schematic diagram of a mapping relationship between a spatial object and video data according to an embodiment of the present invention.
  • FIG. 11(a) shows an omnidirectional video (the large image on the left) and a sub-region of the omnidirectional video (the panel on the right), and
  • FIG. 11(b) shows the video space (a sphere) corresponding to the omnidirectional video and the spatial object corresponding to the sub-region of the omnidirectional video (the dark portion on the sphere).
  • the standard specifies a timed metadata track for regions on a spherical surface: the metadata box of such a track contains a box describing the spherical region, while the data in the media data box contains the information of the spherical region itself. The metadata box also describes the intent of the timed metadata track, that is, what the spherical region is used for. Two timed metadata tracks are described in the standard: the recommended viewport timed metadata track and the initial viewpoint timed metadata track.
  • the recommended viewport track describes the region of the viewport recommended for presentation to the terminal, and the initial viewpoint track describes the initial presentation direction for omnidirectional video viewing.
  • as shown in FIG. 7, the server side 701 includes content preparation 7011 and content service 7012.
  • the content preparation 7011 may be a media data collection device or a transcoder of media data, and is responsible for generating the streaming media content and the related metadata, for example the compression, encapsulation, and storage/transmission of media files (video, audio, and the like).
  • the content preparation 7011 can generate the metadata information and the file in which the metadata source information is located. The metadata can be encapsulated as a metadata track, or encapsulated in the SEI of the video data track.
  • a sample in the metadata track is a partial region of the omnidirectional video specified by the content creator (for example, the producer or the director), and the metadata source information is encapsulated in the metadata track or carried in the MPD.
  • the source information of the metadata can be carried in the SEI.
  • the source information of the metadata may indicate that the metadata describes the viewing region recommended by the producer of the content or by the director.
  • the content service 7012 can be a network node, such as a content distribution network (CDN) or a proxy server.
  • the content service 7012 may obtain the stored or transmitted data from the content preparation 7011 and forward it to the terminal side 702; or it may obtain the region information fed back by the terminals on the terminal side 702, generate a region metadata track or region SEI information according to the feedback information, and generate a file carrying the source of the region information.
  • the generated region metadata track or region SEI information may be based on the viewing information fed back for each region of the omnidirectional video: samples describing the users' regions of interest are generated from one or more statistically selected regions, and the samples are encapsulated in a metadata track or in the SEI, with the region metadata source information encapsulated in the track, carried in the MPD, or carried in the SEI.
  • in this case the source information indicates that the region metadata is derived from server statistics, meaning that the region described in the metadata track is the region of interest to most users.
  • the region information in the region metadata track or the region SEI may also be the region information fed back by a certain user specified by the server; a region metadata track or region SEI is generated according to that feedback information, and the source information of the region metadata, carried in the metadata track, in the MPD, or in the SEI, describes that the region metadata comes from that user.
  • content preparation 7011 and the content service 7012 can be on the same server hardware device or different hardware devices. Both content preparation 7011 and content service 7012 may include one or more hardware devices.
  • the terminal side 702 obtains the media data and presents it; the terminal side 702 also obtains the region information of the content presented to the user within the omnidirectional video and feeds that region information back to the server side 701; or the terminal side 702 obtains the metadata of the media data and the data carrying the metadata source information, analyzes the metadata source information, parses the corresponding metadata according to the metadata source selected by the end user, and obtains the region information for media presentation.
  • the processing methods of the modules related to the source information of the metadata track are as follows:
  • the source information may indicate that the area related to the metadata is recommended by the producer of the content or recommended by the director or may be recommended by a specified user, or may be a user according to relevant statistics The area of interest; the source information may also indicate the perspective of the omni-directional video that the content creator or director recommends to the user, or the most interesting area recommended by the server or the perspective recommended by a certain user.
  • obtain the information of the region, which here mainly refers to some metadata of the region. The region information may indicate a region specified by the producer or the director of the content, the users' most-interested region obtained from statistics on user feedback, or the region in which an end user views the omnidirectional video.
  • the region may be a two-dimensional planar region or a spherical region. Two-dimensional planar region information is represented by the coordinate position of the upper-left pixel of the region in the two-dimensional plane and the width and height of the region in that plane.
  • if the region is on a sphere, the region information is represented by the position of the center point of the region on the sphere and the width and height coverage angles of the region on the sphere, for example in the manner shown in Figures 1 to 6.
  • the region may also be a direction on the sphere or a point on the sphere, in which case the representation of the region carries no width and height information.
  • encapsulate the related metadata and the source of the metadata in a metadata track to produce a metadata track file; or encapsulate the metadata in a metadata track and carry the source of the metadata in the MPD to produce an MPD file; or encapsulate the metadata and its source in the SEI to produce a bitstream file.
  • the file generated by the module can be stored locally or sent to a receiving end.
  • the receiving end can be the terminal side or the content service side.
  • the module handling the source information of the metadata track may be a separate sub-module of the content preparation 7011 of FIG. 7, of the content service 7012, or of the terminal side 702, or the related functions may be integrated into the above devices.
  • the technical solutions of the embodiments of the present invention mainly apply to the content preparation side (transcoder), intelligent network nodes (CDN, proxy server), and the terminal player side.
  • when the transcoding server, the network server, or the terminal generates region metadata, the metadata is encapsulated into an independent track or encapsulated in the SEI, and the source of the metadata is encapsulated in the metadata track, in the SEI, or in the MPD file.
  • as shown in FIG. 8, an embodiment of the present invention discloses a media information processing method S80, and the method S80 includes:
  • S801: Obtain metadata information of the media data, where the metadata information includes source information of the metadata, the source information is used to represent a recommender of the media data, and the media data is omnidirectional media data;
  • S802: Process the media data according to the source information of the metadata.
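The two steps map directly onto a small processing routine. The following Python sketch is illustrative only; the function name, the dictionary layout, and the print-based presentation are assumptions of this sketch, not part of the embodiment:

    SOURCE_TYPES = {
        0: "recommended by the content producer or director",
        1: "statistically most-interested region of users",
        2: "viewing region of a particular user",
    }

    def process_media_info(metadata_info, media_data):
        # S801: the metadata information carries the source information of the metadata
        source_type = metadata_info["source_type"]
        # S802: process the omnidirectional media data according to the source,
        # e.g. present the recommendation to the user before rendering the region
        label = SOURCE_TYPES.get(source_type, "reserved")
        print(f"region {metadata_info['region']} is {label}; "
              f"{len(media_data)} bytes of omnidirectional media available")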
  • as shown in FIG. 9, an embodiment of the present invention discloses a media information processing apparatus 90.
  • the apparatus 90 includes an information acquiring module 901 and a processing module 902.
  • the information obtaining module 901 is configured to obtain metadata information of the media data, where the metadata information includes source information of the metadata, the source information is used to represent a recommender of the media data, and the media data is omnidirectional media data.
  • the processing module 902 is configured to process the media data according to the source information of the metadata.
  • in an implementation of an embodiment of the present invention, the source information of the metadata is carried in the metadata track.
  • a description box for the source of the sample data is added to the metadata track, and the source of the track is described in that box.
  • the format of the box added in this embodiment is as follows:
    SourceInformationBox extends Box('sinf') {
        Unsigned int(8) source_type;  // indicates the metadata source: director preset / pre-collected statistics / popular individual
    }
  • in this example, source_type describes the source information of the track in which the above box is located. When source_type = 0, the region information in the track is recommended by the producer of the video, or comes from the content producer or director (for example, a viewport recommended by the director); the terminal side can use the information in the track to present to the user the media content that the director intends to present. When source_type = 1, the region information in the track is the region of interest of most users, or comes from the statistically most-interested region; the terminal side can use the information in the track to present to the user the regions of the omnidirectional media in which most users are interested. When source_type = 2, the region information in the track is the region in which a particular end user views the omnidirectional media, or comes from a specific person; the terminal side can reproduce that user's viewport.
  • it can be understood that the above types are only examples given to help understand this embodiment of the present invention, not a specific limitation; source_type may take other values or represent other source types.
  • the processing flow for the terminal side to obtain the information of the above metadata track is as follows:
  • 1. The terminal acquires the metadata track and parses the metadata box (moov box) in the metadata track, then parses that box to obtain the sinf box;
  • 2. The terminal parses the sinf box to obtain the source_type information: if source_type = 0, the region information in the track is recommended by the producer of the video; if source_type = 1, it is the region of interest of most users; if source_type = 2, it is the region in which a particular end user views the omnidirectional media. Assume that source_type = 0 in the metadata acquired by the terminal;
  • 3. The terminal presents the information source to the user and accepts the user's selection;
  • 4. If the user chooses to watch the viewport recommended by the producer of the video or by the director, the terminal parses the samples in the metadata track to obtain the region information, and presents to the user the media corresponding to the obtained region in the omnidirectional media.
  • the metadata track thus carries information on the source of the metadata, describing that the metadata comes from the producer of the omnidirectional video, from a user watching the omnidirectional video, or from statistics of viewed viewports; equivalently, the intent described by this information is that the metadata is a recommendation of the producer of the omnidirectional video, a recommendation of a user watching the omnidirectional video, or a recommendation derived from statistics of viewing-viewport data.
  • when receiving region metadata, the client can therefore distinguish metadata from different sources; where multiple sets of region metadata exist, the user can select the recommended region to watch according to individual needs.
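A minimal Python sketch of steps 1 and 2 of this flow is given below; it assumes, for simplicity, that the sinf box sits directly under the moov box and that all boxes use 32-bit sizes, whereas a real parser would recurse through the full ISO/IEC 14496-12 box hierarchy and handle 64-bit sizes:

    import struct

    def iter_boxes(buf, offset=0, end=None):
        # Walk sibling boxes: each box starts with a 32-bit size and a
        # 4-character type, per ISO/IEC 14496-12.
        end = len(buf) if end is None else end
        while offset + 8 <= end:
            size, box_type = struct.unpack_from(">I4s", buf, offset)
            if size < 8:
                break  # malformed or 64-bit size; out of scope for this sketch
            yield box_type.decode("ascii"), buf[offset + 8:offset + size]
            offset += size

    def find_source_type(track_bytes):
        # Step 1: locate the moov box, then the sinf box assumed inside it.
        for box_type, body in iter_boxes(track_bytes):
            if box_type == "moov":
                for inner_type, inner_body in iter_boxes(body):
                    if inner_type == "sinf":
                        # Step 2: the first payload byte carries source_type.
                        return inner_body[0]
        return None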
  • in an implementation of the present invention, the source information of the metadata is carried in the MPD.
  • a source information descriptor is added to the standard element SupplementalProperty/EssentialProperty specified in ISO/IEC 23009-1. The scheme of this descriptor is "urn:mpeg:dash:purpose", indicating that the descriptor gives the source of the information carried in a representation of the MPD.
  • the value of the descriptor is defined as follows:
        value = 0: the region information comes from the content producer or director;
        value = 1: the region information comes from the statistically most-interested region of users;
        value = 2: the region information is the region in which a particular end user views the omnidirectional media.
  • the above descriptor can be placed in the adaptationSet element of the MPD, or in a representation element of the MPD. In the specific example, the descriptor is in the representation element, as in the reconstructed snippet below.
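The example MPD appears in the original filing only as an image; the snippet below is a hedged reconstruction, in which the id, bandwidth, and segment addressing are illustrative assumptions and only the descriptor line follows the scheme defined above:

    <AdaptationSet segmentAlignment="true">
      <Representation id="metadata" bandwidth="500">
        <!-- value="0" marks a producer/director recommendation -->
        <SupplementalProperty schemeIdUri="urn:mpeg:dash:purpose" value="0"/>
        <BaseURL>metadata.mp4</BaseURL>
      </Representation>
    </AdaptationSet>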
  • in addition to using a descriptor to describe the source of a representation, an attribute describing the source of the representation may instead be added to the adaptationSet element or to the representation element, for example an attribute named sourceType. When sourceType = 0, the region information is recommended by the producer of the video, or comes from the content producer or director (for example, a viewport recommended by the director), and the terminal side can use it to present to the user the media content that the director intends to present; when sourceType = 1, the region information is the region of interest of most users, or comes from the statistically most-interested region, and the terminal side can use it to present to the user the regions of the omnidirectional media in which most users are interested; when sourceType = 2, the region information is the region in which a particular end user views the omnidirectional media, or comes from a specific person, and the terminal side can reproduce that user's viewport. A reconstructed example follows.
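As with the descriptor variant, this attribute-based example is a hedged reconstruction in which sourceType is the new attribute described above and everything else is an illustrative assumption:

    <AdaptationSet segmentAlignment="true">
      <Representation id="metadata" bandwidth="500" sourceType="0">
        <BaseURL>metadata.mp4</BaseURL>
      </Representation>
    </AdaptationSet>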
  • in the above two MPD examples, a descriptor and an attribute are respectively used to describe that the region information in the file metadata.mp4 referenced in the representation is recommended by the producer of the video.
  • the processing flow for the terminal side to obtain the above example information is as follows:
  • 1. The terminal obtains the MPD file and parses it; if an adaptationSet or representation element contains a descriptor whose scheme is urn:mpeg:dash:purpose, the terminal parses the value of the descriptor;
  • 2. If value = 0, the region information in the representation is recommended by the producer of the video; if value = 1, it is the region of interest of most users; if value = 2, it is the region in which a particular end user views the omnidirectional media. Assume that value = 0 in the MPD obtained by the terminal;
  • 3. The terminal presents the information source to the user and accepts the user's selection;
  • 4. If the user chooses to watch the viewport recommended by the producer or by the director, the terminal constructs representation segment requests according to the information in the MPD, obtains the segments, parses the region information of the segments, and presents to the user the media corresponding to the obtained region in the omnidirectional media.
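Step 1 of this flow can be sketched in a few lines of Python; namespace prefixes are omitted for brevity, and the lookup mirrors the reconstructed MPD snippets above:

    import xml.etree.ElementTree as ET

    def find_purpose_value(mpd_xml):
        # Scan every element for the source information descriptor
        # (real MPDs carry an XML namespace, omitted here for brevity).
        root = ET.fromstring(mpd_xml)
        for prop in root.iter("SupplementalProperty"):
            if prop.get("schemeIdUri") == "urn:mpeg:dash:purpose":
                return int(prop.get("value", "0"))
        return None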
  • in an embodiment of the present invention, the source information of the metadata is carried in the SEI.
  • the SRC in the syntax represents a specific value, such as 190, which is not limited here.
  • when the payload type in the SEI is SRC, the SEI payload carries the source information of the region metadata; its syntax includes a source_type field, whose semantics are described below.
  • the source_type in the payload describes the source information of the region information described by the above SEI. When source_type = 0, the region information described by the SEI is recommended by the producer of the video, or comes from the content producer or director, and the terminal side can use it to present to the user the media content that the director intends to present; when source_type = 1, the region information is the region of interest of most users, or comes from the statistically most-interested region, and the terminal side can use it to present to the user the regions of the omnidirectional media in which most users are interested; when source_type = 2, the region information is the region in which a particular end user views the omnidirectional media, and the terminal side can reproduce that user's viewport.
  • the processing flow for the terminal side to obtain the above video bitstream is as follows:
  • 1. The terminal acquires the video bitstream and parses the NALU header information in the bitstream; if the parsed header type is the SEI type, the terminal parses the SEI NALU to obtain the payload type of the SEI;
  • 2. If the parsed payload type is 190, the SEI carries the source information of the region metadata; the terminal continues parsing to obtain the source_type information: if source_type = 0, the region information is recommended by the producer of the video; if source_type = 1, it is the region of interest of most users; if source_type = 2, it is the region in which a particular end user views the omnidirectional media. Assume that source_type = 0 in the SEI obtained by the terminal;
  • 3. The terminal presents the information source to the user and accepts the user's selection;
  • 4. If the user chooses to watch the viewport recommended by the producer or by the director, the terminal parses the region information in the video bitstream, obtains the region information, and presents to the user the media corresponding to the obtained region in the omnidirectional media.
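A heavily simplified sketch of steps 1 and 2 follows; it assumes the payload type and payload size each fit in a single byte (values below 255) and ignores emulation-prevention bytes, both of which a real H.264/H.265 parser must handle:

    SRC_PAYLOAD_TYPE = 190  # the example SRC value used above

    def source_type_from_sei(sei_payload):
        # sei_payload: the SEI NALU body after the NALU header.
        payload_type = sei_payload[0]
        payload_size = sei_payload[1]
        if payload_type == SRC_PAYLOAD_TYPE and payload_size >= 1:
            return sei_payload[2]  # first payload byte is source_type
        return None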
  • in an embodiment of the present invention, in addition to the types of source information listed in the above embodiments, the semantics of the source information may be extended, for example with the following fields:
  • language: the language of the subsequent character string, using the language code words of ISO-639-2/T to represent the various languages.
  • sourceDescription: a character string describing the content of the source of the metadata, for example the value "a director's cut", indicating that the metadata comes from or is recommended by the author, or the name of the recommender, for example the value "Tom", indicating that the metadata comes from or is recommended by Tom.
  • the semantics of the source information may also be extended with a date field that describes the time at which the metadata track was generated or recommended, for example Mon, 04 Jul 2011 05:50:30 GMT.
  • the semantics of the source information may likewise be extended with the following field:
  • reason_description: describes the reason for recommending the metadata, or is description information of the video content corresponding to the recommended metadata.
  • the semantics of the source information may further be extended with the following field:
  • person_description: describes the age information of the user recommending the metadata, or a statistically derived age range, such as child, youth, or elderly, or 0-10, 10-20, and so on.
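The exact syntax of these extensions appears in the original filing as images; the following ISOBMFF-style pseudo-syntax is a hedged reconstruction in which the field widths and ordering are assumptions:

    SourceInformationBox extends Box('sinf') {
        Unsigned int(8)  source_type;
        Unsigned int(8)  language;             // ISO-639-2/T language code word
        String           sourceDescription;    // e.g. "a director's cut" or "Tom"
        Unsigned int(64) date;                 // generation/recommendation time
        String           reason_description;   // recommendation reason or content description
        String           person_description;   // recommender age info or age range
    }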
  • in an embodiment of the present invention, the SourceInformationBox may be included in the scheme information box.
  • source_type takes an integer value indicating the source type of the metadata. The source types represented by the different values are as follows:
        0: from the content producer or director;
        1: from the statistically most-interested region;
        2: from a specific person;
        other values are reserved.
  • date describes the time at which the metadata was generated or recommended.
  • ID_lenght describes the length of ID_description; the value is the length of ID_description minus 1.
  • ID_description describes the name of the recommender.
  • reason_lenght describes the length of reason_description; the value is the length of reason_description minus 1.
  • reason_description describes the reason for recommending the metadata, or description information of the video content corresponding to the recommended metadata.
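The syntax of this box is shown only as an image in the original filing; a reconstruction consistent with the semantics above, with the field widths assumed, is:

    aligned(8) class SourceInformationBox extends Box('sinf') {
        unsigned int(8)  source_type;
        unsigned int(64) date;
        unsigned int(8)  ID_lenght;
        string           ID_description;
        unsigned int(8)  reason_lenght;
        string           reason_description;
    }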
  • it can be understood that the SourceInformationBox can also take other names, for example natrueInformationBox.
  • in one possible implementation, an example of the natrueInformationBox is as follows:
        Box Type: 'ninf'
        Container: Scheme Information box ('schi')
        Mandatory: No
        Quantity: Zero or one
  • the syntax of the natrueInformationBox is sketched below.
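This syntax is also given in the original filing only as an image; the reconstruction below again assumes the field widths:

    aligned(8) class natrueInformationBox extends Box('ninf') {
        unsigned int(8)  natrue_type;
        unsigned int(64) date;
        unsigned int(8)  ID_lenght;
        string           ID_description;
        unsigned int(8)  reason_lenght;
        string           reason_description;
    }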
  • natrue_type takes an integer value indicating the source type of the metadata. The source types represented by the different values are as follows:
        1: from the content producer or director;
        2: from the statistically most-interested region;
        3: from a specific person;
        other values are reserved.
  • date describes the time at which the metadata is generated or recommended; date can be an integer time in seconds, or a representation of time in another form.
  • ID_lenght describes the length of ID_description; the value is the length of ID_description minus 1.
  • ID_description describes the name of the recommender.
  • reason_lenght describes the length of reason_description; the value is the length of reason_description minus 1.
  • reason_description describes the reason for recommending the metadata, or description information of the video content corresponding to the recommended metadata.
  • in a specific example, natrue_type is an integer that indicates the type of nature. The following values for natrue_type are specified:
        1: the recommended viewport timed metadata track is used for indicating a director's cut;
        2: the recommended viewport timed metadata track is used for indicating the statistically most-viewed viewport;
        3: the recommended viewport timed metadata track is used for indicating a particular person or user.
        Other values of natrue_type are reserved.
  • date is an integer that declares the recommended time of the metadata (in seconds since midnight, Jan. 1, 1904, in UTC time).
  • ID_lenght indicates the length in bytes of the ID_description field minus one.
  • ID_description specifies the name of the recommending person; it is a null-terminated string in UTF-8 characters.
  • reason_lenght indicates the length in bytes of the reason_description field minus one.
  • reason_description specifies the recommendation reason or the description of the media content corresponding to the metadata; it is a null-terminated string in UTF-8 characters.
  • the syntax in all of the above embodiments can also be carried in a media track; for example, a SourceInformationBox or a natrueInformationBox may be carried in the tref box of a media track. The tref box, described in the ISO/IEC 14496-12 standard, is the track reference box, which describes the tracks associated with the current media track.
  • the SourceInformationBox or natrueInformationBox can be an extension of the tref box: aligned(8) class SourceInformationBox extends tref('sinf', 0, 0).
  • in an embodiment of the present invention, the information on the intent/source of the metadata may also be represented by a sample entry type: the sample entry type for the region of interest of most users may be 'mroi', that for a particular user's recommendation may be 'proi', and that for a recommendation by the author or the director may be 'droi', as sketched below.
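As an illustrative sketch only (the class names and the MetadataSampleEntry base class are assumptions of this sketch, not defined by the embodiment), such sample entries could be declared as:

    class MostInterestedRegionSampleEntry()      extends MetadataSampleEntry('mroi') { }
    class PersonRecommendedRegionSampleEntry()   extends MetadataSampleEntry('proi') { }
    class DirectorRecommendedRegionSampleEntry() extends MetadataSampleEntry('droi') { }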
  • in an embodiment of the present invention, the terminal side presents to the user the description information of the metadata tracks that can be recommended, and the user selects the recommendation to watch according to the description information.
  • the terminal acquires the metadata track corresponding to the user's selection, parses the obtained metadata track to obtain the region information in the track, and presents the omnidirectional media according to the region information.
  • alternatively, the terminal feeds the user's selected recommendation back to the content server side; the content service side obtains the corresponding metadata track according to the fed-back user selection, parses the metadata track to obtain the region information, and sends the media data corresponding to the region information to the terminal.
  • on the terminal side, the region information in the metadata track can also be used to create a movable viewing environment for the user; the viewing environment rotates according to the yaw angle, pitch angle, and roll angle in the region information to simulate the user's viewing motion.
  • for example, the viewing environment can be a rotatable seat that can move left and right, pitch, or rotate according to the region information. A sketch of this mapping follows.
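As a sketch, the mapping from a region sample to such a seat could look like the following Python, where the seat interface (set_yaw/set_pitch/set_roll) is an assumed API rather than anything specified by the embodiment:

    def drive_seat(region_sample, seat):
        # Rotate the motorized viewing seat to match the spherical region
        # described by the metadata sample (angles assumed to be in degrees).
        seat.set_yaw(region_sample["yaw"])
        seat.set_pitch(region_sample["pitch"])
        seat.set_roll(region_sample["roll"])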
  • FIG. 10 is a schematic diagram showing the hardware structure of a computer device 100 according to an embodiment of the present invention.
  • the computer device 100 can be used as an implementation of an apparatus for processing streaming media information, and can also be used as an implementation of a method for processing streaming media information.
  • the computer device 100 includes a processor 101, a memory 102, an input/output interface 103, and a bus 105, and may further include a communication interface 104.
  • the processor 101, the memory 102, the input/output interface 103, and the communication interface 104 are communicatively connected to each other through the bus 105.
  • the processor 101 can be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, configured to execute related programs.
  • the processor 101 may be an integrated circuit chip with signal processing capability. In an implementation process, each step of the above methods may be completed by an integrated logic circuit of hardware in the processor 101 or by instructions in the form of software.
  • the processor 101 described above may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention can thereby be implemented or carried out.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the steps of the methods disclosed in the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor.
  • the software module can be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 102; the processor 101 reads the information in the memory 102 and, in combination with its hardware, completes the functions to be performed by the modules included in the apparatus for processing streaming media information provided by the embodiments of the present invention, or performs the method for processing streaming media information provided by the method embodiments of the present invention.
  • the memory 102 can be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 102 can store an operating system as well as other application programs.
  • when the functions to be performed by the modules included in the apparatus for processing streaming media information provided by the embodiments of the present invention, or the method for processing streaming media information provided by the method embodiments of the present invention, are implemented by software or firmware, the program code implementing the technical solutions provided by the embodiments of the present invention is stored in the memory 102, and the processor 101 performs the operations to be performed by the modules included in the apparatus, or performs the media data processing method provided by the method embodiments of the present invention.
  • the input/output interface 103 is for receiving input data and information, and outputting data such as an operation result.
  • the communication interface 104 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the computer device 100 and other devices or communication networks. It can serve as the acquiring module or the sending module in the processing apparatus.
  • Bus 105 may include a path for communicating information between various components of computer device 100, such as processor 101, memory 102, input/output interface 103, and communication interface 104.
  • it should be noted that, although the computer device 100 shown in FIG. 10 only shows the processor 101, the memory 102, the input/output interface 103, the communication interface 104, and the bus 105, persons skilled in the art will understand that, in a specific implementation process, the computer device 100 also includes other devices necessary for normal operation; for example, it may also include a display for displaying the video data to be played.
  • computer device 100 may also include hardware devices that implement other additional functions, depending on the particular needs.
  • the computer device 100 may also include only the components necessary to implement the embodiments of the present invention, and does not necessarily include all of the devices shown in FIG. 10.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Embodiments of the present invention disclose a media information processing method and apparatus. The method includes: obtaining metadata information of media data, where the metadata information includes source information of the metadata, the source information is used to represent a recommender of the media data, and the media data is omnidirectional media data; and processing the media data according to the source information of the metadata. According to the media information processing method and apparatus of the embodiments of the present invention, a client can refer to information about the recommender of the media data when performing data processing, which enriches the user's choices and enhances user experience.

Description

一种媒体信息的处理方法及装置 技术领域
本发明涉及流媒体传输技术领域,尤其涉及一种媒体信息的处理方法及装置。
背景技术
ISO/IEC 23090-2标准规范又称为OMAF(Omnidirectional media format,全向媒体格式)标准规范,该规范定义了一种媒体应用格式,可以在应用中实现全向媒体的呈现,全向媒体主要是指全向视频(360°视频)和相关音频。OMAF规范首先指定了可以用于将球面视频转换为二维视频的投影方法的列表,其次是如何使用ISO基本媒体文件格式(ISO base media file format,ISOBMFF)存储全向媒体和该媒体相关联的元数据,以及如何在流媒体系统中封装全向媒体的数据和传输全向媒体的数据,例如通过基于超文本传输协议(HyperText Transfer Protocol,HTTP)的动态自适应流传输(Dynamic Adaptive Streaming over HTTP,DASH),ISO/IEC 23009-1标准中规定的动态自适应流传输。
ISO基本媒体文件格式是由一系列的盒(box)组成,在box中可以包括其他的box,在这些box包括元数据box和媒体数据box,元数据box(moov box)中包括的是元数据,媒体数据box(mdat box)中包括的是媒体数据,元数据的box和媒体数据的box可以是在同一个文件中,也可以是在分开的文件中;如果具有时间属性的元数据采用ISO基本媒体文件格式封装,元数据box中包括的是描述具有时间属性的元数据的元数据,媒体数据box中包括的是具有时间属性的元数据。
在现有的技术中,因为客户端无法准确识别数据的来源,所以客户端在根据元数据选择媒体数据时,并不能充分的满足用户的需求,用户体验较差。
发明内容
本发明实施例提供了一种媒体信息的处理方法及装置,可以使得客户端能够根据元数据的来源选择不同的处理方式。
在本发明第一方面的实施例中,公开了一种媒体信息的处理方法,所述方法包括:
得到媒体数据的元数据信息,所述元数据信息包括元数据的来源信息,所述来源信息用于表示所述媒体数据的推荐方,所述媒体数据是全向媒体数据;
根据所述元数据的来源信息处理所述媒体数据。
本发明实施例的全向媒体数据可以为视频数据或者音频数据。在一种可能的实现方式中,全向媒体的有关示例可以参考ISO/IEC 23090-2标准规范的有关规定。
一种可能的实现方式中,元数据指的是视频数据的一些属性信息,例如对应的视频数据的时长,码率,帧率,在球面坐标系中的位置等。
在一种可能的实现方式中,全向视频的区域指的是全向视频所对应的视频空间中的区域。
一种可能的实现方式中,元数据的来源信息可以表示元数据对应的视频数据是由全向视频的作者推荐的,或者可以表示元数据对应的视频数据是由全向视频的某一用户推荐的,或者可以表示元数据对应的视频数据是统计全向视频的多个用户的观看结果后进行推荐的。
根据本发明实施例的媒体信息的处理方法,客户端在进行数据处理时可以参考媒体数据的推荐方的信息,从而丰富了用户的选择,增强了用户体验。
在本发明实施例的一种可能的实现方式中,所述得到媒体数据的元数据信息,包括:
得到所述媒体数据的元数据轨迹(track),所述元数据轨迹中包括所述元数据的来源信息。
一种可能的实现方式中,可以通过媒体展示描述文件得到元数据轨迹的地址信息,然后向该地址发送信息获取请求,接收得到媒体数据的元数据轨迹。
一种可能的实现方式中,可以通过单独的文件得到元数据轨迹的地址信息,然后向该地址发送信息获取请求,接收得到媒体数据的元数据轨迹。
一种可能的实现方式中,服务器向客户端发送媒体数据的元数据轨迹。
一种可能的实现方式中,轨迹(track)是指一系列有时间属性的按照ISO基本媒体文件格式(ISO base media file format,ISOBMFF)的封装方式的样本。比如视频track,视频样本是通过将视频编码器编码每一帧后产生的码流按照ISOBMFF的规范封装后得到的。轨迹的具体定义可以参考ISO/IEC 14496-12中的有关说明。
一种可能的实现方式中,媒体展示描述文件的有关属性和数据结构可以参考ISO/IEC 23009-1中的有关说明。
一种可能的实现方式中,元数据的来源信息可以存储在元数据轨迹中的一个新增的盒(box)中,通过解析该box的数据得到元数据的来源信息。
一种可能的实现方式中,元数据的来源信息可以是在元数据轨迹中现有的一个盒中增加的一个属性,通过解析该属性得到元数据的来源信息。
通过将元数据的来源信息封装在元数据轨迹中,客户端在得到元数据轨迹时就可以得到元数据的来源信息,使得客户端可以综合考虑元数据的其它属性和元数据的来源信息进行后续对相关媒体数据的处理。
在本发明实施例的一种可能的实现方式中,所述得到媒体数据的元数据信息,包括:
得到所述媒体数据的媒体展示描述文件,所述媒体展示描述文件中包括所述元数据的来源信息。
客户端可以通过向服务器发送HTTP请求的方式得到媒体展示描述文件,也可以由服务器直接将媒体展示描述文件推送给客户端。客户端也可以通过其它可能的方式得到媒体展示描述文件,例如,从其它客户端侧交互得到媒体展示描述文件。
一种可能的实现方式中,媒体展示描述文件的有关属性和数据结构可以参考ISO/IEC23009-1中的有关说明。
一种可能的实现方式中,元数据的来源信息可以是描述子中指示的信息,元数据的来源信息也可以是一个属性信息。
一种可能的实现方式中,元数据的来源信息可以是在媒体展示描述文件中的自适应集(adaptation set)层级中或者是在表示(representation)层级中。
在本发明实施例一种可能的实现方式中,所述得到媒体数据的元数据信息,包括:
得到包括所述媒体数据的码流,其中,所述码流还包括辅助增强信息(supplementary enhancement information,SEI),所述辅助增强信息中包括所述元数据的来源信息。
一种可能的实现方式中,客户端可以通过向服务器发送媒体数据获取请求,然后接收服务器发送的媒体数据。例如客户端可以通过媒体展示描述文件中的有关属性和地址信息构建统一资源定位符(Uniform Resource Locator,URL),然后向该URL发送HTTP请求,然后接收相应的媒体数据。
一种可能的实现方式中,客户端也可以接收服务器推送的媒体数据流。
在本发明实施例的一种可能的实现方式中,所述元数据的来源信息是一个来源类型标识。来源类型标识的值或不同的来源类型标识可以表示对应的来源类型。例如,可以用一个比特的flag来表示来源类型,或者用更多比特的字段来标识来源类型。在一个示例中,客户端侧存储了来源类型标识和来源类型的对应关系的文件,客户端可以根据来源类型标识的不同的值,或者不同的来源类型标识确定对应的来源类型。
在一种可能的实现方式中,一种来源类型对应一个推荐方,例如来源类型可以是视频的作者推荐,或者是某一个用户推荐,或者是统计多个用户的观看结果后的推荐。
在本发明实施例的一种可能的实现方式中,所述元数据的来源信息包括媒体数据的推荐方的语义表示。例如可以使用ISO-639-2/T中的码字来表示各种语义。
在本发明实施例的一种可能的实现方式中,根据所述元数据的来源信息处理所述元数据对应的媒体数据,可以包括如下几种实现方式:
如果客户端侧还没有得到元数据对应的媒体数据,客户端侧可以根据用户对来源信息的选择,从服务器侧或者其它终端侧请求相对应的媒体数据。
如果客户端侧已经得到元数据对应的媒体数据,客户端侧可以根据用户对来源信息的选择,呈现或者传输该媒体数据。
如果该方法是在服务器侧执行的,服务器可以根据元数据的来源信息将所述媒体数据推送给客户端。
在一种可能的实现方式中,服务器可以根据收到的多个元数据的来源信息确定要推送的媒体数据。例如根据某一标准从多个推荐中进行选择,然后根据选择结果进行媒体数据的推送。或者根据某一标准对多个推荐进行计算,然后根据计算的结果进行媒体数据的推送。
本发明第二方面实施例提供了一种媒体信息的处理装置。所述装置包括:
信息获取模块,用于得到媒体数据的元数据信息,所述元数据信息包括元数据的来源信息,所述来源信息用于表示所述媒体数据的推荐方,所述媒体数据是全向媒体数据;处理模块,用于根据所述元数据的来源信息处理所述媒体数据。
根据本发明实施例的媒体信息的处理装置,客户端在进行数据处理时可以参考媒体数据的推荐方的信息,从而丰富了用户的选择,增强了用户体验。
在一种可能的实现方式中,所述信息获取模块具体用于:得到所述媒体数据的元数据轨迹(track),所述元数据轨迹中包括所述元数据的来源信息。
在一种可能的实现方式中,所述信息获取模块具体用于:得到所述媒体数据的媒体展示描述文件,所述媒体展示描述文件中包括所述元数据的来源信息。
在一种可能的实现方式中,所述信息获取模块具体用于:得到包括所述媒体数据的码流,其中,所述码流还包括辅助增强信息(supplementary enhancement information,SEI),所述辅助增强信息中包括所述元数据的来源信息。
在一种可能的实现方式中,所述元数据的来源信息是一个来源类型标识。
在一种可能的实现方式中,所述元数据的来源信息包括媒体数据的推荐方的语义表示。
本发明装置实施例的具体示例和实现方式可以参考上述第一方面方法实施例中的相关举例,在此不再赘述。
本发明第三方面实施例公开了一种媒体信息的处理方法,所述方法包括:
接收多个客户端发送的用户视角的信息,所述用户视角信息用于指示用户观看全向媒体数据时的视角;根据全部的用户视角的信息确定目标视角;发送所述目标视角对应的媒体数 据。
根据本发明实施例的媒体信息的处理方法,可以对多位用户观看同一视频的视角进行统计分析,从而为后续用户观看该视频时提供有效的视角推荐方式,增强了用户体验。
一种可能的实现方式中,该方法由服务器侧执行,例如由内容准备服务器,内容分发网络(Content distribution network,CDN)或者代理服务器。
一种可能的实现方式中,客户端发送的用户视角的信息可以是通过单独的文件发送的,也可以包括在客户端发送的其它数据文件中。
一种可能的实现方式中,全向媒体和视角的说明和示例可以参考前面的第一方面实施例和具体实施方式部分的示例,在此不再赘述。
一种可能的实现方式中,根据全部的用户视角的信息确定目标视角,可以是根据统计学原理从多个视角中按照预设标准选择目标视角,或者可以是根据某一方式对多个视角的数据进行计算,得到目标视角。
一种可能的实现方式中,可以直接向客户端推送目标视角对应的的媒体数据,或者可以向分发服务器推送目标视角对应的媒体数据,或者可以是在接收到客户端对该全向媒体数据的获取请求时,向客户端反馈目标视角对应的媒体数据。
本发明第四方面实施例公开了一种媒体信息的处理方法,所述方法包括:
接收多个客户端发送的用户视角的信息,所述用户视角信息用于指示用户观看全向媒体数据时的视角;根据全部的用户视角的信息确定目标视角;根据所述目标视角生成媒体数据的元数据信息。
根据本发明实施例的媒体信息的处理方法,可以对多位用户观看同一视频的视角进行统计分析,从而为后续用户观看该视频时提供有效的视角推荐方式,增强了用户体验。
一种可能的实现方式中,该方法由服务器侧执行,例如由内容准备服务器,内容分发网络(Content distribution network,CDN)或者代理服务器。
一种可能的实现方式中,客户端发送的用户视角的信息可以是通过单独的文件发送的,也可以包括在客户端发送的其它数据文件中。
一种可能的实现方式中,全向媒体和视角的说明和示例可以参考前面的第一方面实施例和具体实施方式部分的示例,在此不再赘述。
一种可能的实现方式中,根据全部的用户视角的信息确定目标视角,可以是根据统计学原理从多个视角中按照预设标准选择目标视角,或者可以是根据某一方式对多个视角的数据进行计算,得到目标视角。
本发明第五方面实施例公开了一种媒体信息的处理装置,所述装置包括:
接收器,所述接收器用于接收多个客户端发送的用户视角的信息,所述用户视角信息用于指示用户观看全向媒体数据时的视角;处理器,所述处理器用于根据全部的用户视角的信息确定目标视角;发送器,所述发送器用于发送所述目标视角对应的媒体数据。
本发明第六方面实施例公开了一种媒体信息的处理装置,所述装置包括:
接收器,所述接收器用于接收多个客户端发送的用户视角的信息,所述用户视角信息用于指示用户观看全向媒体数据时的视角;处理器,所述处理器用于根据全部的用户视角的信息确定目标视角,以及根据所述目标视角生成媒体数据的元数据信息。
本发明第五方面,第六方面装置实施例的具体示例和实现方式可以参考第三方面、第四方面方法实施例中的相关举例,在此不再赘述。
本发明第七方面实施例公开了一种媒体信息的处理装置,所述装置包括:一个或多个处理器、存储器。该存储器与一个或多个处理器耦合;存储器用于存储计算机程序代码,计算机程序代码包括指令,当一个或多个处理器执行指令时,处理装置执行如上述第一方面,或第三方面,或第四方面,或上述各方面的任意一种可能的实现方式所述的媒体信息的处理方法。
本发明第八方面实施例公开了一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,其特征在于,当所述指令在设备上运行时,使得设备执行如上述第一方面,或第三方面,或第四方面,或上述各方面的任意一种可能的实现方式所述的媒体信息的处理方法。
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1是本发明实施例的全向视频中的视角变换的示例图。
图2是本发明实施例的将全向视频对应的空间划分为空间对象的示例图。
图3是本发明实施例的空间对象在全向视频对应的空间中的相对位置的示意图。
图4是本发明实施例的描述空间对象的坐标系的一种示例。
图5是本发明实施例的描述空间对象的坐标系的另一种示例。
图6是本发明实施例的描述空间对象的坐标系的另一种示例。
图7是本发明实施例的方法和装置的应用的一种场景示例。
图8是本发明实施例的一种媒体信息的处理方法的流程示意图。
图9是本发明实施例的一种媒体信息的处理装置的结构示意图。
图10是本发明实施例的一种媒体信息的处理装置的具体硬件的示意图。
图11是本发明实施例的一种空间对象和视频数据之间的映射关系的示意图。
具体实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述。
在本发明的一些实施例中,
轨迹(track)是指一系列有时间属性的按照ISO基本媒体文件格式(ISO base media file format,ISOBMFF)的封装方式的样本。比如视频track,视频样本是通过将视频编码器编码每一帧后产生的码流按照ISOBMFF的规范封装后得到的。
Track在标准ISO/IEC 14496-12中的定义为“timed sequence of related samples(q.v.)in an ISO base media file
NOTE:For media data,a track corresponds to a sequence of images or sampled audio;for hint tracks,a track corresponds to a streaming channel.”
中文翻译“ISO媒体文件中相关样本的时间属性序列,
注:对于媒体数据,一个track就是个图像或者音频样本序列;对于提示轨迹,一个轨迹对应一个流频道。
ISOBMFF文件是由多个盒子(box)构成,其中,一个box可以包括其它的box。
box在ISO/IEC 14496-12标准中的定义:“object-or iented building block defined by a unique type identifier and length
NOTE:Called‘atom’in some specifications,including the first definition of MP4.”。
中文翻译:“面向对象的构建块,由唯一的类型标识符和长度定义;注意在某些规范中称为“原子”,包括MP4的第一个定义。”
辅助增强信息(supplementary enhancement information,SEI)是国际通信联盟(International Telecommunication Union,ITU)发布的视频编解码标准h.264,h.265中定义的一种网络接入单元(Network Abstract Layer Unit,NALU)的类型。
媒体展示描述(Media presentation description,MPD)是标准ISO/IEC 23009-1中规定的一种文档,在该文档中包括了客户端构造HTTP-URL的元数据。在MPD中包括一个或者多个period(周期)元素,每个period元素包括有一个或者多个自适应集(adaptationset),每个adaptationset中包括一个或者多个表示(representation),每个representation中包括一个或者多个分段,客户端根据MPD中的信息,选择表达,并构建分段的http-URL。
当前随着360度视频等VR视频的观看应用的日益普及,越来越多的用户加入到大视角的VR视频观看的体验队伍中。这种新的视频观看应用给用户带来了新的视频观看模式和视觉体验的同时,也带来了新的技术挑战。由于360度(本发明实施例将以360度为例进行说明)等大视角的视频观看过程中,VR视频的空间区域(空间区域也可以叫做空间对象)为360度的全景空间(或称全方位空间,或称全景空间对象),超过了人眼正常的视觉范围,因此,用户在观看视频的过程中随时都会变换观看的角度(即视角,FOV)。用户观看的视角不同,看到的视频图像也将不同,故此视频呈现的内容需要随着用户的视角变化而变化。如图1,图1是视角变化对应的视角示意图。框1和框2分别为用户的两个不同的视角。用户在观看视频的过程中,可通过眼部或者头部转动,或者视频观看设备的画面切换等操作,将视频观看的视角由框1切换到框2。其中,用户的视角为框1时所观看的视频图像为该视角对应的一个或者多个空间对象在该时刻所呈现的视频图像。下一个时刻用户的视角切换为框2,此时用户观看到的视频图像也应该切换为框2对应的空间对象在该时刻所呈现视频图像。
在一些可行的实施方式中,对于360度大视角的视频图像的输出,服务器可将全向视频对应的视角范围内的全景空间(或者称为全景空间对象)进行划分以得到多个空间对象,每个空间对象可以对应用户的一个子视角,多个子视角的拼接形成一个完整的人眼观察视角,每个空间对象对应全景空间的一个子区域。即人眼视角(下面简称视角)可对应一个或者多个划分得到的空间对象,视角对应的空间对象是人眼视角范围内的内容对象所对应的所有的空间对象。其中,人眼观察视角可以动态变化的,但是通常视角范围可为120度*120度,120度*120度的人眼视角范围内的内容对象对应的空间对象可包括一个或者多个划分得到的空间对象,例如上述图1该的框1对应的视角1,框2对应的视角2。进一步的,客户端可通过MPD获取服务器为每个空间对象准备的视频码流的空间信息,进而可根据视角的需求向服务器请求某一时间段某个或者多个空间对象对应的视频码流分段并按照视角需求输出对应的空间对象。客户端在同一个时间段内输出360度的视角范围内的所有空间对象对应的视频码流分段,则可在整个360度的全景空间内输出显示该时间段内的完整视频图像。
具体实现中,在360度的空间对象的划分中,服务器可首先将球面映射为平面,在平面上对空间对象进行划分。具体的,服务器可采用经纬度的映射方式将球面映射为经纬平面图。 如图2,图2是本发明实施例提供的空间对象的示意图。服务器可将球面映射为经纬平面图,并将经纬平面图划分为A~I等多个空间对象。进一步的,服务器可也将球面映射为立方体,再将立方体的多个面进行展开得到平面图,或者将球面映射为其他多面体,在将多面体的多个面进行展开得到平面图等。服务器还可采用更多的映射方式将球面映射为平面,具体可根据实际应用场景需求确定,在此不做限制。下面将以经纬度的映射方式,结合图2进行说明。如图2,服务器可将球面的全景空间划分为A~I等多个空间对象之后,则可为每个空间对象准备一组视频码流。其中,每个空间对象对应的一组视频码流。客户端用户切换视频观看的视角时,客户端则可根据用户选择的新视角获取新空间对象对应的码流,进而可将新空间对象码流的视频内容呈现在新视角内。
视频的制作者(以下简称作者)制作视频时,可根据视频的故事情节需求为视频播放设计一条主要情节路线。视频播放过程中,用户只需要观看该主要情节路线对应的视频图像则可了解到该故事情节,其他视频图像可看可不看。由此可知,视频播放过程中,客户端可选择性的播放该故事情节对应的视频图像,其他的视频图像可以不呈现,可节省视频数据的传输资源和存储空间资源,提高视频数据的处理效率。作者设计故事的主要情节之后,可根据上述主要情节路线设定视频播放时每个播放时刻所要呈现给用户的视频图像,将每个播放时刻的视频图像按照时序串起来则可得到上述主要情节路线的故事情节。其中,上述每个播放时刻所要呈现给用户的视频图像为在每个播放时刻对应的空间对象上呈现的视频图像,即该空间对象在该时间段所要呈现的视频图像。具体实现中,上述每个播放时刻所要呈现的视频图像对应的视角可设为作者视角,呈现作者视角上的视频图像的空间对象可设为作者空间对象。作者视角对象对应的码流可设为作者视角码流。作者视角码流中包括多个视频帧的视频帧数据(多个视频帧的编码数据),每个视频帧呈现时可为一个图像,即作者视角码流中对应多个图像。在视频播放过程中,在每个播放时刻,作者视角上呈现的图像仅是整个视频所要呈现的全景图像(或称VR图像或者全方位图像)中的一部分。在不同的播放时刻,作者视角对应的图像所关联的空间对象的空间信息可以不同,也可以相同。
作者设计了每个播放时刻的作者视角之后,可以将视角对应的区域信息封装为元数据轨迹,客户端接收到该元数据轨迹后,客户端可以向服务器请求和元数据轨迹中所携带的区域相对应的视频码流进行解码,之后,则可呈现作者视角对应的故事情节画面给用户。服务器无需传输作者视角以外其他视角(设为非作者视角,即静态视角码流)的码流给客户端,可节省视频数据的传输带宽等资源。
由于作者视角是作者根据视频故事情节设定的呈现预设空间对象的图像,不同的播放时刻上的作者空间对象可不同也可相同,由此可知作者视角是一个随着播放时刻不断变化的视角,作者空间对象是个不断变化位置的动态空间对象,即每个播放时刻对应的作者空间对象在全景空间中的位置不尽相同。上述图2所示的各个空间对象是按照预设规则划分的空间对象,是在全景空间中的相对位置固定的空间对象,任一播放时刻对应的作者空间对象不一定是图2所示的固定空间对象中的某一个,而且在全局空间中相对位置不断变化的空间对象。
在空间信息一种可能的实现方式中,该空间信息可以包括该空间对象的中心点的位置信息或者该空间对象的左上点的位置信息,该空间信息还可以包括该空间对象的宽和该空间对象的高。
其中,在空间信息对应的坐标系为角度坐标系时,空间信息可以采用偏航角来描述,在空间信息对应的坐标系为像素坐标系时,空间信息可以采用经纬图的空间位置描述,或者采用其他几何立体图形来描述,在此不做限制。采用偏航角方式描述,如俯仰角θ(pitch)、 偏航角ψ(yaw)、滚转角Φ(roll),用于表示角度范围的宽和用于表示角度范围的高。如图3,图3是空间对象的中心点在全景空间中的相对位置的示意图。在图3中,O点为360度VR全景视频球面图像对应的球心,可认为是观看VR全景图像时人眼的位置。A点为目标空间对象的中心点,C、F为目标空间对象中过A点的沿该目标空间对象横向坐标轴的边界点,E、D为目标空间对象中过A点的沿该目标空间对象纵向坐标轴的边界点,B为A点沿球面经线在赤道线的投影点,I为赤道线上水平方向的起始坐标点。各个元素的含义解释如下:
俯仰角:目标空间对象的图像的中心位置映射到全景球面(即全局空间)图像上的点的竖直方向的偏转角,如图3中的∠AOB;
偏航角:目标空间对象的图像的中心位置映射到全景球面图像上的点的水平方向的偏转角,如图3中的∠IOB;
滚转角:偏航角空间对象的图像的中心位置映射到全景球面图像上的点与球心连线方向的旋转角,如图3中的∠DOB;
用于表示角度范围的高(在角度坐标系中的目标空间对象的高):空间对象的图像在全景球面图像的视场高度,以视场纵向最大角度表示,如图3中∠DOE;用于表示角度范围的宽(在角度坐标系中的目标空间对象的宽):目标空间对象的图像在全景球面图像的视场宽度,以视场横向最大角度表示,如图3中∠COF。
在空间信息另一种可能的实现方式中,空间信息可以包括空间对象的左上点的位置信息,和空间对象的右下点的位置信息。
在空间信息另一种可能的实现方式中,在空间对象不是矩形时,空间信息可以包括空间对象的形状类型、半径、周长中至少一种。
在一些实施例中,空间信息可以包括空间对象的空间旋转信息。
在一些实施例中,空间信息可以封装在空间信息数据或者空间信息轨迹(track)中,该空间信息数据可以为视频数据的码流、视频数据的元数据或者独立于视频数据的文件,空间信息轨迹可以为独立于视频数据的轨迹。
在一些实施例中,空间信息可以封装在视频的空间信息元数据中(track matedata),比如封装在同一个box中,例如,covi box中。
在一些实施例中,用于描述目标空间对象的宽高的坐标系如图4所示,球面的阴影部分是目标空间对象,目标空间对象的四个角的顶点分别是B,E,G,I;在图4中,O为360度VR全景视频球面图像对应的球心,顶点BEGI分别为过球心的圆(该圆以球心O为圆心,并且该圆的半径为360度VR全景视频球面图像对应的球体的半径,该圆过z轴,该圆的数量为两个,一个经过点BAIO,一个经过点EFGO),和平行于坐标轴x轴和y轴的圆(该圆不以球心O为圆心,该圆的数量为两个,且两个圆互相平行,一个经过点BDE,一个经过点IHG)在球面上的交点,C为目标空间对象的中心点,DH边对应的角度表示为目标空间对象的高度,AF边对应的角度表示为目标空间对象的宽度,DH边和AF边过C点,其中BI边、EG边和DH边对应的角度相同;BE边、IG边和AF边对应的角度相同;BE边对应的角的顶点是J,J是上述圆中BDE所在圆和z轴的交点,相应的,IG边对应的角的顶点为上述圆中IHG所在的圆和z轴的交点,AF边对应的角的顶点为O点,BI边、EG边和DH边对应的角的顶点也为O点。
需要说明的是,以上只是一种示例,目标空间对象也可以是过球心的两个大圆环和两个平行的圆环相交获得;或者是两个偏航角圆环和两个俯仰角圆环相交获得,偏航角圆环就是圆环上的点的偏航角都相同,俯仰角圆环就是圆环上的点的俯仰角都相同;或者是两个经度圈和两个纬度圈相交获得
在一些实施例中,用于描述目标空间对象的宽高的坐标系如图5所示,球面的阴影部分是目标空间对象,目标空间对象的四个角的顶点分别是B,E,G,I;在图5中,O为360度VR全景视频球面图像对应的球心,顶点BEGI分别为过z轴的圆(该圆以球心O为圆心,并且该圆的半径为360度VR全景视频球面图像对应的球体的半径,该圆的数量为两个,一个经过点BAI,一个经过点EFG),和过y轴的圆(该圆以球心O为圆心,并且该圆的半径为360度VR全景视频球面图像对应的球体的半径,该圆的数量为两个,一个经过点BDE,一个经过点IHG)在球面上的交点,C为目标空间对象的中心点,DH边对应的角度表示为目标空间对象的高度,AF边对应的角度表示为目标空间对象的宽度,DH边和AF边过C点,其中BI边、EG边和DH边对应的角度相同;BE边、IG边和AF边对应的角度相同;BE边对应的角的顶点为J点,J点为过BE两点并与x轴和y轴平行的圆与z轴的交点,IG边对应的角的顶点为过IG两点并与x轴和y轴平行的圆与z轴的交点,AF边对应的角的顶点为O点,BI边对应的角的顶点为L点,L点为过BI两点并与z轴和x轴平行的圆与y轴的交点,EG边对应的角的顶点为过EG两点并与z轴和x轴平行的圆与y轴的交点,DH边对应的角的顶点也为O点。
需要说明的是,以上只是一种示例,目标空间对象也可以是过x轴的两个圆和过z轴的两个圆相交获得,目标空间对象也可以是过x轴的两个圆和过y轴的两个圆相交获得,或者是4个过球心的圆环相交获得。
在一些实施例中,用于描述目标空间对象的宽高的坐标系如图6所示,球面的阴影部分是目标空间对象,目标空间对象的四个角的顶点分别是B,E,G,I;在图6中,O为360度VR全景视频球面图像对应的球心,顶点BEGI分别为平行于坐标轴x轴和z轴的圆(该圆不以球心O为圆心,该圆的数量为两个,且两个圆互相平行,,该圆的数量为两个,一个经过点BAI,一个经过点EFG),和平行于坐标轴x轴和y轴的圆(该圆不以球心O为圆心,该圆的数量为两个,且两个圆互相平行,一个经过点BDE,一个经过点IHG)在球面上的交点,C为目标空间对象的中心点,DH边对应的角度表示为目标空间对象的高度,AF边对应的角度表示为目标空间对象的宽度,DH边和AF边过C点,其中BI边、EG边和DH边对应的角度相同;BE边、IG边和AF边对应的角度相同;BE边、IG边和AF边对应的角的顶点为O点,BI边、EG边和DH边对应的角的顶点也为O点。
需要说明的是,以上只是一种示例,目标空间对象也可以是平行于y轴和z轴的且不过球心两个圆和平行于y轴和x轴的且不过球心两个圆相交获得,目标空间对象也可以是平行于y轴和z轴的且不过球心两个圆和平行于z轴和x轴的且不过球心两个圆相交获得。
在图5中的J点和L点和图4中的J点获取方式相同,BE边对应的角的顶点是J点,BI边对应的角的顶点是L点;在图6中,BE边和BI边对应的顶点都是O点。
图11是本发明实施例的一种空间对象和视频数据的映射关系的示意图。图11(a)示出了一个全向视频(左边大图)和全向视频的子区域(右边小图),图11(b)示出了全向视频对应的视频空间(球面)和全向视频的子区域对应的空间对象(球面上的深色部分)。
在现有的OMAF标准中规定了在球面上的区域(region)的具有时间属性的元数据轨迹(timed metadata track),在该元数据轨迹中元数据的box中包括的是描述球面region的元数据,在媒体数据box中包括的是球面区域的信息,在元数据的box中描述了有时间属性的元数据轨迹的意图,也就是球面区域是用来做什么的,在标准中描述了两种有时间属性的元数据轨迹——推荐视角元数据轨迹(The recommended viewport timed metadata track)和初始视点轨迹(the initial viewpoint timed metadata track)。推荐视角轨迹描述了推 荐给终端呈现的视角的区域,初始视点轨迹描述了全向视频观看时的初始呈现方向。
下面结合图7描述本发明的实施例应用的一种场景。
如图7所示,服务器侧701包括内容准备7011和内容服务7012。
内容准备7011可以是媒体数据采集设备或媒体数据的转码器,负责流媒体的媒体内容以及相关的元数据等信息的生成,比如媒体文件(视频,音频等)的压缩,封装和存储/发送。内容准备7011可以生成元数据信息,以及元数据来源所在的文件。元数据可以封装为元数据轨迹,元数据也可以封装在视频数据轨迹的SEI中。元数据轨迹中的样本(sample)是内容生成者指定的全向视频中的部分区域或者是由内容制作者指定的全向视频中的部分区域,元数据来源封装在元数据轨迹中或者携带在MPD中。如果元数据封装在SEI中,则元数据的来源信息可以携带在SEI中。在一种实施方式中,元数据的来源信息可以表示元数据指示的是内容的制作者或者导演推荐的观看区域。
内容服务7012可以是网络节点,比如内容分发网络(Content distribution network,CDN)或者代理服务器。内容服务7012可以从内容准备7011获取存储或者发送的数据,将数据转发到终端侧702;或者是从终端侧702获取终端反馈的区域信息,根据反馈信息生成区域元数据轨迹或者区域SEI信息,并生成携带区域信息来源的文件。生成区域元数据轨迹或者区域SEI信息可以是统计全向视频的各区域反馈的观看信息,根据统计的选择区域观看数量最多的一个或者多个区域生成用户感兴趣区域的样本,将该样本封装在元数据轨迹或者封装在SEI中,并封装区域元数据来源信息到该轨迹中或者携带在MPD中,或者将区域元数据的来源信息携带在SEI中。该信息来源表示区域元数据信息来自于服务器统计,表示元数据轨迹中描述的区域是大部分用户感兴趣的区域。区域元数据轨迹或者区域SEI中的区域信息也可以是服务器指定的某个用户反馈的区域信息,根据反馈信息生成区域元数据轨迹或者区域SEI,并将区域元数据的来源信息携带在区域元数据轨迹或者携带在MPD中或者SEI中,该区域信息来源描述了区域元数据来自于某个用户。
可以理解的是,内容准备7011和内容服务7012可以在同一个服务器硬件设备上,也可以是不同的硬件设备。内容准备7011和内容服务7012都可以包括一个或多个硬件设备。
终端侧702获得媒体数据,呈现媒体数据,同时终端侧702获得用户呈现的内容在全向视频中的区域信息,终端侧702将区域信息反馈到内容服务侧701;或者终端侧702获得媒体数据,元数据和携带有元数据来源信息的数据,终端侧702解析元数据来源信息,根据终端用户选定的元数据来源,解析对应的元数据,获得区域信息进行媒体呈现。
在一种可能的实现方式中,有关元数据轨迹的来源信息的模块处理方式如下;
获取元数据的来源信息,该来源信息可以指示该元数据相关的区域是内容的制作者推荐或者导演推荐的或者可以是某个指定的用户推荐的,或者还可以是根据相关统计得出的用户感兴趣的区域;来源信息还可以指示内容制作者或者导演推荐给用户的观看全向视频的视角,或者服务器推荐的最感兴趣区域或者某个用户推荐的视角。
获取区域的信息,这里主要指的是区域的一些元数据,该区域的信息可以指示内容的制作者或者导演推荐指定的区域,也可以是统计用户反馈信息获得的用户最感兴趣区域,或者是终端用户观看全向视频的区域。区域可以是二维平面的区域也可以是球面区域,二维平面区域信息是由区域在二维平面中的左上像素点在二维平面中的坐标位置和区域的宽高表示的。如果区域是在球面上,区域的信息是由区域的中心点在球面上的位置和区域在球面上的宽高覆盖角度表示的,可以参见前述在球面上指示区域的相关示例,例如图1-图6中示出的方式。在一种实现方式中,区域也可以是球面上的一个方向或者是球面上的一个点,在这种 情况下,区域的表示没有宽高信息。
将相关的元数据和元数据的来源封装在元数据轨迹中,产生元数据轨迹文件;或者将元数据封装在元数据轨迹中,产生元数据轨迹文件,将元数据的来源携带在MPD中,产生MPD文件;或者将元数据和元数据的来源封装在SEI中,产生码流文件。该模块产生的文件可以存储在本地,也可以发送的接收端,接收端可以是终端侧,也可以是内容服务侧。
有关元数据轨迹的来源信息的模块可以是图7的内容准备7011,内容服务7012,终端侧702中单独的一个子模块,也可以是将相关的功能集成在上述设备中。
本发明的实施例的技术方案主要落地在内容准备侧(转码器),智能网络节点(CND,代理服务器),终端播放器侧。
在转码服务器,网络服务器以及终端产生区域元数据的时候,将元数据封装为独立的轨迹或者封装在SEI中,同时将元数据的来源封装在元数据轨迹或者SEI中或者MPD文件中。
如图8所示,本发明一方面的实施例公开了一种媒体信息的处理方法S80,方法S80包括:
S801:得到媒体数据的元数据信息,所述元数据信息包括元数据的来源信息,所述来源信息用于表示所述媒体数据的推荐方,所述媒体数据是全向媒体数据;
S802:根据所述元数据的来源信息处理所述媒体数据。
如图9所示,本发明一方面的实施例公开了一种媒体信息的处理装置90,装置90包括:信息获取模块901和处理模块902。信息获取模块901用于得到媒体数据的元数据信息,所述元数据信息包括元数据的来源信息,所述来源信息用于表示所述媒体数据的推荐方,所述媒体数据是全向媒体数据;处理模块902用于根据所述元数据的来源信息处理所述媒体数据。
在本发明实施例的一种实现方式中,元数据的来源信息携带在元数据轨迹中。
在元数据轨迹中,新增一个元数据轨迹中的样本数据来源的描述box,在该box中描述轨迹来源。在本实施例中新增的box格式如下,
SourceInformationBox extends Box(‘sinf’){
Unsigned int(8)source_type;//指示元数据来源:导演预设/预先统计/受欢迎的个人}
在本样例中source_type描述了上述box所在的轨迹的来源信息。当source_type=0时,表示轨迹中的区域信息是视频的制作者的推荐的,或者表示来源于内容制作者或者导演,比如导演推荐的视角;终端侧可以使用该轨迹中的信息将导演要呈现给用户的媒体内容呈现给用户;当source_type=1时,表示轨迹中的区域信息是大部分用户的感兴趣区域,或者表示来源于统计到的最感兴趣区域,终端侧可以使用该轨迹中的信息将全向媒体中的大部分用户感兴趣的区域呈现给用户。当source_type=2时,表示轨迹中的区域信息是某个终端用户观看全向媒体的区域,或者表示来源于某个特定的人,终端侧可以重现某个用户观看全向媒体的视角。
可以理解的是,上述type只是为了帮助理解本发明实施例而做出的一种示例,而不是一种具体限制。type的值可以取其它的数值,或者用以表示其它的来源类型。
在终端侧获得上述元数据轨迹的信息的处理流程如下:
1、终端获取到元数据轨迹,解析元数据轨迹中的元数据box(moov box),再解析该box得到sinf box;
2、解析sinf box,获得Source-type信息,如果source_type=0,该轨迹中的区域信 息是视频的制作者的推荐的;如果source_type=1时,轨迹中的区域信息是大部分用户的感兴趣区域;如果source_type=2时,轨迹中的区域信息是某个终端用户观看全向媒体的区域。假设终端获取的元数据中的source_type=0。
3、将信息来源呈现给用户,接受用户的选择。
4、如果用户选择观看视频制作者或者是导演推荐的视角,那么解析元数据轨迹中的样本,获得区域信息,将全向媒体中与获得的区域对应的媒体呈现给用户。
在元数据轨迹中携带元数据来源的信息,该来源信息描述了元数据来源于全向视频的制作者,或者是观看全向视频的用户,或者是统计获得的感兴趣视角的数据;或者该信息描述的元数据意图是元数据来源于全向视频的制作者的推荐,或者是观看全向视频的用户的推荐,或者是统计观看视角的数据的推荐。客户端在接收到区域元数据时可以区分不同来源的元数据,在有多个区域元数据的情况下,用户可以按照个人需求选择要观看的推荐区域。
在本发明的一种实现方式中,元数据的来源信息携带在MPD中。
在ISO/IEC 23009-1中规定的标准元素SupplementalProperty/EssentialProperty中增加源信息描述子,该描述子的scheme为="urn:mpeg:dash:purpose",表示该描述子给出了MPD中的表示中的是信息来源,该描述子的value的取值定义如下表:
Figure PCTCN2018078540-appb-000001
上述的描述子可以在MPD的adaptationSet元素中,也可以在MPD的representation的元素中,在下面的具体样例中,该描述子在representation的元素中
Figure PCTCN2018078540-appb-000002
Figure PCTCN2018078540-appb-000003
在本样例中,除了采用描述子的方式来描述表示的来源信息,也可以在adaptationSet元素或者representation的元素中增加一个描述表示的来源的属性,比如属性为sourceType。当sourceType=0时,表示轨迹中的区域信息是视频的制作者的推荐的,或者表示来源于内容制作者或者导演,比如导演推荐的视角;终端侧可以使用该轨迹中的信息将导演要呈现给用户的媒体内容呈现给用户;当sourceType=1时,表示轨迹中的区域信息是大部分用户的感兴趣区域,或者表示来源于统计到的最感兴趣区域,终端侧可以使用该轨迹中的信息将全向媒体中的大部分用户感兴趣的区域呈现给用户。当sourceType=2时,表示轨迹中的区域信息是某个终端用户观看全向媒体的区域,或者表示来源于某个特定的人,终端侧可以重现某个用户观看全向媒体的视角。。
MPD的样例如下:
Figure PCTCN2018078540-appb-000004
在上述的两个MPD样例中,分别采用描述子和属性来描述表示中描述的文件metadata.mp4文件中的区域信息是视频的制作者的推荐的。
在终端侧获得上述样例信息的处理流程如下:
1、终端获取到MPD文件,解析MPD文件,如果解析在adaptationSet或者representation元素中包括scheme为urn:mpeg:dash:purpose的描述子,解析该描述子的value。
2、如果value=0,该表示中的区域信息是视频的制作者的推荐的;如果value=1时,表示中的区域信息是大部分用户的感兴趣区域;如果value=2时,表示中的区域信息是某个终端用户观看全向媒体的区域。假设终端获取的MPD中的value=0。
3、将信息来源呈现给用户,接受用户的选择。
4、如果用户选择观看视频制作者或者是导演推荐的视角,那么根据MPD中的信息构造representation分段请求,获得分段,解析分段的区域信息,获得区域信息,将全向媒体中与获得的区域对应的媒体呈现给用户。
在本发明的一个实施例中,元数据的来源信息信息携带在SEI中。
样例:
Figure PCTCN2018078540-appb-000005
上述语法中的SRC表示一个具体取值,比如190,这里不作限定,当SEI中的载荷类型为SRC时SEI中的语法如下表中的描述。
Figure PCTCN2018078540-appb-000006
该负载中的source_type描述了上述SEI描述的区域信息的来源信息。当source_type=0时,表示SEI描述的区域信息是视频的制作者的推荐的,或者表示来源于内容制作者或者导演,比如导演推荐的视角;终端侧可以使用该SEI描述的区域信息将导演要呈现给用户的媒体内容呈现给用户;当source_type=1时,表示SEI描述的区域信息是大部分用户的感兴趣区域,或者表示来源于统计到的最感兴趣区域,终端侧可以使用该SEI描述的区域信息将全向媒体中的大部分用户感兴趣的区域呈现给用户。当source_type=2时,表示SEI描述的区域信息某个终端用户观看全向媒体的区域,或者表示来源于某个特定的人,终端侧可以重现某个用户观看全向媒体的视角。
在终端侧获得上述视频码流的处理流程如下:
1、终端获取到视频码流,解析码流中的NALU头信息,如果解析到的头信息类型是SEI类型,解析SEI NALU,获得SEI的载荷类型;
2、如果解析到的载荷类型为190,表示SEI中携带了区域元数据的来源信息;继续解析获得Source-type信息,如果source_type=0,该轨迹中的区域信息是视频的制作者的推荐的;如果source_type=1时,轨迹中的区域信息是大部分用户的感兴趣区域;如果source_type=2 时,轨迹中的区域信息是某个终端用户观看全向媒体的区域。假设终端获取的SEI中的source_type=0;
3、将信息来源呈现给用户,接受用户的选择。
4、如果用户选择观看视频制作者或者是导演推荐的视角,那么解析视频码流中的区域信息,获得区域信息,将全向媒体中与获得的区域对应的媒体呈现给用户。
在本发明的一个实施例中,除了上述实施例所列出的来源信息的类型外,还可以对来源信息的语义进行扩展。
例如:
1、在元数据轨迹中的语法扩展:
Figure PCTCN2018078540-appb-000007
语义:
Language:后续的字符串的语言,该值采用ISO-639-2/T中的语言码字来表示各种语言
sourceDescription:字符串,具体描述区域元数据来源的内容,对来源的描述的描述,比如该值可以是“a director's cut”,表示元数据来源于作者或者是作者推荐;或者推荐人的名称,比如取值“Tom”,表示来源于Tom或者是Tom推荐。
2、在MPD中的扩展:
Figure PCTCN2018078540-appb-000008
3、在SEI中的扩展:(语法的语义和上述语义相同)
Figure PCTCN2018078540-appb-000009
在本发明的一个实施例中,除了上述实施例所列出的来源信息的类型外,还可以对来源信息的语义进行扩展。
例如:
1、在元数据轨迹中的语法扩展:
Figure PCTCN2018078540-appb-000010
Figure PCTCN2018078540-appb-000011
语义:
Data:描述了元数据轨迹产生/推荐的时间,比如Mon,04 Jul 2011 05:50:30 GMT。
2、在MPD中的扩展:
Figure PCTCN2018078540-appb-000012
3、在SEI中的扩展:(语法的语义和上述语义相同)
Figure PCTCN2018078540-appb-000013
在本发明的一个实施例中,除了上述实施例所列出的来源信息的类型外,还可以对来源信息的语义进行扩展。
例如:
1、在元数据轨迹中的语法扩展:
Figure PCTCN2018078540-appb-000014
语义:
reason_description:描述了推荐元数据的理由,或者是对推荐的元数据对应的视频内容的描述信息。
2、在MPD中的扩展:
Figure PCTCN2018078540-appb-000015
3、在SEI中的扩展:(语法的语义和上述语义相同)
Figure PCTCN2018078540-appb-000016
在本发明的一个实施例中,除了上述实施例所列出的来源信息的类型外,还可以对来源信息的语义进行扩展。
例如:
1、在元数据轨迹中的语法扩展:
Figure PCTCN2018078540-appb-000017
语义:
person_description:描述了推荐元数据的用户年龄信息,或者统计到的年龄区间。比如儿童,青年或者老年,或者0-10,10-20等
2、在MPD中的扩展:
Figure PCTCN2018078540-appb-000018
3、在SEI中的扩展:(语法的语义和上述语义相同)
Figure PCTCN2018078540-appb-000019
在本发明的一个实施例中,SourceInformationBox可以包括在scheme information box中。
语法:
Figure PCTCN2018078540-appb-000020
语义:
source_type取整数值,表示元数据的来源类型,不同的取值表示的来源类型如下:
0:来源于内容制作者或者导演
1:来源于统计到的最感兴趣区域
2:来源于某个特定的人
其他的值保留.
date描述了产生/推荐元数据的时间。
ID_lenght描述了ID_description长度,该值是ID_description的长度-1.
ID_description描述了推荐人的名称
reason_lenght描述了reason_description长度,该值是reason_description的长度-1.。
reason_description描述了推荐元数据的理由,或者是对推荐的元数据对应的视频内容的描述信息。
可以理解的是,SourceInformationBox也可以采用其它的名称,例如可以为natrueInformationBox。
在一种可能的实现方式中,natrueInformationBox的示例如下:
Box Type:'ninf'
Container:Scheme Information box(‘schi’)
Mandatory:No
Quantity:Zero or one
natrueInformationBox的语法为:
Figure PCTCN2018078540-appb-000021
其中,
natrue_type取整数值,表示元数据的来源类型,不同的取值表示的来源类型如下:
1:来源于内容制作者或者导演
2:来源于统计到的最感兴趣区域
3:来源于某个特定的人
其他的值保留.
date描述了产生/推荐元数据的时间,date可以是按秒计算的整数时间,或其它形式的时间的表示方式。
ID_lenght描述了ID_description长度,该值是ID_description的长度-1.
ID_description描述了推荐人的名称
reason_lenght描述了reason_description长度,该值是reason_description的长度-1.。
reason_description描述了推荐元数据的理由,或者是对推荐的元数据对应的视频内容的描述信息。
在一个具体示例中,
natrue_type an integer that indicates the type of nature.The following values for natrue_type are specified:
1:The recommended viewport timed metadata track is used for indicating a director's cut
2:The recommended viewport timed metadata track is used for indicating the statistically most-viewed viewport
3:The recommended viewport timed metadata track is used for indicating a particular person or user
Other values of natrue_type are reserved.
date is an integer that declares the recommended time of the metadate(in seconds since midnight,Jan.1,1904,in UTC time)
ID_lenght indicates the length in byte of the ID_description field minus one.
ID_description specifies the name of the recommended person.It is a null-terminated string in UTF-8 characters containing a file group name
reason_lenght indicates the length in byte of the reason_description field minus one.
reason_description specifies the recommended reason or the description of the media content corresponding to the metadata.It is a null-terminated string in UTF-8characters containing a file group name。
在本发明的上述所有实施例中的语法都可以携带在媒体轨迹(media)中,比如可以将SourceInformationBox或者natrueInformationBox携带在media track的tref box中,在ISO/IEC 14496-12的标准中描述了tref是track reference box,该box描述了和当前媒体track相关联的track。SourceInformationBox或者natrueInformationBox可以是tref box的扩展。aligned(8)class SourceInformationBox extends tref('sinf',0,0)。
在本发明的一个实施例中,元数据的意图/来源的信息也可用sample entry type表示,比如大部分用户的感兴趣区域的sample entry type可以是’mroi’,某个用户推荐可以是‘proi’,作者或者导演推荐,可以是‘droi’。
在本发明的一个实施例中,终端侧将可以推荐给用户的元数据轨迹的描述信息呈现给用户,用户根据描述信息,选择要观看的推荐。终端根据用户的选择获取选择对应的元数据轨迹,解析获得的元数据轨迹,得到轨迹中的区域信息,根据区域信息呈现全向媒体。或者终端将用户的选择的推荐的信息反馈到内容服务器侧,内容服务侧根据反馈的用户选择,获取元数据轨迹,解析元数据轨迹信息获得区域信息,按照区域信息,将区域信息对应的媒体数据发送到终端。在终端侧,元数据轨迹中的区域信息还可以用来为用户创建一个可以移动的观看环境,该观看环境按照区域信息中的偏航角,俯仰角和旋转角,对用户的观看环境进行旋转模拟,比如该观看环境可以是一个可以转动的座椅,该座椅可以按照按照区域信息进行左右,俯仰或者旋转的运动。
图10是本发明实施例提供的计算机设备100的硬件结构示意图。如图10所示,计算机设备100可以作为流媒体的信息的处理装置的一种实现方式,也可以作为流媒体的信息的处理方法的一种实现方式,计算机设备100包括处理器101、存储器102、输入/输出接口103和总线105,还可以包括通信接口104。其中,处理器101、存储器102、输入/输出接口103和通信接口104通过总线105实现彼此之间的通信连接。
处理器101可以采用通用的中央处理器(Central Processing Unit,CPU),微处理器,应用专用集成电路(Application Specific Integrated Circuit,ASIC),或者一个或多个集成电路,用于执行相关程序,以实现本发明实施例所提供的流媒体的信息的处理装置中的模块所需执行的功能,或者执行本发明方法实施例对应的流媒体的信息的处理方法。处理器101可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器101中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器101可以是通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现成可编程门阵列(FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本发明实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本发明实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可 以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器102,处理器101读取存储器102中的信息,结合其硬件完成本发明实施例所提供的流媒体的信息的处理装置中包括的模块所需执行的功能,或者执行本发明方法实施例提供的流媒体的信息的处理方法。
存储器102可以是只读存储器(Read Only Memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(Random Access Memory,RAM)。存储器102可以存储操作系统以及其他应用程序。在通过软件或者固件来实现本发明实施例提供的流媒体的信息的处理装置中包括的模块所需执行的功能,或者执行本发明方法实施例提供的流媒体的信息的处理方法时,用于实现本发明实施例提供的技术方案的程序代码保存在存储器102中,并由处理器101来执行流媒体的信息的处理装置中包括的模块所需执行的操作,或者执行本发明方法实施例提供的媒体数据的处理方法。
输入/输出接口103用于接收输入的数据和信息,输出操作结果等数据。
通信接口104使用例如但不限于收发器一类的收发装置,来实现计算机设备100与其他设备或通信网络之间的通信。可以作为处理装置中的获取模块或者发送模块。
总线105可包括在计算机设备100各个部件(例如处理器101、存储器102、输入/输出接口103和通信接口104)之间传送信息的通路。
应注意,尽管图10所示的计算机设备100仅仅示出了处理器101、存储器102、输入/输出接口103、通信接口104以及总线105,但是在具体实现过程中,本领域的技术人员应当明白,计算机设备100还包括实现正常运行所必须的其他器件,例如还可以包括显示器,用于显示要播放的视频数据。同时,根据具体需要,本领域的技术人员应当明白,计算机设备100还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当明白,计算机设备100也可仅仅包括实现本发明实施例所必须的器件,而不必包括图10中所示的全部器件。
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本发明并不受所描述的动作顺序的限制,因为依据本发明,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本发明所必须的。本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机程序来指令相关的硬件来完成,上述的程序可存储于一种计算机可读取存储介质中,该程序在执行时,可包括如上述各方法的实施例的流程。其中,上述的存储介质可为磁碟、光盘、只读存储记忆体(ROM:Read-Only Memory)或随机存储记忆体(RAM:Random Access Memory)等。
尽管在此结合各实施例对本发明进行了描述,然而,在实施所要保护的本发明的过程中,本领域技术人员通过查看所述附图、公开内容、以及所附权利要求书,可理解并实现所述公开实施例的其它变化。在权利要求中,“包括”(comprising)一词不排除其它组成部分或步骤,“一”或“一个”不排除多个的可能性。单个处理器或其它单元可以实现权利要求中列举的若干项功能。互相不同的从属权利要求中记载了某些措施,但这并不代表这些措施不能组合起来产生良好的效果。计算机程序可以存储/分布在合适的介质中,例如:光存储介质或固态介质,与其它硬件一起提供或作为硬件的一部分,也可以采用其它分布形式,如通过Internet或其它有线或无线电信系统。

Claims (16)

  1. A media information processing method, wherein the method comprises:
    obtaining metadata information of media data, wherein the metadata information comprises source information of the metadata, the source information is used to represent a recommender of the media data, and the media data is omnidirectional media data; and
    processing the media data according to the source information of the metadata.
  2. The method according to claim 1, wherein the obtaining metadata information of media data comprises:
    obtaining a metadata track of the media data, wherein the metadata track comprises the source information of the metadata.
  3. The method according to claim 1, wherein the obtaining metadata information of media data comprises:
    obtaining a media presentation description file of the media data, wherein the media presentation description file comprises the source information of the metadata.
  4. The method according to claim 1, wherein the obtaining metadata information of media data comprises:
    obtaining a bitstream comprising the media data, wherein the bitstream further comprises supplementary enhancement information (SEI), and the SEI comprises the source information of the metadata.
  5. The method according to any one of claims 1 to 4, wherein the source information of the metadata is a source type identifier.
  6. The method according to any one of claims 1 to 4, wherein the source information of the metadata comprises a semantic representation of the recommender of the media data.
  7. A media information processing apparatus, wherein the apparatus comprises:
    an information obtaining module, configured to obtain metadata information of media data, wherein the metadata information comprises source information of the metadata, the source information is used to represent a recommender of the media data, and the media data is omnidirectional media data; and
    a processing module, configured to process the media data according to the source information of the metadata.
  8. The apparatus according to claim 7, wherein the information obtaining module is specifically configured to obtain a metadata track of the media data, wherein the metadata track comprises the source information of the metadata.
  9. The apparatus according to claim 7, wherein the information obtaining module is specifically configured to obtain a media presentation description file of the media data, wherein the media presentation description file comprises the source information of the metadata.
  10. The apparatus according to claim 7, wherein the information obtaining module is specifically configured to obtain a bitstream comprising the media data, wherein the bitstream further comprises supplementary enhancement information (SEI), and the SEI comprises the source information of the metadata.
  11. The apparatus according to any one of claims 7 to 10, wherein the source information of the metadata is a source type identifier.
  12. The apparatus according to any one of claims 7 to 10, wherein the source information of the metadata comprises a semantic representation of the recommender of the media data.
  13. A media information processing method, wherein the method comprises:
    receiving user viewport information sent by a plurality of clients, wherein the user viewport information is used to indicate viewports at which users watch omnidirectional media data;
    determining a target viewport according to all the user viewport information; and
    sending media data corresponding to the target viewport.
  14. A media information processing method, wherein the method comprises:
    receiving user viewport information sent by a plurality of clients, wherein the user viewport information is used to indicate viewports at which users watch omnidirectional media data;
    determining a target viewport according to all the user viewport information; and
    generating metadata information of media data according to the target viewport.
  15. A media information processing apparatus, wherein the apparatus comprises:
    a receiver, configured to receive user viewport information sent by a plurality of clients, wherein the user viewport information is used to indicate viewports at which users watch omnidirectional media data;
    a processor, configured to determine a target viewport according to all the user viewport information; and
    a transmitter, configured to send media data corresponding to the target viewport.
  16. A media information processing apparatus, wherein the apparatus comprises:
    a receiver, configured to receive user viewport information sent by a plurality of clients, wherein the user viewport information is used to indicate viewports at which users watch omnidirectional media data; and
    a processor, configured to determine a target viewport according to all the user viewport information, and generate metadata information of media data according to the target viewport.
PCT/CN2018/078540 2017-07-07 2018-03-09 一种媒体信息的处理方法及装置 WO2019007096A1 (zh)

Priority Applications (10)

Application Number Priority Date Filing Date Title
JP2020500115A JP2020526969A (ja) 2017-07-07 2018-03-09 メディア情報処理方法および装置
CA3069031A CA3069031A1 (en) 2017-07-07 2018-03-09 Media information processing method and apparatus
SG11201913532YA SG11201913532YA (en) 2017-07-07 2018-03-09 Media information processing method and apparatus
AU2018297439A AU2018297439A1 (en) 2017-07-07 2018-03-09 Method and apparatus for processing media information
BR112020000093-0A BR112020000093A2 (pt) 2017-07-07 2018-03-09 método e aparelho de processamento de informações de mídia
RU2020104035A RU2020104035A (ru) 2017-07-07 2018-03-09 Способ и устройство обработки мультимедийной информации
KR1020207002474A KR20200020913A (ko) 2017-07-07 2018-03-09 미디어 정보를 처리하는 방법 및 장치
EP18829059.7A EP3637722A4 (en) 2017-07-07 2018-03-09 METHOD AND APPARATUS FOR PROCESSING MULTIMEDIA INFORMATION
PH12020500015A PH12020500015A1 (en) 2017-07-07 2020-01-02 Media information processing method and apparatus
US16/734,682 US20200145716A1 (en) 2017-07-07 2020-01-06 Media information processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710551238.7 2017-07-07
CN201710551238.7A CN109218274A (zh) 2017-07-07 2017-07-07 一种媒体信息的处理方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/734,682 Continuation US20200145716A1 (en) 2017-07-07 2020-01-06 Media information processing method and apparatus

Publications (1)

Publication Number Publication Date
WO2019007096A1 true WO2019007096A1 (zh) 2019-01-10

Family

ID=64950588

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/078540 WO2019007096A1 (zh) 2017-07-07 2018-03-09 一种媒体信息的处理方法及装置

Country Status (12)

Country Link
US (1) US20200145716A1 (zh)
EP (1) EP3637722A4 (zh)
JP (1) JP2020526969A (zh)
KR (1) KR20200020913A (zh)
CN (1) CN109218274A (zh)
AU (1) AU2018297439A1 (zh)
BR (1) BR112020000093A2 (zh)
CA (1) CA3069031A1 (zh)
PH (1) PH12020500015A1 (zh)
RU (1) RU2020104035A (zh)
SG (1) SG11201913532YA (zh)
WO (1) WO2019007096A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113691883A (zh) * 2019-03-20 2021-11-23 北京小米移动软件有限公司 在vr360应用中传输视点切换能力的方法和装置
JP2022538799A (ja) * 2019-06-25 2022-09-06 北京小米移動軟件有限公司 パノラマメディア再生方法、機器及びコンピュータ読み取り可能な記憶媒体
US11831861B2 (en) * 2019-08-12 2023-11-28 Intel Corporation Methods for viewport-dependent adaptive streaming of point cloud content
CN111770182B (zh) * 2020-06-30 2022-05-31 北京百度网讯科技有限公司 数据推送方法和装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105578199A (zh) * 2016-02-22 2016-05-11 北京佰才邦技术有限公司 虚拟现实全景多媒体处理系统、方法及客户端设备
CN105898254A (zh) * 2016-05-17 2016-08-24 亿唐都科技(北京)有限公司 节省带宽的vr全景视频布局方法、装置及展现方法、系统
CN106331732A (zh) * 2016-09-26 2017-01-11 北京疯景科技有限公司 生成、展现全景内容的方法及装置
CN106341600A (zh) * 2016-09-23 2017-01-18 乐视控股(北京)有限公司 一种全景视频播放处理方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9997199B2 (en) * 2014-12-05 2018-06-12 Warner Bros. Entertainment Inc. Immersive virtual reality production and playback for storytelling content
CN106504196B (zh) * 2016-11-29 2018-06-29 微鲸科技有限公司 一种基于空间球面的全景视频拼接方法及设备
CN106846245B (zh) * 2017-01-17 2019-08-02 北京大学深圳研究生院 基于主视点的全景视频映射方法

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105578199A (zh) * 2016-02-22 2016-05-11 北京佰才邦技术有限公司 虚拟现实全景多媒体处理系统、方法及客户端设备
CN105898254A (zh) * 2016-05-17 2016-08-24 亿唐都科技(北京)有限公司 节省带宽的vr全景视频布局方法、装置及展现方法、系统
CN106341600A (zh) * 2016-09-23 2017-01-18 乐视控股(北京)有限公司 一种全景视频播放处理方法及装置
CN106331732A (zh) * 2016-09-26 2017-01-11 北京疯景科技有限公司 生成、展现全景内容的方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3637722A4 *

Also Published As

Publication number Publication date
RU2020104035A (ru) 2021-08-09
EP3637722A4 (en) 2020-07-15
CA3069031A1 (en) 2019-01-10
PH12020500015A1 (en) 2020-11-09
KR20200020913A (ko) 2020-02-26
CN109218274A (zh) 2019-01-15
US20200145716A1 (en) 2020-05-07
SG11201913532YA (en) 2020-01-30
EP3637722A1 (en) 2020-04-15
AU2018297439A1 (en) 2020-01-30
JP2020526969A (ja) 2020-08-31
BR112020000093A2 (pt) 2020-07-07

Similar Documents

Publication Publication Date Title
CN110121734B (zh) 一种信息的处理方法及装置
JP6735415B2 (ja) オーディオビジュアルコンテンツの観察点および観察向きの制御された選択のための方法および装置
RU2711591C1 (ru) Способ, устройство и компьютерная программа для адаптивной потоковой передачи мультимедийного контента виртуальной реальности
CN108965929B (zh) 一种视频信息的呈现方法、呈现视频信息的客户端和装置
CN109155873B (zh) 改进虚拟现实媒体内容的流传输的方法、装置和计算机程序
US11902350B2 (en) Video processing method and apparatus
TW201924323A (zh) 用於浸入式媒體資料之內容來源描述
CN109218755B (zh) 一种媒体数据的处理方法和装置
US11095936B2 (en) Streaming media transmission method and client applied to virtual reality technology
US20200145716A1 (en) Media information processing method and apparatus
WO2018058773A1 (zh) 一种视频数据的处理方法及装置
US20210250568A1 (en) Video data processing and transmission methods and apparatuses, and video data processing system
US20190230388A1 (en) Method and apparatus for processing video data
TW201909625A (zh) 用於在經由超文本傳輸協定(http)之動態自適應串流(dash)中之魚眼虛擬實境視訊之增強的高階發信號
WO2020107998A1 (zh) 视频数据的处理方法、装置、相关设备及存储介质
WO2018058993A1 (zh) 一种视频数据的处理方法及装置
WO2018120474A1 (zh) 一种信息的处理方法及装置
CN108271084B (zh) 一种信息的处理方法及装置
WO2019195460A1 (en) Associating file format objects and dynamic adaptive streaming over hypertext transfer protocol (dash) objects
WO2023169003A1 (zh) 点云媒体的解码方法、点云媒体的编码方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18829059

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 3069031

Country of ref document: CA

Ref document number: 2020500115

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112020000093

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 20207002474

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2018829059

Country of ref document: EP

Effective date: 20200107

ENP Entry into the national phase

Ref document number: 2018297439

Country of ref document: AU

Date of ref document: 20180309

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 112020000093

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20200103