WO2020062700A1 - Method, client, and server for processing media data - Google Patents

Method, client, and server for processing media data

Info

Publication number
WO2020062700A1
WO2020062700A1 PCT/CN2018/125807 CN2018125807W WO2020062700A1 WO 2020062700 A1 WO2020062700 A1 WO 2020062700A1 CN 2018125807 W CN2018125807 W CN 2018125807W WO 2020062700 A1 WO2020062700 A1 WO 2020062700A1
Authority
WO
WIPO (PCT)
Prior art keywords
overlay
overlay layer
area
trigger
layer
Prior art date
Application number
PCT/CN2018/125807
Other languages
English (en)
French (fr)
Inventor
范宇群
邸佩云
王业奎
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202310199909.3A (publication CN116248947A)
Priority to CN201880098171.9A (publication CN112771878B)
Priority to EP18935304.8A (publication EP3846481A4)
Publication of WO2020062700A1
Priority to US17/214,056 (publication US20210218908A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/272 Means for inserting a foreground image in a background image, i.e. inlay, outlay
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4728 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/172 Processing image signals image signals comprising non-image signal components, e.g. headers or format information
    • H04N13/178 Metadata, e.g. disparity information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/172 Processing image signals image signals comprising non-image signal components, e.g. headers or format information
    • H04N13/183 On-screen display [OSD] information, e.g. subtitles or menus
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/4316 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4722 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting additional data associated with the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842 Selection of displayed objects or displayed text elements

Definitions

  • the present application relates to the technical field of streaming media transmission, and more particularly, to a method, a client, and a server for processing media data.
  • the ISO/IEC 23090-2 standard specification is also called the OMAF (Omnidirectional Media Format) panoramic specification.
  • This specification defines a media application format that enables the presentation of panoramic media in applications. Panoramic media mainly refers to panoramic video (360-degree video) and the associated audio.
  • the OMAF specification first specifies a list of projection methods that can be used to convert spherical video into two-dimensional video, and second specifies how to use the ISO base media file format (ISOBMFF) to store panoramic media and the metadata associated with that media.
  • the ISO base media file format (ISOBMFF) is composed of a series of boxes.
  • a box can also include other boxes.
  • the boxes include a metadata box and a media data box.
  • the metadata box contains metadata, and the media data box contains media data.
  • the metadata box and the media data box can be in the same file or in separate files; if timed metadata is encapsulated in the ISO base media file format, the metadata box contains the metadata that describes the timed metadata, and the media data box contains the timed metadata itself.
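The nested box layout described above can be illustrated with a minimal sketch. It assumes the standard ISOBMFF box header (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the container"). The helper names `iter_boxes` and `make_box` and the toy `sample` bytes are hypothetical; the four-character codes `moov`, `trak`, and `mdat` are ISOBMFF box types that the text above does not name.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    if end is None:
        end = len(data)
    pos = start
    while pos + 8 <= end:
        # Basic header: 32-bit big-endian size, then a four-character type.
        size, raw_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # size == 1 means a 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                raise ValueError(f"truncated box header at offset {pos}")
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # size == 0 means the box runs to the end of its container.
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"malformed box at offset {pos}")
        yield raw_type.decode("latin-1"), pos + header, pos + size
        pos += size


def make_box(box_type: str, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (illustration helper only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("latin-1")) + payload


# A toy file: a metadata box holding one child box, then a media data box.
sample = make_box("moov", make_box("trak", b"")) + make_box("mdat", b"\x00\x01\x02")
top_level = [(box_type, stop - begin) for box_type, begin, stop in iter_boxes(sample)]
```

Calling `iter_boxes` again on the payload range of a container box walks its children, which is how a box that includes other boxes would be traversed.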
  • the basic data structure and carriage method of the overlay image are defined, but the overlay layer can be displayed in only one way, which is not flexible enough.
  • This application provides a method and apparatus for processing media data to provide more flexibility and diversity in overlay display.
  • a method for processing media data may include: obtaining area information associated with an overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer; and displaying the overlay layer when a trigger operation for the area associated with the overlay layer is detected.
  • the area information associated with the overlay layer enables a user to switch between displaying the overlay layer and closing its display.
  • the method may be executed by a client.
  • the overlay layer is a video, an image, or a text for displaying on a background video or a background image.
  • the overlay layer may be an image for a panoramic video image.
  • the method may further include: obtaining a background video or a background image.
  • displaying the overlay layer may include superimposing the overlay layer on the background video or background image, and displaying the superimposed video image.
  • the background video or the background image may be an image used for a panoramic video image.
  • the triggering operation for the area associated with the overlay layer may include a triggering operation for the area associated with the overlay layer of the background video or background image.
  • the method may further include displaying the background video or the background image when a trigger operation is not detected for the area associated with the overlay layer. It should be understood that, because the overlay layer is not displayed in this case, displaying the background video or background image may mean displaying only the background video or background image.
  • the content displayed by the background video or background image may be a target object, and the content of the overlay layer may be text information of the target object.
  • the trigger operation for the area associated with the overlay layer is used to control whether the overlay layer is displayed or its display is closed. It should be understood that closing the display means not displaying the overlay layer.
  • the trigger operation for the area associated with the overlay layer may include: a click operation in the area associated with the overlay layer, or a trigger operation in which the user's line of sight is located in the area associated with the overlay layer.
  • the trigger operation for the area associated with the overlay layer may include: a click operation in the area associated with the overlay layer.
  • the method may further include displaying prompt information that asks whether to display the overlay layer through a click operation in the area associated with the overlay layer.
  • the area information associated with the overlay layer may be located in an overlay control structure.
  • the area information associated with the overlay layer may include location information of the area associated with the overlay layer.
  • the position information of the area associated with the overlay layer may include position information of a center point of the area associated with the overlay layer, or position information of an upper left corner point of the area associated with the overlay layer.
  • the area information associated with the overlay layer may include the width and the height of the area associated with the overlay layer.
  • the area information associated with the overlay layer is planar area information or spherical area information.
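As a minimal sketch of how a client might use planar area information, the class below stores a reference point (either the center point or the upper left corner), a width, and a height, and tests whether a point falls inside the area. The class name `PlanarRegion`, its field names, and the `anchor` strings are hypothetical and are not taken from the application; the spherical case is not covered here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PlanarRegion:
    """Planar area associated with an overlay layer (hypothetical field names)."""
    x: float
    y: float
    width: float
    height: float
    anchor: str = "top_left"  # "top_left" or "center": what (x, y) refers to

    def bounds(self) -> Tuple[float, float, float, float]:
        """Return (left, top, right, bottom) regardless of the anchor convention."""
        if self.anchor == "center":
            left = self.x - self.width / 2
            top = self.y - self.height / 2
        elif self.anchor == "top_left":
            left, top = self.x, self.y
        else:
            raise ValueError(f"unknown anchor: {self.anchor!r}")
        return left, top, left + self.width, top + self.height

    def contains(self, px: float, py: float) -> bool:
        """True if the point lies inside the area (right and bottom edges excluded)."""
        left, top, right, bottom = self.bounds()
        return left <= px < right and top <= py < bottom
```

Normalizing both anchor conventions to one bounding box keeps the hit test identical whichever form of position information is signalled.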
  • the method may further include: acquiring trigger type information.
  • the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information enables the user to use different trigger operations to display the overlay layer or close its display. It should be understood that the trigger type information indicates the type of trigger operation that displays the overlay layer or closes its display.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description (MPD).
  • the trigger type information may be attribute information of an overlay descriptor in the MPD.
  • the method may further include: acquiring a condition trigger identifier, and, when the value of the condition trigger identifier is a first preset value, detecting whether there is a trigger operation for the area associated with the overlay layer.
  • when the value of the condition trigger identifier is the first preset value, it may indicate that displaying the overlay layer or closing its display is controlled by a trigger operation.
  • when the value of the condition trigger identifier is a second preset value, it may indicate that displaying the overlay layer or closing its display is not controlled by a trigger operation.
  • the condition trigger identifier further increases the variety of interactions available for overlay display.
  • the condition trigger identifier may be located in an overlay control structure for user interaction control.
  • the area information associated with the overlay layer is located in a media presentation description (MPD).
  • the area information associated with the overlay layer is attribute information of an overlay descriptor in the MPD.
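The interplay of the condition trigger identifier, the trigger type information, and the associated area can be sketched as follows. This is an illustration under assumptions, not the signalled syntax: `OverlayControl`, `OverlayClient`, and `TriggerType` are hypothetical names, the condition trigger identifier is modelled as a boolean rather than as preset values, and each matching trigger is assumed to flip the overlay between displayed and closed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class TriggerType(Enum):
    CLICK = "click"  # click operation inside the associated area
    GAZE = "gaze"    # user's line of sight located in the associated area


@dataclass
class OverlayControl:
    region: Tuple[float, float, float, float]  # left, top, width, height
    trigger_type: TriggerType                  # trigger type information
    conditional_trigger: bool                  # True stands in for the first preset value


class OverlayClient:
    def __init__(self, control: OverlayControl, visible: bool = False) -> None:
        self.control = control
        self.visible = visible

    def handle_event(self, kind: TriggerType, x: float, y: float) -> bool:
        """Process one input event and return whether the overlay is now displayed."""
        if not self.control.conditional_trigger:
            return self.visible  # display is not controlled by trigger operations
        if kind is not self.control.trigger_type:
            return self.visible  # not the signalled type of trigger operation
        left, top, width, height = self.control.region
        if left <= x < left + width and top <= y < top + height:
            self.visible = not self.visible  # assumed toggle behaviour
        return self.visible
```

Checking the identifier first means that a client skips trigger detection entirely when the overlay is not under trigger control.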
  • a method for processing media data may include: acquiring an overlay layer, a background video or a background image, and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate the area associated with the overlay layer; obtaining an initial state identifier; and, when the value of the initial state identifier indicates that the display of the overlay layer is closed by default, performing the following operations: displaying the background video or the background image; and, when a trigger operation for the area associated with the overlay layer is detected, superimposing the overlay layer on the background video or background image and displaying the superimposed video image.
  • displaying the background video or the background image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
  • the initial state identifier further increases the diversity of the display modes of the overlay layer.
  • the method may be executed by a client.
  • the overlay layer is a video, an image, or a text for displaying on a background video or a background image.
  • the overlay layer may be an image for a panoramic video image.
  • the background video or background image may be an image for a panoramic video image.
  • the content displayed by the background video or background image may be a target object, and the content of the overlay layer may be text information of the target object.
  • the trigger operation for the area associated with the overlay layer may include: a click operation within the area associated with the overlay layer, or a trigger operation in which the user's line of sight is located in the area associated with the overlay layer.
  • the method may further include, when the value of the initial state identifier indicates that the overlay layer is displayed by default, performing the following operations: superimposing the overlay layer on the background video or background image and displaying the superimposed video image; and, when a trigger operation for the area associated with the overlay layer is detected, displaying the background video or background image.
  • closing the display of the overlay layer by default can be understood as the overlay layer being in a closed display state in the initial state; depending on its value, the initial state identifier indicates either that the overlay layer is in a display state in the initial state or that it is in a closed display state in the initial state.
  • the trigger operation for the area associated with the overlay layer is used to control whether the overlay layer is displayed or its display is closed. It should be understood that closing the display means not displaying the overlay layer.
  • the trigger operation for the area associated with the overlay layer may include a click operation within the area associated with the overlay layer, and the method may further include: when the value of the initial state identifier indicates that the display of the overlay layer is closed by default, displaying prompt information that asks whether to display the overlay layer through a click operation in the area associated with the overlay layer.
  • in the second possible implementation manner of the second aspect, the prompt information that asks whether to display the overlay layer through a click operation in the area associated with the overlay layer is displayed only when it is detected that at least a part of the area associated with the overlay layer is within the current user's viewing angle range.
  • the area information associated with the overlay layer and the initial state identifier are located in an overlay control structure.
  • the area information associated with the overlay layer and the initial state identifier are located in a media presentation description (MPD).
  • the area information associated with the overlay layer and the initial state identifier are attribute information of an overlay descriptor in the MPD.
  • the method may further include: acquiring trigger type information.
  • the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information indicates the type of trigger operation that displays the overlay layer or closes its display.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description (MPD).
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the method may further include: obtaining a condition trigger identifier, and, when the value of the condition trigger identifier is a first preset value, detecting whether there is a trigger operation for the area associated with the overlay layer.
  • when the value of the condition trigger identifier is the first preset value, it indicates that displaying the overlay layer or closing its display is controlled by a trigger operation.
  • when the value of the condition trigger identifier is a second preset value, it indicates that displaying the overlay layer or closing its display is not controlled by a trigger operation.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
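The effect of the initial state identifier can be sketched as a small state machine: the identifier fixes whether the overlay starts displayed or closed, and a detected trigger operation then switches between the background alone and the superimposed image. The function name `render_sequence` and the frame labels are hypothetical, and the sketch assumes that every detected trigger flips the state, which the text above does not state explicitly for repeated triggers.

```python
from typing import Iterable, List


def render_sequence(overlay_shown_by_default: bool,
                    trigger_detected: Iterable[bool]) -> List[str]:
    """Return what is displayed at each step, given per-step trigger detection."""
    visible = overlay_shown_by_default  # taken from the initial state identifier
    frames = []
    for triggered in trigger_detected:
        if triggered:
            visible = not visible  # assumed: each trigger flips the display state
        frames.append("background+overlay" if visible else "background")
    return frames


# Overlay closed by default: the background is shown until a trigger is detected.
closed_first = render_sequence(False, [False, True, False])
# Overlay displayed by default: a trigger closes the overlay display.
shown_first = render_sequence(True, [False, True])
```

With the default-closed identifier the first frames show only the background, which matches the behaviour in which the background is displayed while no trigger operation is detected.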
  • a method for processing media data may include: obtaining an overlay layer, a background video or a background image, and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate the area associated with the overlay layer; obtaining an initial state identifier; and, when the value of the initial state identifier indicates that the overlay layer is displayed by default, performing the following operations: superimposing the overlay layer on the background video or background image and displaying the superimposed video image; and, when a trigger operation for the area associated with the overlay layer is detected, displaying the background video or background image.
  • displaying the superimposed video image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
  • the method may be executed by a client.
  • the overlay layer is a video, an image, or a text for displaying on a background video or a background image.
  • the overlay layer may be an image for a panoramic video image.
  • the background video or background image may be an image for a panoramic video image.
  • the content displayed by the background video or background image may be a target object, and the content of the overlay layer may be text information of the target object.
  • the trigger operation for the area associated with the overlay layer may include: a click operation within the area associated with the overlay layer, or a trigger operation in which the user's line of sight is located in the area associated with the overlay layer.
  • displaying the overlay layer by default can be understood as the overlay layer being in a display state in the initial state; depending on its value, the initial state identifier indicates either that the overlay layer is in a display state in the initial state or that it is in a closed display state in the initial state.
  • the trigger operation for the area associated with the overlay layer is used to control whether the overlay layer is displayed or its display is closed. It should be understood that closing the display means not displaying the overlay layer.
  • the trigger operation for the area associated with the overlay layer may include a click operation in the area associated with the overlay layer, and the method may further include: when the value of the initial state identifier indicates that the overlay layer is displayed by default, displaying prompt information that asks whether to close the display of the overlay layer through a click operation in the area associated with the overlay layer.
  • in the second possible implementation manner of the third aspect, the prompt information that asks whether to close the display of the overlay layer through a click operation in the area associated with the overlay layer is displayed only when it is detected that at least a part of the area associated with the overlay layer is within the current user's viewing angle range.
  • the area information associated with the overlay layer and the initial state identifier are located in an overlay control structure.
  • the area information associated with the overlay layer and the initial state identifier are located in a media presentation description (MPD).
  • the area information associated with the overlay layer and the initial state identifier are attribute information of an overlay descriptor in the MPD.
  • the method may further include: acquiring trigger type information.
  • the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information indicates the type of trigger operation that displays the overlay layer or closes its display.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description (MPD).
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the method may further include: acquiring a condition trigger identifier, and, when the value of the condition trigger identifier is a first preset value, detecting whether there is a trigger operation for the area associated with the overlay layer.
  • when the value of the condition trigger identifier is the first preset value, it indicates that displaying the overlay layer or closing its display is controlled by a trigger operation.
  • when the value of the condition trigger identifier is a second preset value, it indicates that displaying the overlay layer or closing its display is not controlled by a trigger operation.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
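The rule that prompt information is shown only when at least part of the associated area lies within the user's current viewing angle range can be sketched for the spherical case. This is a rough approximation under stated assumptions: both the area and the viewport are modelled as azimuth/elevation rectangles in degrees, tilt and exact spherical geometry are ignored, and `SphereRect`, `overlaps`, `prompt_for`, and the prompt strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SphereRect:
    center_azimuth: float    # degrees, wraps around at +/-180
    center_elevation: float  # degrees
    azimuth_range: float     # full width in degrees
    elevation_range: float   # full height in degrees


def azimuth_gap(a: float, b: float) -> float:
    """Smallest absolute difference between two azimuths, in degrees (0 to 180)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def overlaps(region: SphereRect, viewport: SphereRect) -> bool:
    """True if the two azimuth/elevation rectangles share at least some area."""
    az_ok = azimuth_gap(region.center_azimuth, viewport.center_azimuth) <= (
        region.azimuth_range + viewport.azimuth_range) / 2
    el_ok = abs(region.center_elevation - viewport.center_elevation) <= (
        region.elevation_range + viewport.elevation_range) / 2
    return az_ok and el_ok


def prompt_for(region: SphereRect, viewport: SphereRect,
               overlay_visible: bool) -> Optional[str]:
    """Return prompt text only while part of the area is inside the viewport."""
    if not overlaps(region, viewport):
        return None
    return "Click to close the overlay" if overlay_visible else "Click to display the overlay"
```

The wrap-around handling in `azimuth_gap` matters for panoramic content, where an area centered near +180 degrees can be visible from a viewport centered near -180 degrees.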
  • a method for processing media data may include: determining area information associated with an overlay layer, where the area information associated with the overlay layer is used to indicate the area associated with the overlay layer; and sending the area information associated with the overlay layer to the client.
  • the area information associated with the overlay layer may be determined by detecting position information of a target object (for example, the target object may be a person), or by detecting area information input by a user.
  • the method may be executed by a server.
  • the trigger operation for the area associated with the overlay layer is used to control whether the overlay layer is displayed or its display is closed.
  • the trigger operation may include a click operation in the area associated with the overlay layer, or a trigger operation in which the user's line of sight is located in the area associated with the overlay layer.
  • the overlay layer is a video, an image, or text to be displayed on a background video or a background image (for example, on at least a part of its area).
  • the method may further include obtaining an overlay layer, encoding the overlay layer to obtain code stream data of the overlay layer, and sending the code stream data of the overlay layer to the client.
  • the overlay layer may be an image for a panoramic video image.
  • the method may further include obtaining a background video or a background image, encoding the background video or the background image to obtain code stream data of the background video or the background image, and sending the code stream data of the background video or the background image to the client.
  • the background video or the background image may be an image used for a panoramic video image.
  • the content displayed by the background video or background image may be a target object, and the content of the overlay layer may be text information of the target object.
  • the area information associated with the overlay layer is located in an overlay control structure.
  • the area information associated with the overlay layer may include position information of the area associated with the overlay layer.
  • the position information of the area associated with the overlay layer may include position information of a center point of the area associated with the overlay layer, or position information of an upper left corner point of the area associated with the overlay layer.
  • the area information associated with the overlay layer may include the width and the height of the area associated with the overlay layer.
  • the area information associated with the overlay layer is planar area information or spherical area information.
  • the method may further include: sending trigger type information to the client, where the trigger type information indicates the type of trigger operation that displays the overlay layer or closes its display.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description (MPD).
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the method may further include: sending a condition trigger identifier to the client.
  • when the value of the condition trigger identifier is a first preset value, it indicates that displaying the overlay layer or closing its display is controlled by a trigger operation.
  • when the value of the condition trigger identifier is a second preset value, it indicates that displaying the overlay layer or closing its display is not controlled by a trigger operation.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
  • the area information associated with the overlay layer is located in a media presentation description (MPD).
  • the area information associated with the overlay layer is attribute information of an overlay descriptor in the MPD.
  • the method may further include: sending an initial state identifier to the client, where the initial state identifier indicates either that the overlay layer is in a display state in the initial state or that it is in a closed display state in the initial state.
  • the initial state identifier is located in an overlay control structure.
  • the initial state identifier is located in a media presentation description (MPD).
  • the initial state identifier is attribute information of an overlay descriptor in the MPD.
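Carrying the area information, trigger type information, and initial state identifier as attribute information of an overlay descriptor in the MPD could look roughly like the sketch below, which serializes and parses one descriptor with the Python standard library. `SupplementalProperty` and `schemeIdUri` are standard DASH descriptor constructs; the scheme URI `urn:example:overlay:2018` and the attribute names `regionX`, `regionY`, `regionWidth`, `regionHeight`, `triggerType`, and `initialState` are invented for illustration and are not the syntax defined by the application or by any standard.

```python
import xml.etree.ElementTree as ET
from typing import Dict

OVERLAY_SCHEME = "urn:example:overlay:2018"  # hypothetical scheme identifier
REGION_KEYS = ("regionX", "regionY", "regionWidth", "regionHeight")  # hypothetical


def build_overlay_descriptor(x: int, y: int, width: int, height: int,
                             trigger_type: str, shown_by_default: bool) -> str:
    """Server side: serialize the overlay signalling as descriptor attributes."""
    desc = ET.Element("SupplementalProperty", {
        "schemeIdUri": OVERLAY_SCHEME,
        "regionX": str(x),
        "regionY": str(y),
        "regionWidth": str(width),
        "regionHeight": str(height),
        "triggerType": trigger_type,
        "initialState": "1" if shown_by_default else "0",
    })
    return ET.tostring(desc, encoding="unicode")


def parse_overlay_descriptor(xml_text: str) -> Dict[str, object]:
    """Client side: recover the signalling from the descriptor attributes."""
    desc = ET.fromstring(xml_text)
    if desc.get("schemeIdUri") != OVERLAY_SCHEME:
        raise ValueError("not an overlay descriptor")
    return {
        "region": tuple(int(desc.attrib[key]) for key in REGION_KEYS),
        "trigger_type": desc.attrib["triggerType"],
        "shown_by_default": desc.attrib["initialState"] == "1",
    }
```

Signalling in the MPD lets a client learn the interaction behaviour before it downloads any media segment, whereas signalling in the overlay control structure travels with the file itself.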
  • a client configured to include: an obtaining module, configured to obtain area information associated with an overlay layer and the overlay layer, and the area information associated with the overlay layer is used to indicate the overlay layer Associated area; a display module, configured to display the overlay layer when a trigger operation is detected for the area associated with the overlay layer.
  • the triggering operation for the area associated with the overlay layer may include: a click operation in the area associated with the overlay layer, or a user A trigger operation where the line of sight is located in an area associated with the overlay layer.
  • the area information associated with the overlay layer may be located in an overlay control structure.
  • the area information associated with the overlay layer may include location information of the area associated with the overlay layer.
  • the area information associated with the overlay layer may include a width and a height of the area associated with the overlay layer.
  • the area information associated with the overlay layer is planar area information or spherical area information.
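As an illustration of how planar area information (a position plus a width and a height) could be used to decide whether a click falls inside the area associated with the overlay layer, the following is a minimal Python sketch. The class name `PlanarRegion`, its field names, and the choice of the center point as the reference position are assumptions made for the example; they do not come from the OMAF syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanarRegion:
    """Hypothetical planar area associated with an overlay layer."""
    center_x: float  # abscissa of the center point of the area
    center_y: float  # ordinate of the center point of the area
    width: float     # width of the area
    height: float    # height of the area

    def contains(self, x: float, y: float) -> bool:
        """Return True if the point (x, y) lies inside the area, borders included."""
        return (abs(x - self.center_x) <= self.width / 2
                and abs(y - self.center_y) <= self.height / 2)


region = PlanarRegion(center_x=100, center_y=50, width=40, height=20)
print(region.contains(110, 55))  # click inside the area
print(region.contains(130, 55))  # click outside the area
```

A client could run such a test on every click and treat a hit as the trigger operation for the area.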
  • the acquiring module is further configured to acquire trigger type information.
  • the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description MPD.
  • the trigger type information may be attribute information of an overlay descriptor in the MPD.
  • the obtaining module is further configured to obtain a condition trigger identifier; the client may also include a detection module, configured to detect whether there is a trigger operation for the area associated with the overlay layer when the value of the condition trigger identifier is a first preset value.
  • the condition trigger identifier may be located in an overlay control structure for user interaction control.
  • the area information associated with the overlay layer is located in a media presentation description MPD.
  • the area information associated with the overlay layer is attribute information of an overlay descriptor in the MPD.
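The interplay between the condition trigger identifier and the detection module can be sketched as follows. This is only a sketch: the numeric values chosen for the first and second preset values, the function names, and the behavior under the second preset value (the overlay is treated as an ordinary, always-visible overlay) are assumptions, since the text only speaks of a "first" and a "second" preset value.

```python
# Assumed encodings; the text does not fix concrete values.
FIRST_PRESET_VALUE = 1   # display/hide is controlled by a trigger operation
SECOND_PRESET_VALUE = 0  # display/hide is not controlled by a trigger operation


def should_detect_trigger(condition_trigger_flag: int) -> bool:
    """The detection module only runs when the flag has the first preset value."""
    return condition_trigger_flag == FIRST_PRESET_VALUE


def overlay_shown(condition_trigger_flag: int, trigger_in_region: bool) -> bool:
    """Decide whether an overlay that is hidden by default is shown.

    Assumption: with the second preset value the overlay behaves like an
    ordinary overlay and is shown without any trigger.
    """
    if should_detect_trigger(condition_trigger_flag):
        return trigger_in_region
    return True


print(overlay_shown(FIRST_PRESET_VALUE, trigger_in_region=False))
print(overlay_shown(FIRST_PRESET_VALUE, trigger_in_region=True))
```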
  • a client, including: an obtaining module, configured to obtain an overlay layer, a background video or a background image, and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer, and further configured to obtain an initial state identifier; and a display module, configured to perform the following operations when the value of the initial state identifier indicates that the overlay layer is hidden by default:
  • displaying the background video or the background image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
  • the trigger operation for the area associated with the overlay layer may include a click operation within the area associated with the overlay layer, and the display module is further configured to: when the value of the initial state identifier indicates that the overlay layer is hidden by default, display prompt information about whether to display the overlay layer by a click operation in the area associated with the overlay layer.
  • the display module displays the prompt information about whether to display the overlay layer by a click operation in the area associated with the overlay layer only when it detects that at least a part of the area associated with the overlay layer is within the current user's viewing angle.
  • the area information associated with the overlay layer and the initial state identifier are located in an overlay control structure.
  • the area information associated with the overlay layer and the initial state identifier are located in a media presentation description MPD.
  • the area information associated with the overlay layer and the initial state identifier are attribute information of an overlay descriptor in the MPD.
  • the acquiring module is further configured to acquire trigger type information.
  • the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description MPD.
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the obtaining module is further configured to obtain a condition trigger identifier.
  • the client may further include a detection module for detecting whether there is a trigger operation for a region associated with the overlay layer when the value of the condition trigger identifier is a first preset value.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
  • a client may include: an obtaining module, configured to obtain an overlay layer, a background video or a background image, and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer, and further configured to obtain an initial state identifier; and a display module, configured to perform the following operations when the value of the initial state identifier indicates that the overlay layer is displayed by default:
  • superimposing the overlay layer on the background video or background image and displaying the superimposed video image; and, when a trigger operation for the area associated with the overlay layer is detected, displaying the background video or background image.
  • displaying the superimposed video image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
  • the trigger operation for the area associated with the overlay layer may include a click operation within the area associated with the overlay layer, and the display module is further configured to: when the value of the initial state identifier indicates that the overlay layer is displayed by default, display prompt information about whether to hide the overlay layer by a click operation in the area associated with the overlay layer.
  • the display module displays the prompt information about whether to hide the overlay layer by a click operation in the area associated with the overlay layer only when it detects that at least a part of the area associated with the overlay layer is within the current user's viewing angle.
  • the area information associated with the overlay layer and the initial state identifier are located in an overlay control structure.
  • the area information associated with the overlay layer and the initial state identifier are located in a media presentation description MPD.
  • the area information associated with the overlay layer and the initial state identifier are attribute information of an overlay descriptor in the MPD.
  • the obtaining module is further configured to: obtain trigger type information;
  • the trigger operation of the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description MPD.
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the obtaining module is further configured to obtain a condition trigger identifier; the client may further include a detection module, configured to detect whether there is a trigger operation for the area associated with the overlay layer when the value of the condition trigger identifier is a first preset value.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
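The two client variants above differ only in the initial state of the overlay layer: hidden by default and shown on a trigger, or shown by default and hidden on a trigger. A minimal Python sketch of that behavior is given below. The class name, the string labels for the rendered output, and the assumption that repeated triggers keep toggling the overlay are illustrative choices, not part of the described syntax.

```python
class OverlayDisplayState:
    """Hypothetical display-module state driven by the initial state identifier."""

    def __init__(self, displayed_by_default: bool):
        # Value taken from the initial state identifier.
        self.displayed = displayed_by_default

    def on_trigger(self) -> None:
        """Called when a trigger operation for the associated area is detected."""
        # Assumption: each trigger toggles between showing and hiding.
        self.displayed = not self.displayed

    def frame_to_render(self) -> str:
        """Return what the display module shows for the current state."""
        return "background+overlay" if self.displayed else "background"


state = OverlayDisplayState(displayed_by_default=False)
print(state.frame_to_render())  # only the background video or image
state.on_trigger()
print(state.frame_to_render())  # overlay superimposed on the background
```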
  • a server may include: a determining module, configured to determine area information associated with an overlay layer, the area information associated with the overlay layer being used to indicate an area associated with the overlay layer; and a sending module, configured to send the area information associated with the overlay layer to the client.
  • the area information associated with the overlay layer is located in an overlay control structure.
  • the area information associated with the overlay layer may include position information of the area associated with the overlay layer.
  • the area information associated with the overlay layer may include a width and a height of the area associated with the overlay layer.
  • the area information associated with the overlay layer is planar area information or spherical area information.
  • the sending module is further configured to: send trigger type information to the client,
  • the trigger type information is used to indicate the trigger type of a trigger operation that triggers displaying or hiding the overlay layer.
  • the trigger type information is located in an overlay control structure.
  • the trigger type information is located in a media presentation description MPD.
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the sending module is further configured to send a condition trigger identifier to the client.
  • when the value of the condition trigger identifier is a first preset value, it indicates that displaying or hiding the overlay layer is controlled by a trigger operation for triggering the display or hiding of the overlay layer.
  • when the value of the condition trigger identifier is a second preset value, it indicates that displaying or hiding the overlay layer is not controlled by a trigger operation for triggering the display or hiding of the overlay layer.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
  • the area information associated with the overlay layer is located in a media presentation description MPD.
  • the area information associated with the overlay layer is attribute information of an overlay descriptor in the MPD.
  • the sending module is further configured to send an initial state identifier to the client.
  • the initial state identifier is used to indicate that the overlay layer is displayed in the initial state, or that the overlay layer is hidden in the initial state.
  • the initial state identifier is located in an overlay control structure.
  • the initial state identifier is located in a media presentation description MPD.
  • the initial state identifier is attribute information of an overlay descriptor in the MPD.
  • a client, which may include a non-volatile memory and a processor coupled to each other, where the processor is configured to call program code stored in the memory to execute some or all of the steps of the method in any implementation of the first aspect, the second aspect, or the third aspect.
  • a server, which may include a non-volatile memory and a processor coupled to each other, where the processor is configured to call program code stored in the memory to execute some or all of the steps of the method in any implementation of the fourth aspect.
  • a computer-readable storage medium that stores program code, where the program code may include instructions for executing some or all of the steps of the method in any implementation of the first aspect, the second aspect, the third aspect, or the fourth aspect.
  • a computer program product is provided; when the computer program product runs on a computer, the computer is caused to execute instructions for some or all of the steps of the method in any implementation of the first aspect, the second aspect, the third aspect, or the fourth aspect.
  • FIG. 1 is a schematic structural diagram of a media data processing system according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an implementation scenario provided by an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a method for processing media data according to an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • FIG. 7 is a schematic block diagram of a client according to an embodiment of the present invention.
  • FIG. 8 is a schematic block diagram of a client according to an embodiment of the present invention.
  • FIG. 9 is a schematic block diagram of a client according to an embodiment of the present invention.
  • FIG. 10 is a schematic block diagram of a server according to an embodiment of the present invention.
  • Panoramic video: also known as 360-degree panoramic video or omnidirectional video, it consists of a series of panoramic images whose content covers the entire sphere surface in three-dimensional space. With the rapid development of Virtual Reality (VR) technology, panoramic video has become more and more widely used. VR technology based on 360-degree panoramic video can create a simulated environment and bring users an interactive, dynamic three-dimensional visual experience. The panoramic images can be generated by computer rendering, or obtained by using a stitching algorithm to stitch video images taken by multiple cameras from multiple different angles. Generally, when watching a panoramic video, the image content viewed by the user at each moment takes up only a small part of the entire panoramic image. To save transmission bandwidth, when a panoramic image is provided to the user through a remote server, only the content that the user is watching at each moment may be transmitted.
  • VR: Virtual Reality.
  • Track: a series of timed samples encapsulated according to the ISO Base Media File Format (ISOBMFF).
  • ISOBMFF: ISO Base Media File Format.
  • for example, in a video track, the video samples are obtained by encapsulating, according to the ISOBMFF specification, the bitstream generated by the video encoder after encoding each frame.
  • a track is defined in the standard ISO/IEC 14496-12 as a "timed sequence of related samples (q.v.) in an ISO base media file".
  • for media data, a track corresponds to a sequence of images or sampled audio; for hint tracks, a track corresponds to a streaming channel.
  • Sample: within a track, no two samples can share the same timestamp.
  • in a non-hint track, a sample is, for example, an individual video frame, a series of video frames in decoding order, or a compressed section of audio in decoding order; in a hint track, a sample defines the formation of one or more streaming packets.
  • Sample entry: describes the format of the sample; the type of the sample entry determines the decoding method of the sample.
  • MMT: MPEG Media Transport, which defines the encapsulation format, transmission protocol, and message sending mechanism of multimedia services based on packet transmission networks.
  • the track may include a metadata box (moov box) and/or a media data box (mdat box).
  • a box is defined in the ISO/IEC 14496-12 standard as an "object-oriented building block defined by a unique type identifier and length".
  • a box is called an "atom" in some specifications, including the first definition of MP4.
  • SEI: Supplemental Enhancement Information, a type of network abstraction layer unit (NALU) defined in the video codec standards H.264 and H.265 issued by the International Telecommunication Union (ITU).
  • Timed metadata track: a metadata stream of time-related information.
  • Overlay: a layer of video, image, or text (which may have a time attribute) superimposed on a certain area of a background video or background image (a piece of visual media rendered over omnidirectional video or an image item, or over a viewport).
  • MPD: Media Presentation Description.
  • the MPD includes one or more period elements. Each period element may include one or more adaptation sets, each adaptation set may include one or more representations, and each representation may include one or more segments. The client selects a representation based on the information in the MPD and constructs the HTTP URL of a segment to request the corresponding segment.
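The period / adaptation set / representation / segment hierarchy and the URL construction step can be illustrated with a small Python sketch using the standard library XML parser. The MPD content, the file naming template, the base URL, and the bandwidth-based selection rule below are made up for the example; a real client would handle many more MPD features.

```python
import xml.etree.ElementTree as ET

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}

# A tiny, made-up MPD: one period, one adaptation set, two representations.
MPD_XML = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet>
      <SegmentTemplate media="video_$RepresentationID$_$Number$.m4s" startNumber="1"/>
      <Representation id="low" bandwidth="500000"/>
      <Representation id="high" bandwidth="3000000"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def pick_representation(root, max_bandwidth):
    """Pick the highest-bandwidth representation that fits, else the lowest one."""
    reps = root.findall("dash:Period/dash:AdaptationSet/dash:Representation", NS)
    affordable = [r for r in reps if int(r.get("bandwidth")) <= max_bandwidth]
    if affordable:
        return max(affordable, key=lambda r: int(r.get("bandwidth")))
    return min(reps, key=lambda r: int(r.get("bandwidth")))


def segment_url(base_url, root, representation, number):
    """Build the HTTP URL of one segment from the SegmentTemplate."""
    template = root.find(
        "dash:Period/dash:AdaptationSet/dash:SegmentTemplate", NS).get("media")
    name = (template.replace("$RepresentationID$", representation.get("id"))
                    .replace("$Number$", str(number)))
    return base_url.rstrip("/") + "/" + name


root = ET.fromstring(MPD_XML)
rep = pick_representation(root, max_bandwidth=1_000_000)
print(segment_url("http://example.com/vod/", root, rep, 1))
```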
  • the OMAF standard specifies a timed metadata track for a region on a sphere with a time attribute.
  • the metadata box in the metadata track contains the metadata describing the spherical surface.
  • the metadata box describes the intention of the metadata track with time attributes, that is, what the spherical area is used for.
  • the OMAF standard describes two types of timed metadata tracks: the recommended viewport timed metadata track and the initial viewpoint timed metadata track. The recommended viewport track describes the viewport area recommended for presentation on the client, and the initial viewpoint track describes the initial viewing orientation when the panoramic video is viewed.
  • the format of the spherical area sample entry (Sample Entry) specified in the existing OMAF standard is as follows:
  • shape_type: used to describe the shape type of the spherical area.
  • dynamic_range_flag: when the value is 0, the horizontal and vertical ranges of the area remain unchanged; when the value is 1, the horizontal and vertical ranges of the area are described in the sample.
  • static_azimuth_range: the azimuth coverage of the area.
  • static_elevation_range: the elevation coverage of the area.
  • num_regions: the number of regions in the metadata track.
  • OMAF defines two shape types for spherical areas. One is formed by four great circles, and its shape_type value is 0. The other is formed by two azimuth circles and two elevation circles, and its shape_type value is 1.
  • the spherical area sample format in the existing OMAF standard is defined as follows:
  • the existing OMAF standard defines a method for representing a region on a sphere, and the specific syntax is defined as follows:
  • center_azimuth, center_elevation: the position of the center point of the spherical area.
  • center_tilt: the tilt angle of the area.
  • azimuth_range: the azimuth coverage of the area.
  • elevation_range: the elevation coverage of the area.
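To show how these fields describe a region, the following Python sketch tests whether a viewing direction falls inside a spherical area. It is a deliberately simplified model: angles are plain degrees, center_tilt is assumed to be 0, and the region is treated as bounded by azimuth and elevation limits around the center point. The function names are hypothetical.

```python
def wrap_degrees(angle: float) -> float:
    """Map an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def in_sphere_region(azimuth: float, elevation: float,
                     center_azimuth: float, center_elevation: float,
                     azimuth_range: float, elevation_range: float) -> bool:
    """Return True if the direction lies inside the region (center_tilt assumed 0)."""
    # The azimuth difference is wrapped so that regions crossing +/-180 degrees work.
    d_azimuth = wrap_degrees(azimuth - center_azimuth)
    d_elevation = elevation - center_elevation
    return (abs(d_azimuth) <= azimuth_range / 2
            and abs(d_elevation) <= elevation_range / 2)


# Region centered at azimuth 170, spanning 40 degrees: it wraps past 180.
print(in_sphere_region(-175, 0, 170, 0, 40, 20))
print(in_sphere_region(100, 0, 170, 0, 40, 20))
```

The same kind of test could serve as the hit test for a gaze-based trigger, where the user's line of sight is compared against the area associated with an overlay.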
  • the basic data structure and carrying method of the overlay are defined in the existing OMAF standard.
  • the format of the data structure used to represent the overlay is defined as follows:
  • OverlayStruct () includes information about the overlay. OverlayStruct () can be located in the media data box or in the metadata box.
  • SingleOverlayStruct () defines an overlay.
  • num_overlays defines the number of overlays described in the structure. The value 0 is reserved.
  • num_flag_bytes defines the total number of bytes occupied by the overlay_control_flag [i] elements. The value 0 is reserved.
  • overlay_id represents a unique identification of the overlay. Two different overlays cannot have the same overlay_id value.
  • overlay_control_flag [i] When the value is 1, it means that the structure defined by the i-th overlay_control_struct [i] will appear.
  • the OMAF player should support all possible values of overlay_control_flag [i] for all i values.
  • overlay_control_essential_flag [i] When the value is 0, it means that the OMAF player does not need to process the structure defined by the i-th overlay_control_struct [i].
  • overlay_control_essential_flag [i] When the value is 1, it indicates that the OMAF player needs to process the structure defined by the i-th overlay_control_struct [i]. When the value is 1 and the OMAF player does not have the ability to process the structure defined by the i-th overlay_control_struct [i], the OMAF player should not display overlays and background video streams.
  • byte_count [i] represents the number of bytes occupied by the i-th overlay_control_struct [i] structure.
  • overlay_control_struct [i] [byte_count [i]] defines the ith structure with the number of bytes represented by byte_count [i], where each structure can be called an overlay control structure.
  • each overlay control structure describes different attributes of the overlay.
  • the overlay_control_struct defines the overlay display area, content source, priority, transparency, and other attributes.
  • the specific attributes are as follows:
  • the parameter is defined to indicate that the overlay display position is relative to the user's viewport.
  • the parameter is defined to indicate that the overlay display position is relative to the panoramic sphere.
  • the parameter indicates that the position of the 2D overlay display is relative to the panoramic sphere.
  • a parameter is defined to indicate the content source of the overlay. This structure indicates that the content of the overlay comes from the decoded image.
  • a parameter is defined to indicate the content source of the overlay, and this structure indicates that the content source of the overlay is from a recommended perspective.
  • 0-2 indicates the rendering position of the overlay
  • 3 and 4 indicate where the content of the overlay comes from.
  • the above syntax defines a representation method and related parameters of one or more overlays.
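The role of overlay_control_flag [i] and overlay_control_essential_flag [i] can be illustrated with a small in-memory model in Python. This sketch does not parse the binary syntax; the class `OverlayControl`, the function `can_render`, and the notion of a set of supported structure indices are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class OverlayControl:
    """Hypothetical in-memory view of one overlay control structure."""
    index: int        # i, the position of the control structure
    present: bool     # overlay_control_flag[i]
    essential: bool   # overlay_control_essential_flag[i]
    payload: bytes    # overlay_control_struct[i], byte_count[i] bytes long


def can_render(controls: Iterable[OverlayControl], supported: Set[int]) -> bool:
    """Return False if an essential control structure cannot be processed.

    In that case the player should display neither the overlay nor the
    background video stream; non-essential unsupported structures are skipped.
    """
    for control in controls:
        if control.present and control.essential and control.index not in supported:
            return False
    return True


controls = [
    OverlayControl(index=0, present=True, essential=True, payload=b"\x00" * 4),
    OverlayControl(index=3, present=True, essential=False, payload=b"\x00" * 2),
]
print(can_render(controls, supported={0}))     # essential structure is supported
print(can_render(controls, supported={3}))     # essential structure is unsupported
```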
  • if the overlay is static, its related structure OverlayStruct is carried in the Overlay Configuration Box, where the Overlay Configuration Box is located in the media data track.
  • if the overlay is dynamic, its related structure OverlayStruct is carried in the sample entry and samples of the overlay timed metadata track.
  • the OMAF standard also defines the format of the overlay in DASH and MPD (Media Presentation Description).
  • the OMAF standard defines the overlay descriptor in the MPD. Its @schemeIdUri is "urn:mpeg:mpegI:omaf:2018:ovly". At most one such descriptor can appear in an adaptation set of the MPD, and it is used to indicate the overlay associated with the adaptation set.
  • an attribute value is used to indicate the identity of the overlay, as follows:
  • this application proposes a method for processing media data. By carrying the area information associated with the overlay layer, it can support conditional display of the overlay layer, and thus can display the overlay layer more flexibly.
  • conditional display or the conditional trigger display refers to displaying or turning off the display after a trigger operation is detected.
  • FIG. 1 is a schematic structural diagram of a media data processing system according to an embodiment of the present invention.
  • the media data processing system may include a server 10 and a client 20.
  • the server 10 may include at least one of a pre-encoding processor, a video encoder, a code stream encapsulation device (which can be used to generate the MPD; of course, the server 10 may also include an additional component to generate the MPD), and a transmission device. The server 10 is used to pre-process, encode, or transcode panoramic video, encapsulate the encoded code stream data into a transportable file, and transmit it to the client or a content distribution network via the network. In addition, the server can select the content to be transmitted according to information fed back by the client (such as the user's viewing angle, or a segment request established based on the MPD sent by the server 10).
  • the pre-encoding processor may be used to perform pre-processing operations such as cropping, color format conversion, color correction, or denoising of the panoramic video image.
  • the video encoder may be used to encode (may include partitioning) the obtained video image to form code stream data.
  • the code stream encapsulation device may be used to encapsulate code stream data and corresponding metadata into a file format for transmission or storage, for example, an ISO basic media file format.
  • the transmission device may be an input/output interface or a communication interface, and may be used to send the encapsulated code stream data, the MPD, and information related to media data transmission to the client.
  • the transmission device may also be a receiving device.
  • the receiving device may be an input / output interface or a communication interface, and may be used to receive segment request information, user perspective information, or other media data transmission related information sent by the client 20.
  • the server 10 may use the receiving device to obtain a panoramic video image, and may also include an image source.
  • the image source may be a camera or an imaging device, etc., for generating a panoramic video image.
  • the client 20 may be an electronic device that can be connected to the network, such as VR glasses, a mobile phone, a tablet, a TV, or a computer.
  • the client 20 receives the MPD or media data sent by the server 10, and performs stream decapsulation, decoding, and display.
  • the client 20 may include at least one of a receiving device, a code stream decapsulating device, a video decoder, and a display device.
  • the receiving device may be an input / output interface or a communication interface, and may be used to receive encapsulated code stream data, MPD and media data transmission related information.
  • the code stream decapsulating device can be used to obtain the required code stream data and corresponding metadata.
  • the video decoder can be used to decode the video image according to the corresponding metadata and code stream data.
  • the display device may be used for displaying a video image, or displaying a video image according to corresponding metadata.
  • the receiving device may also be a sending device, configured to send user perspective information, other media data transmission-related information, or send segment request information according to the MPD to the server 10.
  • the receiving device may also receive instructions from the user.
  • the receiving device may be an input interface connected to a mouse.
  • the display device may also be a touch display screen, which receives user instructions while displaying the video image, so as to interact with the user.
  • the pre-encoding processor, video encoder, code stream encapsulation device, code stream decapsulating device, or video decoder may be implemented by a processor reading instructions in a memory and executing the instructions, or may be implemented by a chip circuit.
  • the method for processing media data provided by the embodiment of the present invention may be applied to the server 10 or the client 20.
  • the server 10 may, when encapsulating the code stream, put the description of the area associated with the encoded overlay image into the video file format or the MPD description.
  • the client 20 may use the corresponding code stream decapsulating device to obtain the encapsulated information about the area associated with the overlay layer, thereby guiding the client player (which may include at least one of a receiving device, a code stream decapsulating device, and a video decoder) to use the display device to conditionally display the overlay image and to display the background video or the background image.
  • conditional overlay refers to an overlay that is displayed or closed only when a trigger operation is detected for an area associated with the overlay.
  • the conditional display of the overlay layer means that the overlay layer is displayed or turned off only when a trigger operation is detected for an area associated with the overlay layer.
  • FIG. 2 is a schematic diagram of an implementation scenario provided by an embodiment of the present invention.
  • the embodiment of the present invention is applied to the description of a conditional overlay: a user can click the area associated with the overlay on a background video or a background image to switch the display of the overlay.
  • the user can trigger an overlay describing the attributes of the player, such as name and age, by clicking on the area where the player appears.
  • FIG. 3 is a schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • the method shown in FIG. 3 may be executed by a client.
  • the client may be a program located on the client device that provides a video playback service for the user, or the client may be a device having a function of playing panoramic video, for example, a VR device.
  • the method shown in FIG. 3 may include steps 310 and 320. Steps 310 and 320 are described in detail below.
  • the overlay layer can be obtained by first obtaining the MPD, and then obtaining the overlay layer from the server through the MPD.
  • the overlay layer can also be obtained from the server through the MPD, which is not limited herein.
  • the overlay layer is a video, an image, or text to be superimposed on a background video or a background image (possibly on at least a part of the background video or background image) for display.
  • the trigger operation for the area associated with the overlay layer may include: a click operation within the area associated with the overlay layer, or a trigger operation where a user's eyes are located in the area associated with the overlay layer.
  • the area information associated with the overlay layer may be located in an overlay control structure.
  • the area information associated with the overlay layer may be located in a new overlay control structure different from the nine overlay control structures described above, and its name may be AssociatedSphereRegionStruct.
  • the area information associated with the overlay layer may also be located in the media presentation description MPD.
  • the area information associated with the overlay layer is attribute information of an overlay descriptor in the MPD.
  • the @schemeIdUri of the overlay descriptor can be "urn:mpeg:mpegI:omaf:2018:ovly".
  • the descriptor can appear in an adaptation set of the MPD and is used to indicate the overlay associated with that adaptation set.
  • the area information associated with the overlay layer is planar area information or spherical area information.
  • the area information associated with the overlay layer may include position information of the area associated with the overlay layer.
  • the location information of the area associated with the overlay layer may include the following situations:
  • the area associated with the overlay layer is a spherical area, and the area information associated with the overlay layer is the spherical coordinate value of the center point of that area.
  • for example, the spherical coordinate value of the center point of the current perspective is (X, Y, Z), where X corresponds to the azimuth (yaw) angle of the spherical coordinates, Y corresponds to the elevation (pitch) angle, and Z corresponds to the tilt (roll) angle.
  • the area associated with the overlay layer is a planar area, and the area information associated with the overlay layer is the plane coordinate value of the center point of that area.
  • for example, the two-dimensional coordinate value of the center point of the area associated with the overlay layer is (X, Y), where X and Y respectively represent the abscissa and ordinate of that center point in a two-dimensional rectangular coordinate system.
  • the area associated with the overlay layer is a planar area, and the area information associated with the overlay layer is the two-dimensional coordinate value of the upper-left, upper-right, lower-left, or lower-right corner of that area.
  • for example, the area information associated with the overlay layer is the two-dimensional coordinate value (X, Y) of the upper-left corner of the area, where X and Y respectively represent the abscissa and ordinate of the upper-left corner in a two-dimensional rectangular coordinate system.
  • the area information associated with the overlay layer may include the width of the area associated with the overlay layer and the height of the area associated with the overlay layer.
  • the width and height of the area associated with the overlay layer can include the following situations:
  • the area associated with the overlay layer is a spherical area, and its width and height are the azimuth range (yaw angle range) and the pitch angle range of the area.
  • for example, the azimuth range (yaw angle range) of the area associated with the overlay layer is 110 degrees, and the pitch angle range is 90 degrees.
  • the area associated with the overlay layer is a planar area, and the extent of the area may be given as the width and height of the area.
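  • The planar cases above (center point or upper-left corner, plus width and height) can be sketched as follows. This is only an illustration: the class name `PlanarRegion`, its methods, and the image-style convention that y grows downward are assumptions, not part of the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanarRegion:
    """Hypothetical planar area associated with an overlay (upper-left corner plus size)."""
    x: float       # abscissa of the upper-left corner
    y: float       # ordinate of the upper-left corner (assumed to grow downward)
    width: float
    height: float

    @classmethod
    def from_center(cls, cx: float, cy: float, width: float, height: float) -> "PlanarRegion":
        # Convert the center-point form into the upper-left-corner form.
        return cls(cx - width / 2, cy - height / 2, width, height)

    def contains(self, px: float, py: float) -> bool:
        # Points on the boundary count as inside (a choice made for this sketch).
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)
```

  • Storing one canonical form (here the upper-left corner) keeps the hit test independent of which corner or center point the signalling happens to carry.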
  • the method may further include: acquiring trigger type information; the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information is located in an overlay control structure.
  • it can be the AssociatedSphereRegionStruct.
  • the trigger type information may also be located in a media presentation description MPD.
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the @schemeIdUri of the overlay descriptor can be "urn:mpeg:mpegI:omaf:2018:ovly".
  • the method may further include: acquiring a condition trigger identifier, and detecting whether there is a trigger operation for an area associated with the overlay layer when the value of the condition trigger identifier is a first preset value.
  • when the value of the condition trigger identifier is the first preset value, it indicates that displaying or closing the overlay layer is controlled by a trigger operation.
  • when the value of the condition trigger identifier is a second preset value, it indicates that displaying or closing the overlay layer is not controlled by a trigger operation.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
  • for example, that overlay control structure may be the one corresponding to a bit index of 7 (OverlayInteraction).
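  • How the condition trigger identifier, the trigger type information, and the associated area combine can be sketched as a single check. The numeric values chosen for the first and second preset values, the string trigger types, and the function name are assumptions made for illustration; the text does not fix them.

```python
FIRST_PRESET = 1   # assumed value: display is controlled by a trigger operation
SECOND_PRESET = 0  # assumed value: display is not controlled by a trigger operation


def trigger_detected(condition_flag, trigger_type, event_type, event_in_region):
    """Decide whether an input event counts as a trigger operation for the overlay."""
    if condition_flag != FIRST_PRESET:
        return False          # overlay is not conditionally triggered, so ignore input
    if event_type != trigger_type:
        return False          # not the kind of operation named by the trigger type information
    return event_in_region    # the operation must land inside the associated area
```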
  • FIG. 4 is a schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • the method shown in FIG. 4 may be executed by a client.
  • the client may be a program provided on the client device to provide video playback services for the client, and the client may be a device having a function of playing a panoramic video, for example, a VR device.
  • the method shown in FIG. 4 may include steps 410, 420, 430, and 440. Steps 410, 420, 430, and 440 are described in detail below.
  • the area information associated with the overlay layer is used to indicate an area associated with the overlay layer.
  • displaying the background video or the background image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
  • the overlay layer is superimposed on the background video or background image, and the superimposed video image is displayed.
  • the following operations may be performed:
  • the overlay layer is superimposed on the background video or background image, and the superimposed video image is displayed; when a trigger operation for the area associated with the overlay layer is detected, the background video or background image is displayed.
  • displaying the superimposed video image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
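  • The two cases above (overlay initially hidden, or initially shown) amount to a toggle whose starting position comes from the initial state identifier. A minimal sketch follows; the class name `OverlayState` and the use of strings to stand in for decoded pictures are assumptions for illustration only.

```python
class OverlayState:
    """Hypothetical client-side display state for one overlay."""

    def __init__(self, initially_displayed: bool):
        # Taken from the initial state identifier signalled with the overlay.
        self.displayed = initially_displayed

    def on_trigger(self) -> bool:
        # Each detected trigger operation flips between displayed and closed.
        self.displayed = not self.displayed
        return self.displayed

    def frame(self, background: str, overlay: str) -> str:
        # Strings stand in for pictures; "+" stands in for superimposing.
        return f"{background}+{overlay}" if self.displayed else background
```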
  • the area information associated with the overlay layer and the initial state identifier may be located in an overlay control structure.
  • the area information associated with the overlay layer and the initial state identifier may be located in the associated area structure (AssociatedSphereRegionStruct) above.
  • the area information associated with the overlay layer and the initial state identifier may be located in a media presentation description MPD.
  • the area information associated with the overlay layer and the initial state identifier may be attribute information of an overlay descriptor in the MPD.
  • the @schemeIdUri of the overlay descriptor can be "urn:mpeg:mpegI:omaf:2018:ovly".
  • the method may further include: acquiring trigger type information; the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information is located in an overlay control structure.
  • it can be the AssociatedSphereRegionStruct.
  • the trigger type information may also be located in a media presentation description MPD.
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the @schemeIdUri of the overlay descriptor can be "urn:mpeg:mpegI:omaf:2018:ovly".
  • the method may further include: acquiring a condition trigger identifier, and detecting whether there is a trigger operation for an area associated with the overlay layer when the value of the condition trigger identifier is a first preset value.
  • when the value of the condition trigger identifier is the first preset value, it indicates that displaying or closing the overlay layer is controlled by a trigger operation.
  • when the value of the condition trigger identifier is a second preset value, it indicates that displaying or closing the overlay layer is not controlled by a trigger operation.
  • the condition trigger identifier is located in the overlay control structure for user interaction control. For example, it can be the overlay control structure (OverlayInteraction) corresponding to a bit index of 7 above.
  • FIG. 5 is a schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • the method shown in FIG. 5 may be executed by a server.
  • the method shown in FIG. 5 may include steps 510 and 520. Steps 510 and 520 are described in detail below.
  • the information of the area can be determined by acquiring the area marked by the user. It is also possible to determine the area information of the area that can include the object graphic or character graphic by identifying the object graphic or character graphic corresponding to the overlay in the background video or background image.
  • the trigger operation on the area associated with the overlay layer is used to control the display of the overlay layer or to close the display.
  • the trigger operation may include a click operation in an area associated with the overlay layer, or a trigger operation where a user's eyes are located in the area associated with the overlay layer.
  • the overlay layer is a video, an image, or text that is superimposed for display on a background video or a background image (or on at least a part of the background video or background image).
  • the method may further include obtaining an overlay layer, encoding the overlay layer to obtain code stream data of the overlay layer, and sending the code stream data of the overlay layer to the client.
  • the overlay layer may be an image for a panoramic video image.
  • the method may further include obtaining a background video or a background image, encoding the background video or the background image to obtain code stream data of the background video or the background image, and sending the code stream data of the background video or the background image to the client.
  • the background video or the background image may be an image used for a panoramic video image.
  • the content displayed by the background video or background image may be a target object, and the content of the overlay layer may be text information of the target object.
  • the area information associated with the overlay layer may be located in an overlay control structure.
  • it can be the AssociatedSphereRegionStruct.
  • the area information associated with the overlay layer may also be located in the media presentation description MPD.
  • the area information associated with the overlay layer is attribute information of an overlay descriptor in the MPD.
  • the @schemeIdUri of the overlay descriptor can be "urn:mpeg:mpegI:omaf:2018:ovly" above.
  • the area information associated with the overlay layer may include position information of the area associated with the overlay layer.
  • the position information of the area associated with the overlay layer may include the position information of the center point of the area associated with the overlay layer, or the position information of the upper left corner point of the area associated with the overlay layer.
  • the area information associated with the overlay layer may include a width of the area associated with the overlay layer and a height of the area associated with the overlay layer.
  • the area information associated with the overlay layer is planar area information or spherical area information.
  • the method may further include: sending trigger type information to the client, where the trigger type information is used to indicate a trigger type of a trigger operation for triggering the display of the overlay layer or closing the display.
  • the trigger type information is located in an overlay control structure.
  • it can be the AssociatedSphereRegionStruct.
  • the trigger type information may also be located in a media presentation description MPD.
  • the trigger type information is attribute information of an overlay descriptor in the MPD.
  • the @schemeIdUri of the overlay descriptor can be "urn:mpeg:mpegI:omaf:2018:ovly".
  • the method may further include: sending a condition trigger identifier to the client, where a value of the condition trigger identifier equal to a first preset value indicates that displaying or closing the overlay layer is controlled by a trigger operation.
  • a value of the condition trigger identifier equal to a second preset value indicates that displaying or closing the overlay layer is not controlled by a trigger operation.
  • the condition trigger identifier is located in an overlay control structure for user interaction control.
  • for example, that overlay control structure may be the one (OverlayInteraction) corresponding to a bit index of 7 above.
  • the method may further include: sending an initial status identifier to the client, where the initial status identifier indicates either that the overlay layer is displayed in the initial state or that the overlay layer is not displayed in the initial state.
  • the initial state identifier may be located in an overlay control structure.
  • the initial state identifier may be located in the associated region structure (AssociatedSphereRegionStruct) above.
  • the initial state identifier may be located in a media presentation description MPD.
  • the initial state identifier may be attribute information of an overlay descriptor in the MPD.
  • the @schemeIdUri of the overlay descriptor can be "urn:mpeg:mpegI:omaf:2018:ovly".
  • in this embodiment, the overlay control structure required in FIG. 3 to FIG. 5 is newly defined, and a spherical area associated with the overlay is defined in the structure to represent the user-clickable spherical area in the background video or background image.
  • when the client detects that the overlay control structure appears in the code stream, it further parses the spherical area defined in the structure, so that when the user clicks on the area, the display of the overlay associated with the area is triggered.
  • an overlay control structure is newly defined as AssociatedSphereRegionStruct, and the specific syntax is as follows:
  • SphereRegionStruct (1) defines a spherical area associated with the overlay. Area information associated with the overlay layer may be included therein, and may specifically be spherical area information.
  • the user can click on the area to trigger the display or close of the overlay associated with it.
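  • A client-side test of whether a viewing direction falls inside the spherical area can be sketched as below. The sketch assumes angles in degrees, treats the area as an azimuth/elevation box, and ignores the tilt angle; the class `SphereRegion` is hypothetical and is not the `SphereRegionStruct` syntax itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SphereRegion:
    """Hypothetical spherical area: center point plus azimuth and elevation ranges."""
    center_azimuth: float    # degrees
    center_elevation: float  # degrees
    azimuth_range: float     # full width in degrees
    elevation_range: float   # full height in degrees

    def contains(self, azimuth: float, elevation: float) -> bool:
        # Wrap the azimuth difference into [-180, 180) so that areas crossing
        # the +/-180 degree seam are handled correctly.
        d_az = (azimuth - self.center_azimuth + 180.0) % 360.0 - 180.0
        d_el = elevation - self.center_elevation
        return (abs(d_az) <= self.azimuth_range / 2
                and abs(d_el) <= self.elevation_range / 2)
```

  • With the 110 degree by 90 degree example ranges mentioned earlier and a center at azimuth 170, a click at azimuth -160 still falls inside because the difference wraps to 30 degrees.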
  • the server-side steps are:
  • Step 1 The server obtains the panoramic video code stream and the corresponding one or more overlay content code streams.
  • Step 2 In the video encapsulator (stream encapsulation device), encapsulate according to the video file format.
  • the overlay control structure defined above is used to indicate that when the spherical area associated with the overlay appears within the user's perspective, the user can click the area to trigger the display of the overlay associated with it or close the display.
  • Step 3 The encapsulated code stream is sent to a transmission device for signal transmission.
  • the client steps are:
  • Step 1 The receiving device obtains the code stream after the panoramic video content is encapsulated.
  • Step 2 The code stream is sent to a code stream decapsulating device, which is decapsulated and analyzed.
  • the code stream decapsulating device searches for and parses the control structure AssociatedSphereRegionStruct of the overlay, and learns that the overlay is triggered or closed by the user clicking on its associated area.
  • Step 3 When the video decoding is played on the display device, the client configuration or the user interface prompt may include a prompt for the user to trigger or close the display for the overlay.
  • in this embodiment, a spherical area is described in a newly defined overlay control structure and is associated with an overlay, so that the user can click the spherical area to control the display of the associated overlay.
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition, so that users can perform operations and display in a personalized manner.
  • in this embodiment, the overlay control structure required in FIG. 3 to FIG. 5 is newly defined as AssociatedSphereRegionStruct, and a 2D area (planar area) is defined in it; when the user clicks within the area, the display of the overlay associated with the area is triggered.
  • 2DRegionStruct () defines a two-dimensional area associated with the overlay. Area information associated with the overlay layer may be included therein, and specifically may be planar area information.
  • object_x and object_y may be position information of a region associated with the overlay layer, and indicate the x, y position of the upper-left corner of the region in the two-dimensional coordinates of the background VR stream content (background video or background image).
  • object_width and object_height may be the width of the area associated with the overlay layer and the height of the area associated with the overlay layer, indicating the width and height of the area in the two-dimensional coordinates of the background VR stream content.
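  • Reading the four fields of the two-dimensional area from a byte payload might look like the sketch below. The field order follows the text, but the field widths (four unsigned 16-bit big-endian integers) are purely an assumption, since the syntax table is not reproduced here; `Region2D` and `parse_2d_region` are hypothetical names.

```python
import struct
from typing import NamedTuple


class Region2D(NamedTuple):
    object_x: int       # x of the upper-left corner
    object_y: int       # y of the upper-left corner
    object_width: int
    object_height: int


def parse_2d_region(payload: bytes) -> Region2D:
    # ">4H" reads four big-endian unsigned 16-bit integers.
    # This layout is assumed for illustration and is not taken from the source.
    return Region2D(*struct.unpack_from(">4H", payload))
```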
  • An embodiment of the present invention provides that a region associated with an overlay is represented by a two-dimensional coordinate system (planar coordinate system).
  • in this embodiment, as in the first embodiment, an overlay control structure is newly defined.
  • in this structure, a spherical area associated with the overlay is defined to represent the user-clickable spherical area in the background VR video stream.
  • when the client detects the overlay control structure in the code stream, it further parses the spherical area defined in the structure, so that the overlay associated with the area can be triggered when the user clicks on the area.
  • in addition, a flag is defined in the overlay control structure to mark the initial state of the overlay, that is, whether the overlay is displayed by default when the user has not performed any operation.
  • an overlay control structure is newly defined as AssociatedSphereRegionStruct, and the specific syntax is as follows:
  • initial_status defines a flag, which can be the initial status identifier above, and indicates whether the overlay is displayed by default.
  • SphereRegionStruct (1) defines a spherical area associated with the overlay.
  • the user can click on the area to trigger the display or close of the overlay associated with it.
  • the SphereRegionStruct (1) defined in the overlay control structure AssociatedSphereRegionStruct () may be replaced with the 2DRegionStruct () defined in the second embodiment, and the syntax of the definition of the overlay control structure is as follows:
  • initial_status defines a flag, which can be the initial status identifier above, and indicates whether the overlay is displayed by default.
  • 2DRegionStruct () defines a two-dimensional area associated with the overlay.
  • object_x and object_y may be the position information of the area associated with the overlay layer; they indicate the x, y position of the upper-left corner of the area in the two-dimensional coordinates of the background VR stream content (background video or background image).
  • object_width and object_height may be the width and the height of the area associated with the overlay layer; they indicate the width and height of the area in the two-dimensional coordinates of the background VR stream content.
  • in this embodiment, a spherical area is described in a newly defined overlay control structure and is associated with an overlay, so that the user can click the spherical area to control the display of the associated overlay. At the same time, a flag is defined in the overlay control structure to indicate whether the overlay is displayed by default.
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • in this embodiment, as in the first embodiment, an overlay control structure is newly defined.
  • in this structure, a spherical area associated with the overlay is defined to represent the user-clickable spherical area in the background VR video stream.
  • when the client detects the overlay control structure in the code stream, it further parses the spherical area defined in the structure, so that the overlay associated with the area can be triggered when the user clicks on the area.
  • in addition, a flag is defined in the overlay control structure to mark the type of trigger that causes the overlay to be displayed, and a value of the flag is defined to indicate that the trigger type is a user click.
  • an overlay control structure is newly defined as AssociatedSphereRegionStruct, and the specific syntax is as follows:
  • condition_type defines a flag, which can be the trigger type information above, and indicates the type of the overlay that is triggered to be displayed.
  • SphereRegionStruct (1) defines a spherical area associated with the overlay.
  • when the value of condition_type is 0, the trigger type is a user click trigger; other values are reserved.
  • the specific definitions are as follows:
  • the user can click on the area to trigger the display or closure of the overlay associated with it.
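  • The condition_type semantics stated above (0 means a user click, all other values reserved) can be captured with a small enumeration. The names `ConditionType` and `decode_condition_type`, and the choice to return `None` for reserved values, are assumptions of this sketch.

```python
from enum import IntEnum


class ConditionType(IntEnum):
    USER_CLICK = 0  # the only value the text defines


def decode_condition_type(value: int):
    """Return the trigger type, or None when the value is reserved."""
    try:
        return ConditionType(value)
    except ValueError:
        return None
```

  • Returning `None` rather than raising lets a client skip overlays whose trigger type it does not understand instead of failing the whole parse.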
  • the SphereRegionStruct (1) defined in the overlay control structure AssociatedSphereRegionStruct () may be replaced with the 2DRegionStruct () defined in the second embodiment, and the syntax of the definition of the overlay control structure is as follows:
  • condition_type defines a flag, which can be the trigger type information above, and indicates the type of the overlay that is triggered to be displayed.
  • 2DRegionStruct () defines a two-dimensional area associated with the overlay.
  • object_x and object_y may be the position information of the area associated with the overlay layer; they indicate the x, y position of the upper-left corner of the area in the two-dimensional coordinates of the background VR stream content (background video or background image).
  • object_width and object_height may be the width and the height of the area associated with the overlay layer; they indicate the width and height of the area in the two-dimensional coordinates of the background VR stream content.
  • an initial_status flag can be added to indicate whether the overlay is displayed by default; the specific syntax and semantics are the same as in the third embodiment.
  • in this embodiment, a spherical area is described in a newly defined overlay control structure and is associated with an overlay, so that the user can click the spherical area to control the display of the associated overlay. At the same time, a flag is defined in the overlay control structure to indicate the type of trigger that causes the overlay to be displayed.
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • in this embodiment, as in the first embodiment, an overlay control structure is newly defined.
  • in this structure, a spherical area associated with the overlay is defined to represent the user-clickable spherical area in the background VR video stream.
  • when the client detects the overlay control structure in the code stream, it further parses the spherical area defined in the structure, so that the overlay associated with the area can be triggered when the user clicks on the area.
  • in addition, a flag is defined in the overlay interaction structure to indicate that the display of the overlay is conditionally triggered.
  • an overlay control structure is newly defined as AssociatedSphereRegionStruct, and the specific syntax is as follows:
  • SphereRegionStruct (1) defines a spherical area associated with the overlay.
  • a flag is defined in the overlay control structure to indicate that the display of the overlay is conditionally triggered.
  • conditional_switch_on_off_flag defines a flag, which can be the condition trigger identifier above, indicating that the display of the overlay is conditionally triggered.
  • OverlayInteraction is the above-mentioned overlay control structure for user interaction control.
  • the user can click on the area to trigger the display or close of the overlay associated with it.
  • the SphereRegionStruct (1) defined in the overlay control structure AssociatedSphereRegionStruct () may be replaced with the 2DRegionStruct () defined in the second embodiment, and the syntax of the definition of the overlay control structure is as follows:
  • 2DRegionStruct () defines a two-dimensional area associated with the overlay.
  • object_x and object_y may be the position information of the area associated with the overlay layer; they indicate the x, y position of the upper-left corner of the area in the two-dimensional coordinates of the background VR stream content (background video or background image).
  • object_width and object_height may be the width and the height of the area associated with the overlay layer; they indicate the width and height of the area in the two-dimensional coordinates of the background VR stream content.
  • an initial_status flag can be added to indicate whether the overlay is displayed by default; the specific syntax and semantics are the same as in the third embodiment.
  • in this embodiment, a spherical area is described in a newly defined overlay control structure and is associated with an overlay, so that the user can click the spherical area to control the display of the associated overlay. At the same time, a flag is defined in the overlay interaction structure to indicate that the display of the overlay is conditionally triggered.
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • an overlay descriptor is defined in the MPD. Its @schemeIdUri is "urn:mpeg:mpegI:omaf:2018:ovly". At most one such descriptor can appear at the adaptation set level of the MPD. It is used to indicate the overlay associated with this adaptation set.
  • the associated spherical area is described in the overlay descriptor of the MPD.
  • the specific syntax is as follows:
  • the user can click on the area to trigger the display or closure of the overlay associated with it.
  • OverlayInfo.associatedSphereRegion@center_azimuth, OverlayInfo.associatedSphereRegion@center_elevation, OverlayInfo.associatedSphereRegion@center_tilt, OverlayInfo.associatedSphereRegion@azimuth_range and OverlayInfo.associatedSphereRegion@elevation_range can be the area information associated with the overlay layer described above, specifically spherical area information.
  • OverlayInfo.associatedSphereRegion@center_azimuth, OverlayInfo.associatedSphereRegion@center_elevation and OverlayInfo.associatedSphereRegion@center_tilt may be the position information of the area associated with the overlay.
  • OverlayInfo.associatedSphereRegion@azimuth_range and OverlayInfo.associatedSphereRegion@elevation_range may be the width of the area associated with the overlay layer and the height of the area associated with the overlay layer, respectively.
  • in this embodiment, a spherical area is described in the overlay descriptor of the MPD and is associated with the overlay, so that the user can click the spherical area to control the display of the associated overlay.
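  • Reading the spherical-area attributes out of an MPD fragment could be sketched as follows. The XML layout is hypothetical: carrying the descriptor in a `SupplementalProperty` element, nesting `OverlayInfo` and `associatedSphereRegion` as child elements, omitting XML namespaces, and writing the scheme URI without spaces are all assumptions made so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

OVERLAY_SCHEME = "urn:mpeg:mpegI:omaf:2018:ovly"

# Hypothetical adaptation-set fragment; the real MPD syntax is not shown in the text.
SAMPLE = """
<AdaptationSet>
  <SupplementalProperty schemeIdUri="urn:mpeg:mpegI:omaf:2018:ovly">
    <OverlayInfo>
      <associatedSphereRegion center_azimuth="30" center_elevation="-10"
                              center_tilt="0" azimuth_range="110" elevation_range="90"/>
    </OverlayInfo>
  </SupplementalProperty>
</AdaptationSet>
"""


def read_associated_sphere_region(adaptation_set_xml: str):
    """Return the spherical-area attributes as floats, or None if absent."""
    root = ET.fromstring(adaptation_set_xml)
    for descriptor in root.iter("SupplementalProperty"):
        if descriptor.get("schemeIdUri") != OVERLAY_SCHEME:
            continue  # some other descriptor; skip it
        region = descriptor.find("OverlayInfo/associatedSphereRegion")
        if region is not None:
            return {name: float(value) for name, value in region.attrib.items()}
    return None
```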
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • the display of the overlay associated with the area can be triggered when the user clicks on the area.
  • an overlay descriptor is defined in the MPD. Its @schemeIdUri is "urn:mpeg:mpegI:omaf:2018:ovly". At most one such descriptor can appear at the adaptation set level of the MPD. It is used to indicate the overlay associated with this adaptation set.
  • the two-dimensional area associated with the overlay is described in the overlay descriptor of the MPD.
  • the specific syntax is as follows:
  • the user can click on the area to trigger the display or close of the overlay associated with it.
  • OverlayInfo.associated2DRegion@object_x, OverlayInfo.associated2DRegion@object_y, OverlayInfo.associated2DRegion@object_width and OverlayInfo.associated2DRegion@object_height are the above-mentioned area information associated with the overlay, specifically planar area information.
  • OverlayInfo.associated2DRegion@object_x and OverlayInfo.associated2DRegion@object_y may be the position information of the area associated with the overlay, indicating the x, y position of the upper-left corner of the area in the two-dimensional coordinates of the background VR stream content (background video or background image).
  • OverlayInfo.associated2DRegion@object_width and OverlayInfo.associated2DRegion@object_height can be the width of the area associated with the overlay layer and the height of the area associated with the overlay layer, indicating the width and height of the area in the two-dimensional coordinates of the background VR stream content.
  • in this embodiment, a two-dimensional area is described in the overlay descriptor of the MPD and is associated with the overlay, so that the user can click the two-dimensional area to control the display of the associated overlay.
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • an overlay descriptor is defined in the MPD. Its @schemeIdUri is "urn:mpeg:mpegI:omaf:2018:ovly". At most one such descriptor can appear at the adaptation set level of the MPD. It is used to indicate the overlay associated with this adaptation set.
  • an overlay descriptor of the MPD is used to describe a flag indicating whether the overlay is displayed in a default state.
  • the specific syntax is as follows:
  • a flag is described in the overlay descriptor of the MPD, and the flag indicates whether the overlay is displayed in the default state.
  • OverlayInfo@initial_status is the initial state identifier mentioned above.
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • an overlay descriptor is defined in the MPD. Its @schemeIdUri is "urn:mpeg:mpegI:omaf:2018:ovly". At most one such descriptor can appear at the adaptation set level of the MPD. It is used to indicate the overlay associated with this adaptation set.
  • the overlay display trigger type mark is described in the overlay descriptor of the MPD, and the specific syntax is as follows:
  • a flag is described in the overlay descriptor of the MPD, and the flag indicates the trigger type of the overlay display.
  • OverlayInfo@condition_type is the trigger type information mentioned above.
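  • A minimal sketch of how a client might interpret these two attributes is given below. The numeric encodings (0 or 1 for the initial status, 0 for click and 1 for gaze as trigger types) and all Python names are assumptions chosen for illustration; the actual value assignments are whatever the descriptor syntax defines.

```python
from dataclasses import dataclass
from enum import Enum


class TriggerType(Enum):
    CLICK = 0  # assumed encoding, for illustration only
    GAZE = 1   # assumed encoding, for illustration only


@dataclass
class OverlayDisplayHints:
    shown_by_default: bool
    trigger: TriggerType


def read_display_hints(attrs):
    """Interpret OverlayInfo attributes given as a name -> string mapping.

    Missing attributes fall back to "hidden by default" and "click",
    which is a choice made for this sketch only.
    """
    shown = attrs.get("initial_status", "0") == "1"
    trigger = TriggerType(int(attrs.get("condition_type", "0")))
    return OverlayDisplayHints(shown, trigger)
```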
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • an overlay descriptor is added to the MMT protocol, and a spherical area or a two-dimensional area associated with the overlay is described under the descriptor.
  • the specific syntax of the overlay descriptor is as follows:
  • SphereRegionStruct(1) defines a spherical area associated with the overlay. This may include area information associated with the overlay, which may specifically be spherical area information.
  • AssociatedSphereRegionStruct()
  • the specific syntax is defined as follows:
  • 2DRegionStruct defines a two-dimensional area associated with the overlay. This may include the area information associated with the overlay, which may be planar area information, as follows:
  • object_x and object_y represent the x, y positions of the top left corner of the region in the two-dimensional coordinates of the background VR stream content.
  • object_x and object_y may be position information of a region associated with the overlay layer, and indicate the x, y position of the upper-left corner of the region in the two-dimensional coordinates of the background VR stream content (background video or background image).
  • object_width and object_height indicate the width and height of the area in two-dimensional coordinates in the background VR stream content.
  • object_width and object_height may be the width of the area associated with the overlay layer and the height of the area associated with the overlay layer, indicating the width and height of the area in the two-dimensional coordinates of the background VR stream content.
  • the user can click on the area to trigger the display or close of the overlay associated with it.
  • an area associated with an overlay layer is described in the MMT protocol, so that a user can click the area to control the display of an overlay layer associated with the area.
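  • For the spherical case, a gaze-based trigger needs a test of whether the viewing direction falls inside the associated spherical area. The sketch below is a simplification under stated assumptions: the region is described by a centre and angular ranges in plain degrees, tilt is ignored, and the field names are only modelled loosely on a sphere region structure rather than taken from the syntax referred to above.

```python
from dataclasses import dataclass


@dataclass
class SphereRegion:
    centre_azimuth: float    # degrees, assumed range [-180, 180)
    centre_elevation: float  # degrees, assumed range [-90, 90]
    azimuth_range: float     # full angular width in degrees
    elevation_range: float   # full angular height in degrees

    def contains(self, azimuth, elevation):
        """True if the viewing direction lies within the region (tilt ignored)."""
        # Wrap the azimuth difference into [-180, 180) so that regions
        # straddling the +/-180 degree seam are handled correctly.
        d_az = (azimuth - self.centre_azimuth + 180.0) % 360.0 - 180.0
        d_el = elevation - self.centre_elevation
        return (abs(d_az) <= self.azimuth_range / 2.0
                and abs(d_el) <= self.elevation_range / 2.0)
```

  • For example, a region centred at azimuth 170 with a 40 degree azimuth range still contains a gaze direction at azimuth -175, because the difference is measured across the seam.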
  • the embodiment of the present invention supports display of conditional overlays by adding a new structure definition.
  • An embodiment of the present invention provides a client.
  • the client may be the client described previously, or the client may include some elements or modules in the client described previously.
  • the client may include an acquisition module and a display module.
  • the operations performed by the modules in the client may be implemented by software; the modules may reside in the memory of the client as software modules that are called and executed by the processor.
  • the operations performed by the modules in the client may also be implemented by a hardware chip.
  • An embodiment of the present invention provides a server.
  • the server may be a server described previously.
  • the server may include some elements or modules in the server described previously.
  • the server may include a determination module and a sending module.
  • the operations performed by the modules in the server may be implemented by software; the modules may reside in the memory of the server as software modules that are called and executed by the processor.
  • the operations performed by the modules in the server can also be implemented by hardware chips.
  • FIG. 6 is a schematic diagram of a hardware structure of an apparatus (electronic device) for processing media data according to an embodiment of the present application.
  • the apparatus 600 shown in FIG. 6 can be regarded as a computer device. The apparatus 600 can serve as an implementation of the client or the server in the embodiments of the present application, or as an implementation of the method of transmitting media data in the embodiments of the present application.
  • the apparatus 600 may include a processor 610, a memory 620, an input/output interface 630, and a bus 650, and may further include a communication interface 640.
  • the processor 610, the memory 620, the input/output interface 630, and the communication interface 640 are communicatively connected to each other through the bus 650.
  • the processor 610 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and executes related programs to implement the functions required by the modules in the client or server in the embodiments of the present application, or to perform the method for transmitting media data in the method embodiments of the present application.
  • the processor 610 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by using hardware integrated logic circuits or instructions in the form of software in the processor 610.
  • the above processor 610 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • it can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in combination with the embodiments of the present application may be directly implemented by a hardware decoding processor, or may be performed by using a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium that is well established in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 620. The processor 610 reads the information in the memory 620 and, in combination with its hardware, completes the functions required by the modules included in the client or server in the embodiments of the present application, or executes the method of transmitting media data.
  • the memory 620 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
  • the memory 620 may store an operating system and other application programs.
  • when software or firmware is used to implement the functions required by the modules included in the client or server in the embodiments of this application, or to perform the method for transmitting media data in the method embodiments of this application, the program code that implements the technical solution provided in the embodiments of this application is stored in the memory 620, and the processor 610 executes the operations required by the modules included in the client or the server, or executes the method for transmitting media data provided by the method embodiments of the present application.
  • the input/output interface 630 is used to receive input data and information, and to output data such as operation results.
  • the communication interface 640 uses a transceiving device such as, but not limited to, a transceiver to implement communication between the apparatus 600 and other devices or a communication network. It can serve as an acquisition module or a sending module in the processing apparatus.
  • the bus 650 may include a path for transmitting information between the components of the apparatus 600 (for example, the processor 610, the memory 620, the input/output interface 630, and the communication interface 640).
  • although the apparatus 600 shown in FIG. 6 shows only the processor 610, the memory 620, the input/output interface 630, the communication interface 640, and the bus 650, those skilled in the art should understand that, in a specific implementation, the apparatus 600 may further include other devices necessary for normal operation, for example a display for displaying the video data to be played. Those skilled in the art should also understand that, according to specific needs, the apparatus 600 may further include hardware devices that implement other additional functions. In addition, those skilled in the art should understand that the apparatus 600 may include only the components necessary to implement the embodiments of the present application, and not necessarily all the components shown in FIG. 6.
  • the client 700 may be an implementation manner of various devices described above.
  • the client 700 may include: an obtaining module 701 and a display module 702.
  • the obtaining module 701 may be configured to obtain area information associated with the overlay layer, and the area information associated with the overlay layer is used to indicate an area associated with the overlay layer.
  • the obtaining module 701 may be the communication interface 640, the input/output interface 630, or a receiving device described above.
  • the display module 702 may be configured to display the overlay layer when a trigger operation for an area associated with the overlay layer is detected.
  • the display module 702 may be a display or a display device described above.
  • the trigger operation for the area associated with the overlay layer may include: a click operation within the area associated with the overlay layer, or a trigger operation in which the user's line of sight falls within the area associated with the overlay layer.
  • the area information associated with the overlay layer may be located in an overlay control structure.
  • the area information associated with the overlay layer may include position information of the area associated with the overlay layer.
  • the area information associated with the overlay layer may include a width of the area associated with the overlay layer and a height of the area associated with the overlay layer.
  • the area information associated with the overlay layer may be planar area information or spherical area information.
  • the obtaining module 701 may be further configured to obtain trigger type information.
  • the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information may be located in the overlay control structure.
  • the trigger type information may be located in a media presentation description MPD.
  • the trigger type information may be attribute information of an overlay descriptor in the MPD.
  • the obtaining module 701 may be further configured to obtain a condition trigger identifier; the client 700 may further include a detecting module configured to detect, when the value of the condition trigger identifier is a first preset value, whether there is a trigger operation for the area associated with the overlay layer.
  • the condition trigger identifier may be located in an overlay control structure for user interaction control.
  • the area information associated with the overlay layer may be located in a media presentation description MPD.
  • the area information associated with the overlay layer may be attribute information of an overlay descriptor in the MPD.
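  • The behaviour of the client 700 described above can be sketched as follows. This is a toy model under assumptions: the event representation, the string labels "click" and "gaze" for trigger types, and the Boolean standing in for the condition trigger identifier being equal to the first preset value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x: int
    y: int
    width: int
    height: int

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass(frozen=True)
class Event:
    kind: str  # "click" or "gaze" (labels assumed for this sketch)
    x: int
    y: int


class Client:
    """Toy client: shows the overlay once a matching trigger hits the area."""

    def __init__(self, region, trigger_type="click", condition_flag=True):
        self.region = region                  # area information (obtaining module)
        self.trigger_type = trigger_type      # trigger type information
        self.condition_flag = condition_flag  # True stands for the first preset value
        self.overlay_visible = False

    def handle(self, event):
        # Detecting module: only look for triggers when the flag enables it.
        if not self.condition_flag:
            return self.overlay_visible
        # Only the trigger type indicated by the metadata counts.
        if (event.kind == self.trigger_type
                and self.region.contains(event.x, event.y)):
            self.overlay_visible = True       # display module shows the overlay
        return self.overlay_visible
```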
  • a client 800 is provided.
  • the client 800 may be an implementation manner of various devices described above.
  • the client 800 may include: an obtaining module 801 and a display module 802.
  • the obtaining module 801 may be configured to obtain the overlay layer, a background video or background image, and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer; and to obtain an initial state identifier.
  • the obtaining module 801 may be the communication interface 640, the input/output interface 630, or a receiving device described above.
  • the display module 802 may be configured to perform the following operations when the value of the initial state identifier indicates that display of the overlay layer is turned off by default: display the background video or background image; and, when a trigger operation for the area associated with the overlay layer is detected, superimpose the overlay layer on the background video or background image and display the superimposed video image. Displaying the background video or background image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
  • the trigger operation for the area associated with the overlay layer may include a click operation within the area associated with the overlay layer, and the display module 802 may be further configured to: when the value of the initial state identifier indicates that display of the overlay layer is turned off by default, display prompt information indicating whether the overlay layer is to be displayed through a click operation in the area associated with the overlay layer.
  • the display module 802 may be the above-mentioned display or display device.
  • the area information associated with the overlay layer and the initial state identifier may be located in an overlay control structure.
  • the area information associated with the overlay layer and the initial state identifier may be located in a media presentation description MPD.
  • the area information associated with the overlay layer and the initial state identifier may be attribute information of an overlay descriptor in the MPD.
  • the obtaining module 801 may be further configured to: obtain trigger type information.
  • the trigger operation for the area associated with the overlay layer may include a trigger operation indicated by the trigger type information for the area associated with the overlay layer.
  • the trigger type information may be located in the overlay control structure.
  • the trigger type information may be located in a media presentation description MPD.
  • the trigger type information may be attribute information of an overlay descriptor in the MPD.
  • the obtaining module 801 may be further configured to obtain a condition trigger identifier.
  • the client may further include a detection module for detecting whether there is a trigger operation for a region associated with the overlay layer when the value of the condition trigger identifier is a first preset value.
  • the condition trigger identifier may be located in an overlay control structure for user interaction control.
  • each module of the client 800 in this embodiment may be specifically implemented according to the methods in the foregoing method embodiments, and the specific implementation process may refer to the related description of the foregoing method embodiments, and details are not described herein again.
  • the client 900 may be an implementation manner of various devices described above.
  • the client 900 may include: an obtaining module 901 and a display module 902.
  • the obtaining module 901 may be configured to obtain the overlay layer, a background video or background image, and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer; and to obtain an initial state identifier.
  • the obtaining module 901 may be the communication interface 640, the input/output interface 630, or a receiving device described above.
  • the display module 902 may be configured to perform the following operations when the value of the initial state identifier indicates that the overlay layer is displayed by default:
  • the overlay layer will be superimposed on the background video or background image, and the superimposed video image will be displayed; when a trigger operation is detected for the area associated with the overlay layer, the background video or background image is displayed.
  • displaying the superimposed video image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
  • the trigger operation for the area associated with the overlay layer may include: a click operation in the area associated with the overlay layer.
  • the display module 902 may be further configured to: when the value of the initial state identifier indicates that the overlay layer is displayed by default, display prompt information indicating whether display of the overlay layer is to be closed through a click operation in the area associated with the overlay layer.
  • the display module 902 may display this prompt information only when it detects that at least part of the area associated with the overlay layer is within the current field of view of the user.
  • the display module 902 may be the above-mentioned display or display device.
  • the area information associated with the overlay layer and the initial state identifier may be located in an overlay control structure.
  • the area information associated with the overlay layer and the initial state identifier may be located in a media presentation description MPD.
  • the area information associated with the overlay layer and the initial state identifier may be attribute information of an overlay descriptor in the MPD.
  • the obtaining module 901 may be further configured to obtain trigger type information; the trigger operation for the area associated with the overlay layer may include the trigger operation, indicated by the trigger type information, for the area associated with the overlay layer.
  • the trigger type information may be located in the overlay control structure.
  • the trigger type information may be located in a media presentation description MPD.
  • the trigger type information may be attribute information of an overlay descriptor in the MPD.
  • the obtaining module 901 may be further configured to obtain a condition trigger identifier; the client may further include a detection module configured to detect, when the value of the condition trigger identifier is a first preset value, whether there is a trigger operation for the area associated with the overlay layer.
  • the condition trigger identifier may be located in an overlay control structure for user interaction control.
  • each module of the client 900 in this embodiment may be specifically implemented according to the methods in the foregoing method embodiments, and the specific implementation process may refer to the related description of the foregoing method embodiments, and details are not described herein again.
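  • Clients 800 and 900 differ only in the default display state signalled by the initial state identifier. A minimal sketch of that behaviour follows; representing the identifier as a Boolean and composing frames as strings are simplifications assumed purely for illustration.

```python
class OverlayToggle:
    """Tracks overlay visibility from an initial state plus trigger operations."""

    def __init__(self, shown_by_default):
        # shown_by_default stands in for the initial state identifier.
        self.visible = shown_by_default

    def on_trigger(self):
        """A trigger in the associated area flips between display and closed display."""
        self.visible = not self.visible
        return self.visible

    def compose(self, background, overlay):
        """Return what the display module would present (strings as stand-ins)."""
        return f"{background}+{overlay}" if self.visible else background
```

  • With `shown_by_default=False` the first frame is the background alone and a trigger superimposes the overlay (client 800); with `shown_by_default=True` the order is reversed (client 900).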
  • the server 1000 may be an implementation manner of various devices described above.
  • the server 1000 may include: a determining module 1001 and a sending module 1002.
  • the determining module 1001 may be configured to determine area information associated with the overlay layer, and the area information associated with the overlay layer is used to indicate an area associated with the overlay layer.
  • the sending module 1002 may be configured to send the area information associated with the overlay layer to the client.
  • the sending module 1002 may be the transmitting device, the communication interface 640, or the input/output interface 630 described above.
  • the area information associated with the overlay layer may be located in an overlay control structure.
  • the area information associated with the overlay layer may include position information of the area associated with the overlay layer.
  • the area information associated with the overlay layer may include a width of the area associated with the overlay layer and a height of the area associated with the overlay layer.
  • the area information associated with the overlay layer may be planar area information or spherical area information.
  • the sending module 1002 may be further configured to send trigger type information to the client, where the trigger type information is used to indicate the trigger type of the trigger operation used to trigger display or closing of display of the overlay layer.
  • the trigger type information may be located in the overlay control structure.
  • the trigger type information may be located in a media presentation description MPD.
  • the trigger type information may be attribute information of an overlay descriptor in the MPD.
  • the sending module 1002 may be further configured to send a condition trigger identifier to the client. A value of the condition trigger identifier equal to a first preset value indicates that display or closing of display of the overlay layer is controlled by the trigger operation used to trigger display or closing of display of the overlay layer.
  • a value of the condition trigger identifier equal to a second preset value may indicate that display or closing of display of the overlay layer is not controlled by the trigger operation used to trigger display or closing of display of the overlay layer.
  • the condition trigger identifier may be located in an overlay control structure for user interaction control.
  • the area information associated with the overlay layer may be located in a media presentation description MPD.
  • the area information associated with the overlay layer may be attribute information of an overlay descriptor in the MPD.
  • the sending module 1002 may be further configured to send an initial state identifier to the client, where the initial state identifier is used to indicate that the overlay layer is in a display state in the initial state, or to indicate that the overlay layer is in a closed-display state in the initial state.
  • the initial state identifier may be located in an overlay control structure.
  • the initial state identifier may be located in a media presentation description MPD.
  • the initial state identifier may be attribute information of an overlay descriptor in the MPD.
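  • On the server side, the determining and sending steps amount to assembling the signalled fields and delivering them to the client. The sketch below serialises them as JSON purely for illustration; the field names, the JSON container, and the use of 1 and 0 as the first and second preset values are assumptions, since the embodiments carry this information in an overlay control structure or an MPD descriptor.

```python
import json


def build_overlay_metadata(region, shown_by_default, trigger_type, conditional):
    """Assemble the overlay signalling a server might send to a client.

    region is an (x, y, width, height) tuple in background picture coordinates.
    """
    x, y, width, height = region
    message = {
        "associated_region": {
            "object_x": x,
            "object_y": y,
            "object_width": width,
            "object_height": height,
        },
        "initial_status": 1 if shown_by_default else 0,  # assumed encoding
        "condition_type": trigger_type,                  # e.g. "click" (assumed label)
        "condition_flag": 1 if conditional else 0,       # assumed preset values
    }
    return json.dumps(message, sort_keys=True)
```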
  • modules of the foregoing devices may also be software modules, read by a processor to execute related methods, or units in a chip, which is not limited herein.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the unit is only a logical function division.
  • in actual implementation, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Library & Information Science (AREA)
  • User Interface Of Digital Computer (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

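This application relates to the field of streaming media transmission technologies, and more specifically, to a method, a client, and a server for processing media data. The method includes: obtaining an overlay layer and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer; and displaying the overlay layer when a trigger operation for the area associated with the overlay layer is detected.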

Description

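Method, client, and server for processing media data
This application claims priority to U.S. provisional patent application No. 62/737,892, filed with the United States Patent Office on September 27, 2018 and entitled "Method, terminal and server for processing media data", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of streaming media transmission technologies, and more specifically, to a method, a client, and a server for processing media data.
Background
The ISO/IEC 23090-2 standard specification, also known as the OMAF (Omnidirectional Media Format) standard specification, defines a media application format that enables the presentation of omnidirectional media in applications. Omnidirectional media mainly refers to omnidirectional video (360-degree video) and the associated audio. The OMAF specification first specifies a list of projection methods that can be used to convert spherical video into two-dimensional video. It then specifies how to store omnidirectional media and the metadata associated with that media using the ISO base media file format (ISOBMFF), and how to encapsulate and transmit omnidirectional media data in a streaming system, for example through Dynamic Adaptive Streaming over HTTP (DASH), the dynamic adaptive streaming specified in the ISO/IEC 23009-1 standard.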
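The ISO base media file format consists of a series of boxes, and a box may contain other boxes. Boxes include metadata boxes and media data boxes: the metadata box (moov box) contains metadata, and the media data box (mdat box) contains media data. The metadata box and the media data box may be in the same file or in separate files. If timed metadata is encapsulated in the ISO base media file format, the metadata box contains the metadata that describes the timed metadata, and the media data box contains the timed metadata itself.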
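To make the box structure concrete, the following Python sketch walks the box headers in a byte buffer. It relies on the general ISOBMFF header layout (a 32-bit big-endian size followed by a four-character type, with a size of 1 signalling a 64-bit size field and a size of 0 meaning the box runs to the end of its container); the function names and the tiny sample file are illustrative assumptions, and the sketch does not handle extended `uuid` types or full-box version and flags fields.

```python
import struct


def iter_boxes(data, offset=0, end=None):
    """Yield (type, payload_start, box_end) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type.
            if offset + 16 > end:
                raise ValueError("truncated largesize header")
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box size")
        yield box_type.decode("ascii"), offset + header, offset + size
        offset += size


def make_box(box_type, payload):
    """Build a box with a 32-bit size field (helper for the example)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A toy file: a moov box containing an empty trak box, then a 4-byte mdat box.
SAMPLE = make_box(b"moov", make_box(b"trak", b"")) + make_box(b"mdat", b"\x00" * 4)
```

Calling `iter_boxes(SAMPLE)` lists the top-level boxes, and calling it again with the payload range of `moov` lists the boxes nested inside it, which mirrors the "boxes within boxes" structure described above.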
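Existing solutions define the basic data structure and carriage of overlay images, but the way overlays are displayed is limited and inflexible.
Summary
This application provides a method and an apparatus for processing media data, to offer more flexibility and diversity in overlay display.
According to a first aspect, a method for processing media data is provided. The method may include: obtaining an overlay layer and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer; and displaying the overlay layer when a trigger operation for the area associated with the overlay layer is detected.
The area information associated with the overlay layer allows the user to switch between displaying the overlay layer and closing its display.
The method may be performed by a client.
The overlay layer is a video, an image, or text to be superimposed on a background video or background image for display.
The overlay layer may be an image for an omnidirectional video image.
The method may further include: obtaining a background video or background image. Displaying the overlay layer may include superimposing the overlay layer on the background video or background image and displaying the superimposed video image. The background video or background image may be an image for an omnidirectional video image.
The trigger operation for the area associated with the overlay layer may include: a trigger operation for the area, in the background video or background image, that is associated with the overlay layer.
The method may further include: displaying the background video or background image when no trigger operation for the area associated with the overlay layer is detected. It should be understood that, to exclude display of the overlay layer, displaying the background video or background image may mean displaying only the background video or background image.
The content displayed in the background video or background image may be a target object, and the content of the overlay layer may be text information about the target object.
The trigger operation on the area associated with the overlay layer is used to control display or closing of display of the overlay layer. It should be understood that closing the display means not displaying.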
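With reference to the first aspect, in a first possible implementation of the first aspect, the trigger operation for the area associated with the overlay layer may include: a click operation within the area associated with the overlay layer, or a trigger operation in which the user's line of sight falls within the area associated with the overlay layer.
The trigger operation for the area associated with the overlay layer may include a click operation within the area associated with the overlay layer. The method may further include: displaying prompt information indicating whether the overlay layer is to be displayed through a click operation within the area associated with the overlay layer.
The prompt information may be displayed only when it is detected that at least part of the area associated with the overlay layer is within the current field of view of the user.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the area information associated with the overlay layer may be located in an overlay control structure.
With reference to the first aspect, the first possible implementation of the first aspect, or the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the area information associated with the overlay layer may include position information of the area associated with the overlay layer.
The position information of the area associated with the overlay layer may include position information of the center point of the area associated with the overlay layer, or position information of the top-left corner of the area associated with the overlay layer.
With reference to the first aspect or any one of the foregoing possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the area information associated with the overlay layer may include the width and the height of the area associated with the overlay layer.
With reference to the first aspect or any one of the foregoing possible implementations of the first aspect, in a fifth possible implementation of the first aspect, the area information associated with the overlay layer is planar area information or spherical area information.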
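With reference to the first aspect or any one of the foregoing possible implementations of the first aspect, in a sixth possible implementation of the first aspect, the method may further include: obtaining trigger type information. The trigger operation for the area associated with the overlay layer may include the trigger operation, indicated by the trigger type information, for the area associated with the overlay layer.
The trigger type information allows the user to use different trigger operations to trigger display or closing of display of the overlay layer. It should be understood that the trigger type information is used to indicate the trigger type of the trigger operation used to trigger display or closing of display of the overlay layer.
With reference to the sixth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the trigger type information is located in an overlay control structure.
With reference to the sixth possible implementation of the first aspect, in an eighth possible implementation of the first aspect, the trigger type information is located in a media presentation description (MPD).
With reference to the eighth possible implementation of the first aspect, in a ninth possible implementation of the first aspect, the trigger type information may be attribute information of an overlay descriptor in the MPD.
With reference to the first aspect or any one of the foregoing possible implementations of the first aspect, in a tenth possible implementation of the first aspect, the method may further include: obtaining a condition trigger identifier, and when the value of the condition trigger identifier is a first preset value, detecting whether there is a trigger operation for the area associated with the overlay layer.
It should be understood that a value of the condition trigger identifier equal to the first preset value may indicate that display or closing of display of the overlay layer is controlled by the trigger operation used to trigger display or closing of display of the overlay layer. A value of the condition trigger identifier equal to a second preset value indicates that display or closing of display of the overlay layer is not controlled by the trigger operation used to trigger display or closing of display of the overlay layer.
A value of the condition trigger identifier equal to the second preset value may indicate that display or closing of display of the overlay layer is not controlled by a trigger operation.
The condition trigger identifier further increases the diversity of interaction modes for overlay display.
With reference to the tenth possible implementation of the first aspect, in an eleventh possible implementation of the first aspect, the condition trigger identifier may be located in an overlay control structure used for user interaction control.
With reference to the first aspect or any one of the foregoing possible implementations of the first aspect, in a twelfth possible implementation of the first aspect, the area information associated with the overlay layer is located in a media presentation description (MPD).
With reference to the twelfth possible implementation of the first aspect, in a thirteenth possible implementation of the first aspect, the area information associated with the overlay layer is attribute information of an overlay descriptor in the MPD.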
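According to a second aspect, a method for processing media data is provided. The method may include: obtaining an overlay layer, a background video or background image, and area information associated with the overlay layer, where the area information associated with the overlay layer is used to indicate an area associated with the overlay layer; obtaining an initial state identifier; and, when the value of the initial state identifier indicates that display of the overlay layer is turned off by default, performing the following operations:
displaying the background video or background image; and, when a trigger operation for the area associated with the overlay layer is detected, superimposing the overlay layer on the background video or background image and displaying the superimposed video image. Displaying the background video or background image may be performed only when no trigger operation for the area associated with the overlay layer is detected.
The initial state identifier further increases the diversity of overlay display modes.
The method may be performed by a client.
The overlay layer is a video, an image, or text to be superimposed on a background video or background image for display.
The overlay layer may be an image for an omnidirectional video image. The background video or background image may be an image for an omnidirectional video image.
The content displayed in the background video or background image may be a target object, and the content of the overlay layer may be text information about the target object.
The trigger operation for the area associated with the overlay layer may include: a click operation within the area associated with the overlay layer, or a trigger operation in which the user's line of sight falls within the area associated with the overlay layer.
Alternatively, when the value of the initial state identifier indicates that the overlay layer is displayed by default, the method may perform the following operations: superimposing the overlay layer on the background video or background image and displaying the superimposed video image; and, when a trigger operation for the area associated with the overlay layer is detected, displaying the background video or background image.
It should be understood that turning off display of the overlay layer by default can be understood as the overlay layer being in a closed-display state in the initial state. Depending on its value, the initial state identifier indicates either that the overlay layer is in a display state in the initial state or that the overlay layer is in a closed-display state in the initial state.
The trigger operation on the area associated with the overlay layer is used to control display or closing of display of the overlay layer. It should be understood that closing the display means not displaying.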
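With reference to the second aspect, in a first possible implementation of the second aspect, the trigger operation for the area associated with the overlay layer may include a click operation within the area associated with the overlay layer, and the method may further include: when the value of the initial state identifier indicates that display of the overlay layer is turned off by default, displaying prompt information indicating whether the overlay layer is to be displayed through a click operation within the area associated with the overlay layer.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the prompt information is displayed only when it is detected that at least part of the area associated with the overlay layer is within the current field of view of the user.
With reference to the second aspect or any one of the foregoing possible implementations of the second aspect, in a third possible implementation of the second aspect, the area information associated with the overlay layer and the initial state identifier are located in an overlay control structure.
With reference to the second aspect or any one of the foregoing possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the area information associated with the overlay layer and the initial state identifier are located in a media presentation description (MPD).
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the area information associated with the overlay layer and the initial state identifier are attribute information of an overlay descriptor in the MPD.
With reference to the second aspect or any one of the foregoing possible implementations of the second aspect, in a sixth possible implementation of the second aspect, the method may further include: obtaining trigger type information. The trigger operation for the area associated with the overlay layer may include the trigger operation, indicated by the trigger type information, for the area associated with the overlay layer.
It should be understood that the trigger type information is used to indicate the trigger type of the trigger operation used to trigger display or closing of display of the overlay layer.
With reference to the sixth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the trigger type information is located in an overlay control structure.
With reference to the sixth possible implementation of the second aspect, in an eighth possible implementation of the second aspect, the trigger type information is located in a media presentation description (MPD).
With reference to the eighth possible implementation of the second aspect, in a ninth possible implementation of the second aspect, the trigger type information is attribute information of an overlay descriptor in the MPD.
With reference to the second aspect or any one of the foregoing possible implementations of the second aspect, in a tenth possible implementation of the second aspect, the method may further include: obtaining a condition trigger identifier, and when the value of the condition trigger identifier is a first preset value, detecting whether there is a trigger operation for the area associated with the overlay layer.
It should be understood that a value of the condition trigger identifier equal to the first preset value indicates that display or closing of display of the overlay layer is controlled by the trigger operation used to trigger display or closing of display of the overlay layer. A value of the condition trigger identifier equal to a second preset value indicates that display or closing of display of the overlay layer is not controlled by the trigger operation used to trigger display or closing of display of the overlay layer.
A value of the condition trigger identifier equal to the second preset value may indicate that display or closing of display of the overlay layer is not controlled by a trigger operation.
With reference to the tenth possible implementation of the second aspect, in an eleventh possible implementation of the second aspect, the condition trigger identifier is located in an overlay control structure used for user interaction control.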
第三方面,提供了一种处理媒体数据的方法,该方法可以包括:获取覆盖层,背景视频或背景图像和所述覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域;获取初始状态标识;在所述初始状态标识的值指示默认显示所述覆盖层的情况下,执行如下操作:
将在所述背景视频或背景图像上叠加所述覆盖层,并显示叠加后的视频图像;在检测到针对所述覆盖层关联的区域的触发操作时,显示所述背景视频或背景图像。其中,可以是在未检测到针对所述覆盖层关联的区域的触发操作时,才执行所述显示叠加后的视频图像。
其中,该方法可以由客户端执行。
其中,所述覆盖层为用于叠加在背景视频或背景图像上进行显示的视频、图像或者文本。
其中,覆盖层可以为用于全景视频图像的图像。背景视频或背景图像可以为用于全景视频图像的图像。
其中,所述背景视频或背景图像显示的内容可以为目标事物,所述覆盖层的内容可以为所述目标事物的文字信息。
其中,所述针对所述覆盖层关联的区域的触发操作可以包括:在所述覆盖层关联的区域内的点击操作,或者,用户视线位于所述覆盖层关联的区域的触发操作。
应理解,默认显示所述覆盖层可以理解为所述覆盖层在初始状态下为显示状态,所述初始状态标识基于值的不同,用于指示所述覆盖层在初始状态下为显示状态,或者用于指示所述覆盖层在初始状态下为关闭显示状态。
其中,所述覆盖层关联的区域上的触发操作用于控制所述覆盖层的显示或者关闭显示。应理解,关闭显示指的是不显示。
结合第三方面,在第三方面的第一种可能的实现方式中,所述针对所述覆盖层关联的区域的触发操作可以包括:在所述覆盖层关联的区域内的点击操作,所述方法还可以包括:在所述初始状态标识的值指示默认显示所述覆盖层的情况下,显示是否通过在所述覆盖层关联的区域内的点击操作来关闭所述覆盖层的显示的提示信息。
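With reference to the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the prompt information asking whether to hide the overlay through a click operation in the region associated with the overlay is displayed only when it is detected that at least part of the region associated with the overlay lies within the user's current viewing range.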
结合第三方面或第三方面的以上任一种可能的实现方式,在第三方面的第三种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识位于覆盖层控制结构(overlay control structure)中。
结合第三方面或第三方面的以上任一种可能的实现方式,在第三方面的第四种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识位于媒体呈现描述MPD中。
结合第三方面的第四种可能的实现方式,在第三方面的第五种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第三方面或第三方面的以上任一种可能的实现方式,在第三方面的第六种可能的实现方式中,所述方法还可以包括:获取触发类型信息;所述针对所述覆盖层关联的区域的触发操作可以包括针对所述覆盖层关联的区域的所述触发类型信息指示的触发操作。
应理解,所述触发类型信息用于指示用于触发所述覆盖层显示或者关闭显示的触发操作的触发类型。
结合第三方面的第六种可能的实现方式,在第三方面的第七种可能的实现方式中,所述触发类型信息位于覆盖层控制结构中。
结合第三方面的第六种可能的实现方式,在第三方面的第八种可能的实现方式中,所述触发类型信息位于媒体呈现描述MPD中。
结合第三方面的第八种可能的实现方式,在第三方面的第九种可能的实现方式中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第三方面或第三方面的以上任一种可能的实现方式,在第三方面的第十种可能的实现方式中,所述方法还可以包括:获取条件触发标识,在所述条件触发标识的值为第一预设值时,检测是否有针对所述覆盖层关联的区域的触发操作。
应理解,所述条件触发标识的值为第一预设值用于指示所述覆盖层的显示或者关闭显示受用于触发所述覆盖层显示或者关闭显示的触发操作控制。所述条件触发标识的值为第二预设值用于指示所述覆盖层的显示或者关闭显示不受用于触发所述覆盖层显示或者关闭显示的触发操作控制。
结合第三方面的第十种可能的实现方式,在第三方面的第十一种可能的实现方式中,所述条件触发标识位于用于用户交互控制的覆盖层控制结构中。
第四方面,提供了一种处理媒体数据的方法,该方法可以包括:确定覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域;向客户端发送所述覆盖层关联的区域信息。
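The region information associated with the overlay may be determined by detecting the position information of a target object (for example, a person), or by detecting region information entered by a user.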
其中,该方法可以由服务器执行。
其中,所述覆盖层关联的区域上的触发操作用于控制所述覆盖层的显示或者关闭显示。其中,所述触发操作可以包括:在所述覆盖层关联的区域内的点击操作,或者,用户视线位于所述覆盖层关联的区域的触发操作。
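The overlay is a video, image, or text that is superimposed for display on the background video or background image (possibly on at least part of its area).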
其中,该方法还可以包括获取覆盖层,编码覆盖层得到覆盖层的码流数据,向客户端发送覆盖层的码流数据。其中,覆盖层可以为用于全景视频图像的图像。
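The method may further include: obtaining the background video or background image, encoding the background video or background image to obtain its bitstream data, and sending that bitstream data to the client. The background video or background image may be an image for a panoramic video image.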
其中,所述背景视频或背景图像显示的内容可以为目标事物,所述覆盖层的内容可以为所述目标事物的文字信息。
结合第四方面,在第四方面的第一种可能的实现方式中,所述覆盖层关联的区域信息位于覆盖层控制结构(overlay control structure)中。
结合第四方面或第四方面的第一种可能的实现方式,在第四方面的第二种可能的实现方式中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的位置信息。
其中,所述覆盖层关联的区域的位置信息可以包括所述覆盖层关联的区域中心点的位置信息,或者所述覆盖层关联的区域的左上角点的位置信息。
结合第四方面或第四方面的以上任一种可能的实现方式,在第四方面的第三种可能的实现方式中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高。
结合第四方面或第四方面的以上任一种可能的实现方式,在第四方面的第四种可能的实现方式中,所述覆盖层关联的区域信息为平面区域信息或者球面区域信息。
结合第四方面或第四方面的以上任一种可能的实现方式,在第四方面的第五种可能的实现方式中,所述方法还可以包括:向所述客户端发送触发类型信息,所述触发类型信息用于指示用于触发所述覆盖层显示或者关闭显示的触发操作的触发类型。
结合第四方面第五种可能的实现方式,在第四方面的第六种可能的实现方式中,所述触发类型信息位于覆盖层控制结构中。
结合第四方面第五种可能的实现方式,在第四方面的第七种可能的实现方式中,所述触发类型信息位于媒体呈现描述MPD中。
结合第四方面第七种可能的实现方式,在第四方面的第八种可能的实现方式中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第四方面或第四方面的以上任一种可能的实现方式,在第四方面的第九种可能的实现方式中,所述方法还可以包括:向所述客户端发送条件触发标识,所述条件触发标识的值为第一预设值用于指示所述覆盖层的显示或者关闭显示受用于触发所述覆盖层显示或者关闭显示的触发操作控制。
其中,所述条件触发标识的值为第二预设值用于指示所述覆盖层的显示或者关闭显示不受用于触发所述覆盖层显示或者关闭显示的触发操作控制。
结合第四方面的第九种可能的实现方式,在第四方面的第十种可能的实现方式中,所述条件触发标识位于用于用户交互控制的覆盖层控制结构中。
结合第四方面或第四方面的以上任一种可能的实现方式,在第四方面的第十一种可能的实现方式中,所述覆盖层关联的区域信息位于媒体呈现描述MPD中。
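With reference to the eleventh possible implementation of the fourth aspect, in a twelfth possible implementation of the fourth aspect, the region information associated with the overlay is attribute information of the overlay descriptor in the MPD.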
结合第四方面或第四方面的以上任一种可能的实现方式,在第四方面的第十三种可能的实现方式中,所述方法还可以包括:向所述客户端发送初始状态标识,所述初始状态标识用于指示所述覆盖层在初始状态下为显示状态,或者用于指示所述覆盖层在初始状态下为关闭显示状态。
结合第四方面的第十三种可能的实现方式,在第四方面的第十四种可能的实现方式中,所述初始状态标识位于覆盖层控制结构(overlay control structure)中。
结合第四方面的第十三种可能的实现方式,在第四方面的第十五种可能的实现方式中,所述初始状态标识位于媒体呈现描述MPD中。
结合第四方面的第十五种可能的实现方式,在第四方面的第十六种可能的实现方式中,所述初始状态标识为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
第五方面,提供了一种客户端,该客户端可以包括:获取模块,用于获取覆盖层和所述覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域;显示模块,用于在检测到针对所述覆盖层关联的区域的触发操作时,显示所述覆盖层。
结合第五方面,在第五方面第一种可能的实现方式中,所述针对所述覆盖层关联的区域的触发操作可以包括:在所述覆盖层关联的区域内的点击操作,或者,用户视线位于所述覆盖层关联的区域的触发操作。
结合第五方面或第五方面第一种可能的实现方式,在第五方面第二种可能的实现方式中,所述覆盖层关联的区域信息可以位于覆盖层控制结构(overlay control structure)中。
结合第五方面或第五方面第一种可能的实现方式或第五方面第二种可能的实现方式,在第五方面的第三种可能的实现方式中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的位置信息。
结合第五方面,或第五方面以上任一种可能的实现方式,在第五方面第四种可能的实现方式中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高。
结合第五方面,或第五方面以上任一种可能的实现方式,在第五方面第五种可能的实现方式中,所述覆盖层关联的区域信息为平面区域信息或者球面区域信息。
结合第五方面,或第五方面以上任一种可能的实现方式,在第五方面第六种可能的实现方式中,所述获取模块还用于:获取触发类型信息。其中,所述针对所述覆盖层关联的区域的触发操作可以包括针对所述覆盖层关联的区域的所述触发类型信息指示的触发操作。
结合第五方面第六种可能的实现方式,在第五方面第七种可能的实现方式中,所述触发类型信息位于覆盖层控制结构中。
结合第五方面第六种可能的实现方式,在第五方面第八种可能的实现方式中,所述触发类型信息位于媒体呈现描述MPD中。
结合第五方面第八种可能的实现方式,在第五方面第九种可能的实现方式中,所述触发类型信息可以为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
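With reference to the fifth aspect or any of the foregoing possible implementations of the fifth aspect, in a tenth possible implementation of the fifth aspect, the obtaining module is further configured to obtain a conditional trigger flag; the client may further include a detection module configured to detect, when the value of the conditional trigger flag is the first preset value, whether there is a trigger operation on the region associated with the overlay.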
结合第五方面第十种可能的实现方式,在第五方面第十一种可能的实现方式中,所述条件触发标识可以位于用于用户交互控制的覆盖层控制结构中。
结合第五方面,或第五方面以上任一种可能的实现方式,在第五方面第十二种可能的实现方式中,所述覆盖层关联的区域信息位于媒体呈现描述MPD中。
结合第五方面的第十二种可能的实现方式,在第五方面第十三种可能的实现方式中,所述覆盖层关联的区域信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
第六方面,提供了一种客户端,该客户端可以包括:获取模块,用于获取覆盖层,背景视频或背景图像和所述覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域;获取初始状态标识;显示模块,用于在所述初始状态标识的值指示默认关闭所述覆盖层的显示的情况下,执行如下操作:
显示所述背景视频或背景图像;在检测到针对所述覆盖层关联的区域的触发操作时,将在所述背景视频或背景图像上叠加所述覆盖层,并显示叠加后的视频图像。其中,可以在未检测到针对所述覆盖层关联的区域的触发操作时,才执行所述显示所述背景视频或背景图像。
结合第六方面,在第六方面的第一种可能的实现方式中,所述针对所述覆盖层关联的区域的触发操作可以包括:在所述覆盖层关联的区域内的点击操作,所述显示模块还用于:在所述初始状态标识的值指示默认关闭所述覆盖层的显示的情况下,显示是否通过在所述覆盖层关联的区域内的点击操作来显示所述覆盖层的提示信息。
结合第六方面第一种可能的实现方式,在第六方面的第二种可能的实现方式中,在检测到所述覆盖层关联的区域中至少部分区域位于当前用户视角范围内时,才执行所述显示是否通过在所述覆盖层关联的区域内的点击操作来显示所述覆盖层的提示信息。
结合第六方面或第六方面的以上任一种可能的实现方式,在第六方面的第三种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识位于覆盖层控制结构(overlay control structure)中。
结合第六方面或第六方面的以上任一种可能的实现方式,在第六方面的第四种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识位于媒体呈现描述MPD中。
结合第六方面的第四种可能的实现方式,在第六方面的第五种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第六方面或第六方面的以上任一种可能的实现方式,在第六方面的第六种可能的实现方式中,所述获取模块还用于:获取触发类型信息。所述针对所述覆盖层关联的区域的触发操作可以包括针对所述覆盖层关联的区域的所述触发类型信息指示的触发操作。
结合第六方面的第六种可能的实现方式,在第六方面的第七种可能的实现方式中,所述触发类型信息位于覆盖层控制结构中。
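With reference to the sixth possible implementation of the sixth aspect, in an eighth possible implementation of the sixth aspect, the trigger type information is located in the media presentation description (MPD).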
结合第六方面的第八种可能的实现方式,在第六方面的第九种可能的实现方式中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第六方面或第六方面的以上任一种可能的实现方式,在第六方面的第十种可能的实现方式中,所述获取模块还用于:获取条件触发标识。所述客户端还可以包括检测模块,用于在所述条件触发标识的值为第一预设值时,检测是否有针对所述覆盖层关联的区域的触发操作。
结合第六方面的第十种可能的实现方式,在第六方面的第十一种可能的实现方式中,所述条件触发标识位于用于用户交互控制的覆盖层控制结构中。
第七方面,提供了一种客户端,该客户端可以包括:获取模块,用于获取覆盖层,背景视频或背景图像和所述覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域;获取初始状态标识;显示模块,用于在所述初始状态标识的值指示默认显示所述覆盖层的情况下,执行如下操作:
将在所述背景视频或背景图像上叠加所述覆盖层,并显示叠加后的视频图像;在检测到针对所述覆盖层关联的区域的触发操作时,显示所述背景视频或背景图像。其中,可以是在未检测到针对所述覆盖层关联的区域的触发操作时,才执行所述显示叠加后的视频图像。
结合第七方面,在第七方面的第一种可能的实现方式中,所述针对所述覆盖层关联的区域的触发操作可以包括:在所述覆盖层关联的区域内的点击操作,所述显示模块还用于:在所述初始状态标识的值指示默认显示所述覆盖层的情况下,显示是否通过在所述覆盖层关联的区域内的点击操作来关闭所述覆盖层的显示的提示信息。
结合第七方面的第一种可能的实现方式,在第七方面的第二种可能的实现方式中,显示模块,在检测到所述覆盖层关联的区域中至少部分区域位于当前用户视角范围内时,才执行所述显示是否通过在所述覆盖层关联的区域内的点击操作来关闭所述覆盖层的显示的提示信息。
结合第七方面或第七方面的以上任一种可能的实现方式,在第七方面的第三种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识位于覆盖层控制结构(overlay control structure)中。
结合第七方面或第七方面的以上任一种可能的实现方式,在第七方面的第四种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识位于媒体呈现描述MPD中。
结合第七方面的第四种可能的实现方式,在第七方面的第五种可能的实现方式中,所述覆盖层关联的区域信息和所述初始状态标识为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第七方面或第七方面的以上任一种可能的实现方式,在第七方面的第六种可能的实现方式中,所述获取模块还用于:获取触发类型信息;所述针对所述覆盖层关联的区域的触发操作可以包括针对所述覆盖层关联的区域的所述触发类型信息指示的触发操作。
结合第七方面的第六种可能的实现方式,在第七方面的第七种可能的实现方式中,所述触发类型信息位于覆盖层控制结构中。
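With reference to the sixth possible implementation of the seventh aspect, in an eighth possible implementation of the seventh aspect, the trigger type information is located in the media presentation description (MPD).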
结合第七方面的第八种可能的实现方式,在第七方面的第九种可能的实现方式中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第七方面或第七方面的以上任一种可能的实现方式,在第七方面的第十种可能的实现方式中,所述获取模块还用于:获取条件触发标识;所述客户端还可以包括检测模块,用于在所述条件触发标识的值为第一预设值时,检测是否有针对所述覆盖层关联的区域的触发操作。
结合第七方面的第十种可能的实现方式,在第七方面的第十一种可能的实现方式中,所述条件触发标识位于用于用户交互控制的覆盖层控制结构中。
第八方面,提供了一种服务器,该服务器可以包括:确定模块,用于确定覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域;发送模块,用于向客户端发送所述覆盖层关联的区域信息。
结合第八方面,在第八方面的第一种可能的实现方式中,所述覆盖层关联的区域信息位于覆盖层控制结构(overlay control structure)中。
结合第八方面或第八方面的第一种可能的实现方式,在第八方面的第二种可能的实现方式中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的位置信息。
结合第八方面或第八方面的以上任一种可能的实现方式,在第八方面的第三种可能的实现方式中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高。
结合第八方面或第八方面的以上任一种可能的实现方式,在第八方面的第四种可能的实现方式中,所述覆盖层关联的区域信息为平面区域信息或者球面区域信息。
结合第八方面或第八方面的以上任一种可能的实现方式,在第八方面的第五种可能的实现方式中,所述发送模块还用于:向所述客户端发送触发类型信息,所述触发类型信息用于指示用于触发所述覆盖层显示或者关闭显示的触发操作的触发类型。
结合第八方面第五种可能的实现方式,在第八方面的第六种可能的实现方式中,所述触发类型信息位于覆盖层控制结构中。
结合第八方面第五种可能的实现方式,在第八方面的第七种可能的实现方式中,所述触发类型信息位于媒体呈现描述MPD中。
结合第八方面第七种可能的实现方式,在第八方面的第八种可能的实现方式中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第八方面或第八方面的以上任一种可能的实现方式,在第八方面的第九种可能的实现方式中,所述发送模块还用于:向所述客户端发送条件触发标识,所述条件触发标识的值为第一预设值用于指示所述覆盖层的显示或者关闭显示受用于触发所述覆盖层显示或者关闭显示的触发操作控制。
其中,所述条件触发标识的值为第二预设值用于指示所述覆盖层的显示或者关闭显示不受用于触发所述覆盖层显示或者关闭显示的触发操作控制。
结合第八方面的第九种可能的实现方式,在第八方面的第十种可能的实现方式中,所述条件触发标识位于用于用户交互控制的覆盖层控制结构中。
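With reference to the eighth aspect or any of the foregoing possible implementations of the eighth aspect, in an eleventh possible implementation of the eighth aspect, the region information associated with the overlay is located in the media presentation description (MPD).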
结合第八方面的第十一种可能的实现方式,在第八方面的第十二种可能的实现方式中,所述覆盖层关联的区域信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
结合第八方面或第八方面的以上任一种可能的实现方式,在第八方面的第十三种可能的实现方式中,所述发送模块还用于:向所述客户端发送初始状态标识,所述初始状态标识用于指示所述覆盖层在初始状态下为显示状态,或者用于指示所述覆盖层在初始状态下为关闭显示状态。
结合第八方面的第十三种可能的实现方式,在第八方面的第十四种可能的实现方式中,所述初始状态标识位于覆盖层控制结构(overlay control structure)中。
结合第八方面的第十三种可能的实现方式,在第八方面的第十五种可能的实现方式中,所述初始状态标识位于媒体呈现描述MPD中。
结合第八方面的第十五种可能的实现方式,在第八方面的第十六种可能的实现方式中,所述初始状态标识为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。
第九方面,提供一种客户端,可以包括:相互耦合的非易失性存储器和处理器;其中,所述处理器用于调用存储在所述存储器中的程序代码以执行第一方面或第二方面或第三方面中的任意一种实现方式中的方法的部分或全部步骤。
第十方面,提供一种服务器,可以包括:相互耦合的非易失性存储器和处理器;其中,所述处理器用于调用存储在所述存储器中的程序代码以执行第四方面中的任意一种实现方式中的方法的部分或全部步骤。
第十一方面,提供一种计算机可读存储介质,所述计算机可读存储介质存储了程序代码,其中,所述程序代码可以包括用于执行第一方面、第二方面、第三方面以及第四方面中的任意一种实现方式中的方法的部分或全部步骤的指令。
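According to a twelfth aspect, a computer program product is provided. When the computer program product runs on a computer, the computer is caused to perform some or all of the steps of the method in any implementation of the first, second, third, or fourth aspect.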
附图说明
图1为本发明实施例提供的一种媒体数据处理系统的架构示意图;
图2是本发明实施例提供的一种实施场景示意图;
图3是本申请实施例提供的一种处理媒体数据的方法的示意性流程图;
图4是本申请实施例提供的一种处理媒体数据的方法的示意性流程图;
图5是本申请实施例提供的一种处理媒体数据的方法的示意性流程图;
图6是本发明实施例提供的一种电子装置的结构示意图;
图7为本发明实施例提供的一种客户端的示意性框图;
图8为本发明实施例提供的一种客户端的示意性框图;
图9为本发明实施例提供的一种客户端的示意性框图;
图10为本发明实施例提供的一种服务器的示意性框图。
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
为了更好地理解本申请实施例的处理媒体数据的方法,下面先对媒体数据相关的一些基本概念进行简要的介绍。
全景视频:又称360度全景视频,或者全向视频,由一系列的全景图片组成,全景图片内容覆盖三维空间中整个球体表面。随着虚拟现实(Virtual reality,VR)技术的快速发展,全景视频得到了越来越广泛的应用,基于360度全景视频的VR技术可以创建一种模拟环境,为用户带来交互式的三维动态视觉体验。全景视频由一系列全景图像组成,这些全景图像可以由计算机渲染产生,也可以通过拼接算法将多个相机分别从多个不同角度拍摄的视频图像拼接而成。一般来说,在观看全景视频时,用户在每个时刻观看到的图像内容仅占整个全景图像的一小部分,为了节省传输带宽,在通过远端服务器为用户提供全景图像时,可以只为用户传输每个时刻观看到的内容。
轨迹(track):是指一系列有时间属性的按照ISO基本媒体文件格式(ISO base media file format,ISOBMFF)的封装方式的样本。比如视频track,视频样本是通过将视频编码器编码每一帧后产生的码流按照ISOBMFF的规范封装后得到的。
A track is defined in the standard ISO/IEC 14496-12 as: "timed sequence of related samples (q.v.) in an ISO base media file", that is, a time-ordered sequence of related samples in an ISO base media file.
对于媒体数据来说,一个track就是个图像或者音频样本序列;对于提示轨迹,一个轨迹对应一个流频道(For media data,a track corresponds to a sequence of images or sampled audio;for hint tracks,a track corresponds to a streaming channel)。
样本(Sample):与时间戳相关联的数据。在ISO/IEC 14496-12中有如下定义和解释:“all the data associated with a single timestamp”
在一个轨迹中,不存在对应同一个时间戳的两个样本(No two samples within a track can share the same time-stamp)。在非提示轨迹中,一个样本可以是一个视频帧或者在解码顺序下的一系列视频帧,或者压缩后的一个音频帧;在提示轨迹中,一个样本定义了一个或多个流数据包的格式(In non-hint tracks,a sample is,for example,an individual frame of video,a series of video frames in decoding order,or a compressed section of audio in decoding order;in hint tracks,a sample defines the formation of one or more streaming packets)。
Sample entry: a sample entry describes the format of a sample; the type of the sample entry determines how the sample is decoded.
MMT:MPEG Media Transport,定义了基于包传输网络的多媒体服务的封装格式,传输协议和消息发送机制。
盒子(box):ISOBMFF文件是由多个盒子(box)构成,其中,一个box可以包括其它的box。轨迹中可以包括元数据box(moov box)和/或媒体数据box(mdat box)。
box在ISO/IEC 14496-12标准中的定义为:“object-oriented building block defined by a unique type identifier and length”,该定义的中文翻译为“面向对象的构建块,由唯一的类型标识符和长度定义”。
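As an illustration of "a building block defined by a type identifier and a length", the following is a minimal sketch of walking the boxes in a byte buffer. It relies only on the basic ISOBMFF box header (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the container"). The helper names `iter_boxes` and `make_box` are illustrative and not taken from the specification.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, offset: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in data[offset:end]."""
    if end is None:
        end = len(data)
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if offset + 16 > end:
                raise ValueError("truncated largesize header")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box")
        yield box_type.decode("ascii", "replace"), offset + header, offset + size
        offset += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload
```

Because a box can contain other boxes, the same function can be called again on a payload range (for example on the payload of a `moov` box) to descend one level.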
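In addition, a box is called an "atom" in some specifications, including the first definition of MP4 (Called "atom" in some specifications, including the first definition of MP4).
Supplementary enhancement information (SEI): a type of network abstraction layer unit (Network Abstraction Layer Unit, NALU) defined in the video coding standards H.264 and H.265 published by the International Telecommunication Union (ITU).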
有时间属性的元数据轨迹(Timed metadata track):与时间顺序相关的信息元数据流。
覆盖层(Overlay):覆盖层,在背景视频或背景图像的某个区域之上额外叠加渲染的一层视频或者图片或者文本(可以具有时间属性)。(piece of visual media rendered over omnidirectional video or image item or over a viewport)
媒体呈现描述(Media presentation description,MPD):是标准ISO/IEC 23009-1中规定的一种文档,在该文档中包括了客户端构造HTTP-URL的元数据。在MPD中包括一个或者多个周期(period)元素,每个period元素可以包括有一个或者多个自适应集(adaptationset),每个adaptationset中可以包括一个或者多个表示(representation),每个representation中可以包括一个或者多个分段,客户端根据MPD中的信息,选择表示,并构建分段的http-URL,用于请求相应的分段。
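The Period / AdaptationSet / Representation hierarchy described above can be sketched as follows. This is a minimal example, assuming a tiny hand-written MPD; the representation ids, bandwidth values, and the function name `pick_representation` are hypothetical, and only the DASH namespace `urn:mpeg:dash:schema:mpd:2011` comes from the DASH schema itself. Segment URL construction is deliberately left out.

```python
import xml.etree.ElementTree as ET
from typing import Optional

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical MPD fragment used only for illustration.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period id="p0">
    <AdaptationSet id="1">
      <Representation id="bg_hi" bandwidth="8000000"/>
      <Representation id="bg_lo" bandwidth="2000000"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def pick_representation(mpd_xml: str, max_bandwidth: int) -> Optional[str]:
    """Return the id of the highest-bandwidth representation that fits the budget."""
    root = ET.fromstring(mpd_xml)
    best_id, best_bw = None, -1
    path = "mpd:Period/mpd:AdaptationSet/mpd:Representation"
    for rep in root.iterfind(path, NS):
        bw = int(rep.get("bandwidth", "0"))
        if best_bw < bw <= max_bandwidth:
            best_id, best_bw = rep.get("id"), bw
    return best_id
```

A real client would go on to build the segment URL for the chosen representation from the MPD's addressing information and request it over HTTP.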
为了描述媒体数据的有关时间的属性信息,OMAF标准中规定了在球面上的区域(region)的有时间属性的元数据轨迹(timed metadata track)。该元数据轨迹中的元数据的box中包含的是描述球面的元数据,在元数据的box中描述了有时间属性的元数据轨迹的意图,也就是球面区域是用来做什么的,在OMAF标准中描述了两种有时间属性的元数据轨迹:推荐视角元数据轨迹(the recommended viewport timed metadata track)和初始视点轨迹(the initial viewpoint timed metadata track)。其中,推荐视角轨迹描述了推荐给客户端呈现的视角的区域,初始视点轨迹描述了全景视频观看时的初始呈现方向。
现有的OMAF标准中规定的球面区域样本入口(Sample Entry)的格式如下:
Figure PCTCN2018125807-appb-000001
Figure PCTCN2018125807-appb-000002
上述球面区域样本入口中各个字段的语义如下:
Shape_type:用来描述球面区域形状类型;
Reserved:保留字段;
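dynamic_range_flag: a value of 0 indicates that the horizontal and vertical ranges of the region do not change; a value of 1 indicates that the horizontal and vertical ranges of the region are described in the samples;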
static_azimuth_range:区域的方位角覆盖范围;
static_elevation_range:区域的仰角覆盖范围;
num_regions:元数据轨迹中的区域个数。
OMAF中定义了两种球面区域形状类型,一种为四个大圆(Azimuth Circle)合成形成的形状,其shape_type值为0;另一种为两个大圆和两个小圆(Elevation Circle)合成形成的形状,其shape_type值为1。
现有的OMAF标准中规定的球面区域样本(Sample)格式定义如下:
Figure PCTCN2018125807-appb-000003
现有的OMAF标准中定义球面上的区域(region)的表示方法,其具体语法定义如下:
Figure PCTCN2018125807-appb-000004
上述球面区域样本中的各个字段的语义如下:
center_azimuth、center_elevation:表示球面区域的中心点位置;
Center_tilt:表示区域的倾斜角度;
azimuth_range:区域的方位角覆盖范围;
Elevation_range:区域的仰角覆盖范围。
在现有的OMAF标准中定义了覆盖层(overlay)的基本数据结构和携带方式。
其中用于表示覆盖层(overlay)的数据结构格式定义如下:
Figure PCTCN2018125807-appb-000005
表格1:overlay数据结构定义
其各字段语义如下:
OverlayStruct()中包括了覆盖层有关的信息,OverlayStruct()可以位于媒体数据盒子,或者位于元数据盒子(box)中。
SingleOverlayStruct()定义了一个覆盖层。
num_overlays定义该结构体中描述的覆盖层(overlay)的个数。num_overlays的值为0保留。
num_flag_bytes定义overlay_control_flag[i]元素总共占有多少个字节。num_flag_bytes的值为0保留.
overlay_id表示覆盖层(overlay)的唯一的标识。两个不同的覆盖层(overlay)不能拥有相同的overlay_id值。
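overlay_control_flag[i]: a value of 1 indicates that the structure defined by the i-th overlay_control_struct[i] is present. An OMAF player should support all possible values of overlay_control_flag[i] for all values of i.
overlay_control_essential_flag[i]: a value of 0 indicates that the OMAF player is not required to process the structure defined by the i-th overlay_control_struct[i].
overlay_control_essential_flag[i]: a value of 1 indicates that the OMAF player is required to process the structure defined by the i-th overlay_control_struct[i]. If the value is 1 and the OMAF player is not capable of processing that structure, the OMAF player should display neither the overlay nor the background video stream.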
byte_count[i]表示第i个overlay_control_struct[i]结构体占用的字节数。
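overlay_control_struct[i][byte_count[i]] defines the i-th structure, whose size in bytes is given by byte_count[i]. Each such structure may be called an overlay control structure, and each overlay control structure describes a different attribute of the overlay.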
其中overlay_control_struct定义了覆盖层(overlay)的显示区域,内容来源,优先级,透明度等等属性,其中具体定义的属性如下表:
Figure PCTCN2018125807-appb-000006
表格2:Overlay control structures
在上述表格中定义的各结构具体功能如下:
0:定义了参数表示覆盖层(overlay)显示的位置是相对于用户视角(viewport)的。
1:定义了参数表示覆盖层(overlay)显示的位置是相对于全景球面的。
2:定义了参数表示2D的覆盖层(overlay)显示的位置是相对于全景球面的。
3:定义了参数表示覆盖层(overlay)的内容来源,该结构表示覆盖层的内容来源是来自解码图像的。
4:定义了参数表示覆盖层(overlay)的内容来源,该结构表示覆盖层的内容来源是来自推荐视角的。
5:定义了参数表示覆盖层(overlay)显示的顺序。
6:定义了参数表示覆盖层(overlay)的透明度。
7:定义了参数表示覆盖层(overlay)的与用户可互动的操作。
8:定义了参数表示覆盖层(overlay)的标签。
9:定义了参数表示覆盖层(overlay)的优先级。
其中,0-2指示了覆盖层的渲染位置,3和4指示了覆盖层的内容来自于哪里。
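The relationship between the overlay_control_flag[i] bits and the ten control structures listed above can be sketched as a small lookup. The exact bit packing is given in the syntax table (reproduced only as a figure here), so the most-significant-bit-first layout used below is an assumption, and the names `CONTROL_STRUCTURES` and `present_control_indices` are illustrative only.

```python
from typing import List

# Bit index -> role of the overlay control structure, as listed in the text above.
CONTROL_STRUCTURES = {
    0: "position relative to the user's viewport",
    1: "position relative to the panoramic sphere",
    2: "2D overlay position relative to the panoramic sphere",
    3: "content source: decoded picture",
    4: "content source: recommended viewport",
    5: "display order",
    6: "opacity",
    7: "user interaction",
    8: "label",
    9: "priority",
}


def present_control_indices(flag_bytes: bytes) -> List[int]:
    """Return every index i whose overlay_control_flag[i] is 1.

    ASSUMPTION: flag i is stored in byte i // 8, most significant bit first.
    """
    present = []
    for i in range(len(flag_bytes) * 8):
        if (flag_bytes[i // 8] >> (7 - i % 8)) & 1:
            present.append(i)
    return present
```

For instance, under this assumed packing a single flag byte 0x41 would signal structures 1 (sphere-relative position) and 7 (user interaction).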
上述语法定义了一个或者多个覆盖层(overlay)的表示方法和相关参数。当覆盖层(overlay)是静态的时候,其相关结构体OverlayStruct携带在Overlay Configuration Box中,其中该Overlay Configuration Box位于媒体数据轨迹中。当覆盖层(overlay)为动态的时候,其相关结构体OverlayStruct携带在Overlay timed metadata track的sample entry和sample中。
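The OMAF standard also defines the format of an overlay in the DASH MPD (Media Presentation Description). OMAF defines an overlay descriptor in the MPD whose @schemeIdUri is "urn:mpeg:mpegI:omaf:2018:ovly". At most one such descriptor may be present at the adaptation set level of the MPD, where it indicates the overlay associated with that adaptation set. In the overlay descriptor, one attribute value carries the identifier of the overlay, as follows: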
表格3–Semantics of the attributes of the OVLY descriptor
Figure PCTCN2018125807-appb-000007
在现有方案中,定义了覆盖层(overlay)图像的基本数据结构和携带方式,但对于覆盖层的显示方式较为单一,不够灵活。因此,本申请提出了一种处理媒体数据的方法,通过携带所述覆盖层关联的区域信息,从而能够支持覆盖层的有条件的显示,从而能够更灵活地显示覆盖层。
其中,有条件的显示或者有条件的触发显示指的是在检测到触发操作后进行显示或者关闭显示。
图1为本发明实施例提供的一种媒体数据处理系统的架构示意图,如图1所示,该媒体数据处理系统可以包括服务器10和客户端20。
服务器10:可以包括编码前处理器、视频编码器、码流封装装置(可以用于生成MPD,当然服务器10也可以包括额外的部件来生成MPD)和发送传输装置中至少一种,对全景视频进行前处理,编码或转码的操作,同时将编码后的码流数据封装为可传输的文件,通过网络传输到客户端或者内容分发网络;除此之外,服务器可以根据客户端反馈的信息(如用户视角、基于服务器10发送的MPD建立的分段请求等),选择需要传输的内容进行信号传输。
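In a specific implementation, the pre-encoding processor may be used to perform preprocessing operations on panoramic video images, such as cropping, color format conversion, color correction, or denoising.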
视频编码器可以用于对获得的视频图像进行编码(可以包括划分)形成码流数据。
码流封装装置可以用于将码流数据和相应的元数据封装成用于传输或者存储的文件格式,例如,ISO基本媒体文件格式。
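The transmitting device may be an input/output interface or a communication interface, and may be used to send the encapsulated bitstream data, the MPD, or other information related to media data transmission to the client.
The transmitting device may also serve as a receiving device. The receiving device may be an input/output interface or a communication interface, and may be used to receive segment request information, user viewport information, or other information related to media data transmission sent by the client 20.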
服务器10可以使用接收装置获取全景视频图像,也可以包括图像源,图像源可以是相机或者摄像装置等,用于生成全景视频图像。
客户端20:可以是VR眼镜,手机,平板,电视,电脑等可以连上网络的电子设备。客户端20接收服务器10发送的MPD或者媒体数据,并进行码流解封装以及解码和显示。
客户端20可以包括:接收装置、码流解封装装置、视频解码器和显示装置中至少一种。
在具体实现过程中,接收装置可以是输入/输出接口,也可以是通信接口,可以用于接收封装后的码流数据、MPD与媒体数据传输相关的信息。
码流解封装装置可以用于获取需要的码流数据和相应的元数据。
视频解码器可以用于根据相应的元数据和码流数据解码得到视频图像。
显示装置可以用于对视频图像进行显示,或者根据相应的元数据,对视频图像进行显示。
接收装置还可以是发送装置,用于向服务器10发送用户视角信息、其他媒体数据传输相关的信息或者根据MPD发送分段请求信息。
接收装置还可以接收用户的指令,例如接收装置可以是连接鼠标的输入接口。
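The display device may also be a touchscreen, which receives user instructions while displaying the video image so as to interact with the user.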
应理解,编码前处理器、视频编码器、码流封装装置、码流解封装装置或者视频解码器可以通过处理器读取存储器中的指令并执行指令的方式实现,也可以通过芯片电路实现。
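The method for processing media data provided in the embodiments of the present invention can be applied to the server 10 or the client 20. Specifically, the server 10 may place a description of the region associated with the encoded overlay image into the encapsulated video file format or into the MPD. The client 20 may use the corresponding decapsulation device to obtain the encapsulated information about the region associated with the overlay, and thereby guide the client player (which may include at least one of the receiving device, the bitstream decapsulation device, and the video decoder) to use the display device to conditionally display the overlay image and to display the background video or background image.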
需要说明的是,在本申请中,有条件的覆盖层指的是在检测到针对所述覆盖层关联的区域的触发操作时,才会显示或者关闭显示的覆盖层。对覆盖层进行有条件显示指的是在检测到针对所述覆盖层关联的区域的触发操作时,才会显示或者关闭显示覆盖层。
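Fig. 2 is a schematic diagram of an implementation scenario provided by an embodiment of the present invention. As shown in Fig. 2, this embodiment is used to describe the representation of a conditional overlay: the user can toggle the display of the overlay by clicking the region associated with the overlay on the background video or background image. For example, in Fig. 2, clicking the region in which a player appears can trigger an overlay describing attributes of that player, such as name and age.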
图3是本申请实施例的处理媒体数据的方法的示意性流程图。图3所示的方法可以由客户端执行,客户端可以是位于客户端设备上为客户提供视频播放服务的程序,客户端可以是具有播放全景视频功能的设备,例如,VR设备。
图3所示的方法可以包括步骤310和步骤320,下面对步骤310和步骤320进行详细的描述。
310、获取覆盖层和所述覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域。
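The overlay may be obtained by first obtaining the MPD and then obtaining the overlay from the server according to the MPD. Other information or flags in the embodiments of the present invention that are not carried in the MPD may likewise be obtained from the server according to the MPD; this is not limited here.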
320、在检测到针对所述覆盖层关联的区域的触发操作时,显示所述覆盖层。
其中,所述覆盖层为用于叠加在背景视频或背景图像(可以是背景视频或背景图像上至少部分区域)上进行显示的视频、图像或者文本。
其中,所述针对所述覆盖层关联的区域的触发操作可以包括:在所述覆盖层关联的区域内的点击操作,或者,用户视线位于所述覆盖层关联的区域的触发操作。
可选的,所述覆盖层关联的区域信息可以位于覆盖层控制结构(overlay control structure)中。在具体实现过程中,覆盖层关联的区域信息可以位于不同于前文中的9种覆盖层控制结构的新的覆盖层控制结构中,其名称可以为关联区域结构(AssociatedSphereRegionStruct)。
可选的,所述覆盖层关联的区域信息也可以位于媒体呈现描述MPD中。在具体实现过程中,所述覆盖层关联的区域信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。该覆盖层描述字的@schemeIdUri可以为"urn:mpeg:mpegI:omaf:2018:ovly",该描述字可以出现在MPD的自适应集中,用于表示与该自适应集所关联的覆盖层(overlay)。
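To show how a client might locate the overlay descriptor in an MPD, here is a minimal sketch. It assumes the descriptor is a direct child of the AdaptationSet and matches on the @schemeIdUri value quoted above rather than on a particular element name; the sample MPD, the adaptation set ids, and the use of an EssentialProperty element in the sample are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
OVLY_SCHEME = "urn:mpeg:mpegI:omaf:2018:ovly"

# Hypothetical MPD fragment; the descriptor element name is an assumption.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet id="bg"/>
    <AdaptationSet id="ov1">
      <EssentialProperty schemeIdUri="urn:mpeg:mpegI:omaf:2018:ovly"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def overlay_adaptation_sets(mpd_xml: str) -> List[str]:
    """Return ids of adaptation sets that carry an overlay descriptor."""
    root = ET.fromstring(mpd_xml)
    found = []
    for aset in root.iter("{%s}AdaptationSet" % DASH_NS):
        # Match any direct child descriptor by its scheme URI.
        if any(child.get("schemeIdUri") == OVLY_SCHEME for child in aset):
            found.append(aset.get("id"))
    return found
```

In this sample only the adaptation set with id "ov1" is reported as carrying an overlay; the background adaptation set is left alone.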
其中,所述覆盖层关联的区域信息为平面区域信息或者球面区域信息。
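The region information associated with the overlay may include position information of the region associated with the overlay. Optionally, the position information of the region associated with the overlay may cover the following cases: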
(1)所述覆盖层关联的区域为球面区域,所述覆盖层关联的区域信息为所述覆盖层关联的区域的中心点的球面坐标值。
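For example, the spherical coordinates of the center point of the region associated with the overlay are (X, Y, Z), where X corresponds to the azimuth or yaw angle of the spherical coordinates, Y corresponds to the pitch or elevation angle, and Z corresponds to the tilt or roll angle.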
(2)所述覆盖层关联的区域为平面区域,所述覆盖层关联的区域信息为所述覆盖层关联的区域的中心点的平面坐标值。
例如,所述覆盖层关联的区域的中心点的二维坐标值为(X,Y),其中,X和Y分别表示所述覆盖层关联的区域的中心点在二维直角坐标系中的横坐标和纵坐标。
(3)所述覆盖层关联的区域为平面区域,所述覆盖层关联的区域信息为所述覆盖层关联的区域的左上角/右上角/左下角/右下角的二维坐标值。
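For example, the two-dimensional coordinates of the top-left corner of the region associated with the overlay are (X, Y), where X and Y denote the abscissa and ordinate, respectively, of that top-left corner in a two-dimensional rectangular coordinate system.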
其中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高,可选地,所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高可以包括以下几种情况:
(1)所述覆盖层关联的区域为球面区域,所述覆盖层关联的区域的方位角范围(偏航角范围)和俯仰角范围。
例如,所述覆盖层关联的区域的方位角范围(偏航角范围)为110度,俯仰角范围为90度。
(2)所述覆盖层关联的区域为平面区域,所述覆盖层关联的区域的覆盖范围可以包括所述覆盖层关联的区域的宽度和高度。
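The position and extent information above is what a client needs to decide whether a click or gaze point falls inside the associated region. The following is a minimal sketch under simplifying assumptions: angles are in degrees, the spherical region is bounded by azimuth and elevation limits around its center, and tilt is ignored. The class names `PlaneRegion` and `SphereRegion` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class PlaneRegion:
    x: float       # top-left corner, horizontal coordinate
    y: float       # top-left corner, vertical coordinate
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass
class SphereRegion:
    center_azimuth: float    # degrees
    center_elevation: float  # degrees
    azimuth_range: float     # full width in degrees
    elevation_range: float   # full height in degrees

    def contains(self, azimuth: float, elevation: float) -> bool:
        # Wrap the azimuth difference into [-180, 180) so that regions
        # straddling the +/-180 degree seam are handled correctly.
        d_az = (azimuth - self.center_azimuth + 180.0) % 360.0 - 180.0
        d_el = elevation - self.center_elevation
        return (abs(d_az) <= self.azimuth_range / 2
                and abs(d_el) <= self.elevation_range / 2)
```

With the example ranges from the text (110 degrees of azimuth, 90 degrees of elevation), a region centered at azimuth 170 contains a gaze direction at azimuth -160 because the two are only 30 degrees apart across the seam.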
可选的,所述方法还可以包括:获取触发类型信息;所述针对所述覆盖层关联的区域的触发操作可以包括针对所述覆盖层关联的区域的所述触发类型信息指示的触发操作。
在具体实现过程中,所述触发类型信息位于覆盖层控制结构中。例如可以是上文中的关联区域结构(AssociatedSphereRegionStruct)。
在具体实现过程中,所述触发类型信息也可以位于媒体呈现描述MPD中。在具体实现过程中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。其中,该覆盖层描述字的@schemeIdUri可以为上文中的"urn:mpeg:mpegI:omaf:2018:ovly"。
可选的,所述方法还可以包括:获取条件触发标识,在所述条件触发标识的值为第一预设值时,检测是否有针对所述覆盖层关联的区域的触发操作。其中,所述条件触发标识的值为第一预设值可以用于指示所述覆盖层的显示或者关闭显示受用于触发所述覆盖层显示或者关闭显示的触发操作控制。所述条件触发标识的值为第二预设值可以用于指示所述覆盖层的显示或者关闭显示不受用于触发所述覆盖层显示或者关闭显示的触发操作控制。在具体实现过程中,所述条件触发标识位于用于用户交互控制的覆盖层控制结构中,例如可以是上文中Bit index为7时对应的覆盖层控制结构(例如,OverlayInteraction)。
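The interplay of the trigger type information and the conditional trigger flag can be sketched as a single predicate. The concrete numeric values of the first and second preset values are not fixed by the text, so the constants below are assumptions, as are the `Trigger` enum and the function name.

```python
from enum import Enum


class Trigger(Enum):
    CLICK = "click"  # click inside the associated region
    GAZE = "gaze"    # user's line of sight rests on the associated region


# ASSUMPTION: the text only speaks of a "first" and a "second" preset value.
FIRST_PRESET_VALUE = 1   # display is controlled by the trigger operation
SECOND_PRESET_VALUE = 0  # display is not controlled by the trigger operation


def is_effective_trigger(event: Trigger, in_region: bool,
                         signalled_type: Trigger, conditional_flag: int) -> bool:
    """Return True if this event should show or hide the overlay."""
    if conditional_flag != FIRST_PRESET_VALUE:
        return False  # trigger detection is not performed at all
    return in_region and event == signalled_type
```

Only an event of the signalled type that lands inside the associated region counts, and only while the conditional trigger flag holds the first preset value.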
应理解,与图3相关的更多的内容在前文中以及发明内容中已有介绍,在此不再赘述。
图4是本申请实施例的处理媒体数据的方法的示意性流程图。图4所示的方法可以由客户端执行,客户端可以是位于客户端设备上为客户提供视频播放服务的程序,客户端可以是具有播放全景视频功能的设备,例如,VR设备。
图4所示的方法可以包括步骤410、步骤420、步骤430和步骤440,下面对步骤410、步骤420、步骤430和步骤440进行详细的描述。
410、获取覆盖层,背景视频或背景图像和所述覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域。
420、获取初始状态标识;在所述初始状态标识的值指示默认关闭所述覆盖层的显示的情况下,执行如下操作:
430、显示所述背景视频或背景图像。
其中,可以在未检测到针对所述覆盖层关联的区域的触发操作时,才执行所述显示所述背景视频或背景图像。
440、在检测到针对所述覆盖层关联的区域的触发操作时,将在所述背景视频或背景图像上叠加所述覆盖层,并显示叠加后的视频图像。
可选的,在所述初始状态标识的值指示默认显示所述覆盖层的情况下,可以执行如下操作:
将在所述背景视频或背景图像上叠加所述覆盖层,并显示叠加后的视频图像;在检测到针对所述覆盖层关联的区域的触发操作时,显示所述背景视频或背景图像。其中,可以是在未检测到针对所述覆盖层关联的区域的触发操作时,才执行所述显示叠加后的视频图像。
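The two branches described for Fig. 4 (overlay hidden by default, overlay shown by default) can be sketched as a small state holder. This is an illustrative sketch only: it assumes that every trigger operation toggles the overlay, and it represents "what is displayed" as a list of layer names; the class name `OverlayController` is hypothetical.

```python
from typing import List


class OverlayController:
    """Tracks whether the overlay is currently composited on the background."""

    def __init__(self, initially_shown: bool) -> None:
        # Mirrors the initial state flag: True = shown by default.
        self.visible = initially_shown

    def on_trigger(self) -> None:
        # ASSUMPTION: each trigger on the associated region flips the state.
        self.visible = not self.visible

    def layers(self, background: str, overlay: str) -> List[str]:
        """Return the layers to render, bottom first."""
        return [background, overlay] if self.visible else [background]
```

Starting from the "hidden by default" state, only the background is rendered until the first trigger operation, after which the overlay is superimposed; starting from "shown by default", the order is reversed.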
可选的,所述覆盖层关联的区域信息和所述初始状态标识可以位于覆盖层控制结构(overlay control structure)中。在具体实现过程中,所述覆盖层关联的区域信息和所述初始状态标识可以位于上文中的关联区域结构(AssociatedSphereRegionStruct)中。
可选的,所述覆盖层关联的区域信息和所述初始状态标识可以位于媒体呈现描述MPD中。在具体实现过程中,所述覆盖层关联的区域信息和所述初始状态标识可以为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。其中,该覆盖层描述字的@schemeIdUri可以为上文中的"urn:mpeg:mpegI:omaf:2018:ovly"。
可选的,所述方法还可以包括:获取触发类型信息;所述针对所述覆盖层关联的区域的触发操作可以包括针对所述覆盖层关联的区域的所述触发类型信息指示的触发操作。
在具体实现过程中,所述触发类型信息位于覆盖层控制结构中。例如可以是上文中的关联区域结构(AssociatedSphereRegionStruct)。
在具体实现过程中,所述触发类型信息也可以位于媒体呈现描述MPD中。在具体实现过程中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。其中,该覆盖层描述字的@schemeIdUri可以为上文中的"urn:mpeg:mpegI:omaf:2018:ovly"。
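Optionally, the method may further include: obtaining a conditional trigger flag, and, when the value of the conditional trigger flag is the first preset value, detecting whether there is a trigger operation on the region associated with the overlay. The first preset value may indicate that showing or hiding the overlay is controlled by the trigger operation used to trigger showing or hiding of the overlay. The second preset value may indicate that showing or hiding the overlay is not controlled by that trigger operation. In a specific implementation, the conditional trigger flag is located in the overlay control structure used for user interaction control, for example the overlay control structure corresponding to bit index 7 above (OverlayInteraction).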
应理解,与图4相关的更多的内容在前文中以及发明内容中已有介绍,在此不再赘述。
图5是本申请实施例的处理媒体数据的方法的示意性流程图。图5所示的方法可以由服务器执行,图5所示的方法可以包括步骤510和步骤520,下面对步骤510和步骤520进行详细的描述。
510、确定覆盖层关联的区域信息,所述覆盖层关联的区域信息用于指示所述覆盖层关联的区域。
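The region information may be determined by obtaining a region marked by a user. Alternatively, it may be determined by recognizing, in the background video or background image, the object figure or person figure that corresponds to the overlay, and deriving the region information of a region that contains that object figure or person figure.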
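As a sketch of the second option, the server could derive a planar region from the outline of a detected object. The detector itself is out of scope; the code below assumes it yields a list of (x, y) pixel points, and the `margin` parameter and function name are illustrative. The returned keys reuse the field names object_x, object_y, object_width, and object_height that appear in the planar region structure of the later embodiments.

```python
from typing import Dict, Iterable, Tuple


def region_from_points(points: Iterable[Tuple[int, int]],
                       margin: int = 0) -> Dict[str, int]:
    """Return the smallest axis-aligned region enclosing the points, plus a margin."""
    pts = list(points)
    if not pts:
        raise ValueError("at least one point is required")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    left = max(min(xs) - margin, 0)   # clamp to the image origin
    top = max(min(ys) - margin, 0)
    right = max(xs) + margin
    bottom = max(ys) + margin
    return {
        "object_x": left,
        "object_y": top,
        "object_width": right - left,
        "object_height": bottom - top,
    }
```

A small margin makes the clickable region slightly larger than the detected figure, which may be easier for a user to hit.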
520、向客户端发送所述覆盖层关联的区域信息。
其中,所述覆盖层关联的区域上的触发操作用于控制所述覆盖层的显示或者关闭显示。其中,所述触发操作可以包括:在所述覆盖层关联的区域内的点击操作,或者,用户视线位于所述覆盖层关联的区域的触发操作。
其中,所述覆盖层为用于叠加在背景视频或背景图像(可以是背景视频或背景图像上至少部分区域)上进行显示的视频、图像或者文本。
其中,该方法还可以包括获取覆盖层,编码覆盖层得到覆盖层的码流数据,向客户端发送覆盖层的码流数据。其中,覆盖层可以为用于全景视频图像的图像。
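The method may further include: obtaining the background video or background image, encoding the background video or background image to obtain its bitstream data, and sending that bitstream data to the client. The background video or background image may be an image for a panoramic video image.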
其中,所述背景视频或背景图像显示的内容可以为目标事物,所述覆盖层的内容可以为所述目标事物的文字信息。
可选的,所述覆盖层关联的区域信息可以位于覆盖层控制结构(overlay control structure)中。例如可以是上文中的关联区域结构(AssociatedSphereRegionStruct)。
可选的,所述覆盖层关联的区域信息也可以位于媒体呈现描述MPD中。在具体实现过程中,所述覆盖层关联的区域信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。该覆盖层描述字的@schemeIdUri可以为上文中的"urn:mpeg:mpegI:omaf:2018:ovly"。
其中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的位置信息。在具体实现过程中,所述覆盖层关联的区域的位置信息可以包括所述覆盖层关联的区域中心点的位置信息,或者所述覆盖层关联的区域的左上角点的位置信息。
其中,所述覆盖层关联的区域信息可以包括所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高。
其中,所述覆盖层关联的区域信息为平面区域信息或者球面区域信息。
可选的,所述方法还可以包括:向所述客户端发送触发类型信息,所述触发类型信息用于指示用于触发所述覆盖层显示或者关闭显示的触发操作的触发类型。
在具体实现过程中,所述触发类型信息位于覆盖层控制结构中。例如可以是上文中的关联区域结构(AssociatedSphereRegionStruct)。
在具体实现过程中,所述触发类型信息也可以位于媒体呈现描述MPD中。在具体实现过程中,所述触发类型信息为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。其中,该覆盖层描述字的@schemeIdUri可以为上文中的"urn:mpeg:mpegI:omaf:2018:ovly"。
可选的,所述方法还可以包括:向所述客户端发送条件触发标识,所述条件触发标识的值为第一预设值用于指示所述覆盖层的显示或者关闭显示受用于触发所述覆盖层显示或者关闭显示的触发操作控制。
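When the value of the conditional trigger flag is the second preset value, it indicates that showing or hiding the overlay is not controlled by the trigger operation used to trigger showing or hiding of the overlay.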
在具体实现过程中,所述条件触发标识位于用于用户交互控制的覆盖层控制结构中,例如可以是上文中Bit index为7时对应的覆盖层控制结构(OverlayInteraction)。
可选的,所述方法还可以包括:向所述客户端发送初始状态标识,所述初始状态标识用于指示所述覆盖层在初始状态下为显示状态,或者用于指示所述覆盖层在初始状态下为关闭显示状态。
可选的,所述初始状态标识可以位于覆盖层控制结构(overlay control structure)中。在具体实现过程中,所述初始状态标识可以位于上文中的关联区域结构(AssociatedSphereRegionStruct)中。
可选的,所述初始状态标识可以位于媒体呈现描述MPD中。在具体实现过程中,所述初始状态标识可以为所述MPD中的覆盖层描述字(overlay descriptor)的属性信息。其中,该覆盖层描述字的@schemeIdUri可以为上文中的"urn:mpeg:mpegI:omaf:2018:ovly"。
应理解,与图5相关的更多的内容在前文中以及发明内容中已有介绍,在此不再赘述。
上文结合图3至图5对本申请实施例的处理媒体数据的方法进行了详细的描述,下面结合具体的实施例对图3至图5中的实施细节进行详细的描述。
实施例一:
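In this embodiment, a new overlay control structure required in Figs. 3 to 5 is defined. The structure defines a spherical region associated with the overlay, which represents the user-clickable spherical region in the background video or background image. When the client detects this overlay control structure in the bitstream, it further parses the spherical region defined in the structure, so that the display of the overlay associated with the region can be triggered when the user clicks in the region.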
在本实施例中,新定义一个覆盖层控制结构(overlay control structure)为AssociatedSphereRegionStruct,其具体语法如下:
Figure PCTCN2018125807-appb-000008
其中具体语义如下:
SphereRegionStruct(1)定义了一个与覆盖层(overlay)相关联的球面区域。在其中可以包括覆盖层关联的区域信息,具体可以为球面区域信息。
当上述定义的球面区域出现在用户视角范围内时,依赖于客户端的配置或者是用户界面的提示,用户可以通过点击该区域来触发与之相关联的覆盖层(overlay)的显示或关闭。
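To make the idea of carrying a spherical region in a control structure concrete, the following sketch serializes and parses the five region fields named earlier (center_azimuth, center_elevation, center_tilt, azimuth_range, elevation_range). The byte layout is an assumption, since the actual syntax is reproduced only as a figure: three signed and two unsigned 32-bit big-endian integers in units of 2^-16 degrees. The real structure may carry additional fields.

```python
import struct
from dataclasses import dataclass

UNITS_PER_DEGREE = 65536  # ASSUMPTION: values are stored in 2^-16 degree units
_LAYOUT = ">iiiII"        # ASSUMPTION: 3 signed + 2 unsigned 32-bit fields


@dataclass
class SphereRegion:
    center_azimuth: float
    center_elevation: float
    center_tilt: float
    azimuth_range: float
    elevation_range: float


def pack_region(region: SphereRegion) -> bytes:
    return struct.pack(
        _LAYOUT,
        round(region.center_azimuth * UNITS_PER_DEGREE),
        round(region.center_elevation * UNITS_PER_DEGREE),
        round(region.center_tilt * UNITS_PER_DEGREE),
        round(region.azimuth_range * UNITS_PER_DEGREE),
        round(region.elevation_range * UNITS_PER_DEGREE),
    )


def unpack_region(buf: bytes) -> SphereRegion:
    az, el, tilt, az_range, el_range = struct.unpack_from(_LAYOUT, buf)
    return SphereRegion(*(v / UNITS_PER_DEGREE
                          for v in (az, el, tilt, az_range, el_range)))
```

A client that finds the new control structure would parse these fields once and then test each click or gaze direction against the resulting region.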
服务器端的步骤为:
步骤一:服务器获得全景视频码流以及对应的一个或多个overlay内容码流。
步骤二:在视频封装器(码流封装装置)中,按照视频文件格式封装。在OMAF标准文件格式中,使用上述定义的overlay控制结构,表示与overlay关联的球面区域出现在用户视角范围内时,用户可以通过点击该区域来触发与之相关联的overlay的显示或关闭显示。
步骤三:将封装后的码流送入发送传输装置,进行信号传输发送。
客户端的步骤为:
步骤一:接收装置获取全景视频内容封装后的码流。
步骤二:将码流送入码流解封装装置,进行解封装并进行解析。在该步骤,码流解封装装置寻找并解析到overlay的控制结构AssociatedSphereRegionStruct,获知该overlay是通过用户点击其关联的区域来触发显示或关闭显示的。
步骤三:视频解码在显示装置播放时,可在客户端配置或用户界面提示中,包含针对该overlay可进行用户触发显示或关闭显示的提示。
在本发明实施例中,通过在一个新定义的覆盖层控制结构里描述一个球面区域,该球面区域与覆盖层(overlay)相关联,从而使得用户可以点击该球面区域来控制与之相关联的覆盖层(overlay)的显示。
本发明实施例通过新增的结构定义来支持有条件的覆盖层(overlay)的显示,使用户能够个性化的进行操作和个性化显示。
实施例二
在本发明实施例中,同实施例一,新定义一个图3至图5中所需的覆盖层控制结构(overlay control structure)为AssociatedSphereRegionStruct,在其中定义一个2D区域(平面区域),用于表示用户点击到该区域的时候触发与该区域相关联的覆盖层(overlay)的显示。
具体的语法定义如下:
Figure PCTCN2018125807-appb-000009
其中具体语义如下:
2DRegionStruct()定义了一个与覆盖层(overlay)相关联的二维区域。在其中可以包括覆盖层关联的区域信息,具体可以为平面区域信息。
object_x和object_y可以为覆盖层关联的区域的位置信息,表示该区域左上角顶点在背景VR流内容(背景视频或者背景图像)中的二维坐标中的x,y位置。
object_width和object_height可以为所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高,表示该区域在背景VR流内容中的二维坐标中的宽和高。
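When the two-dimensional region defined above falls within the user's viewing range, then, depending on the client configuration or a user-interface prompt, the user can click the region to show or hide the associated overlay.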
服务器端和客户端的具体操作步骤请参考实施例一。
This embodiment of the present invention represents the region associated with the overlay in a two-dimensional (planar) coordinate system.
实施例三
在本实施例中,同实施例一,我们新定义一个覆盖层控制结构(overlay control structure),在该结构中定义一个与该覆盖层(overlay)相关联的球面区域用于表示用户可点击的背景VR视频流中的球面区域。当客户端检测到码流中出现该覆盖层(overlay)控制结构,则进一步解析该结构中定义的球面区域,从而可以在用户点击到该区域的时候触发与该区域相关联的覆盖层(overlay)的显示。额外的,在该覆盖层控制结构(overlay control structure)中,我们定义一个标志用以标志该覆盖层(overlay)的初始状态,用以表示在用户未进行任何操作时该覆盖层(overlay)默认是显示还是不显示。
在本实施例中,新定义一个覆盖层控制结构(overlay control structure)为AssociatedSphereRegionStruct,其具体语法如下:
Figure PCTCN2018125807-appb-000010
其中具体语义如下:
initial_status定义了一个标记,可以为上文中的初始状态标识,表示默认情况下该覆盖层(overlay)是否显示。
SphereRegionStruct(1)定义了一个与覆盖层(overlay)相关联的球面区域。
当上述定义的球面区域出现在用户视角范围内时,依赖于客户端的配置或者是用户界面的提示,用户可以通过点击该区域来触发与之相关联的覆盖层(overlay)的显示或关闭。
可替代的,如实施例二所示,覆盖层控制结构AssociatedSphereRegionStruct()中定义的SphereRegionStruct(1)可用实施例二中定义的2DRegionStruct()替换,则定义覆盖层控制结构语法如下:
Figure PCTCN2018125807-appb-000011
其中具体语义如下:
initial_status定义了一个标记,可以为上文中的初始状态标识,表示默认情况下该覆盖层(overlay)是否显示。
2DRegionStruct()定义了一个与覆盖层(overlay)相关联的二维区域。
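object_x and object_y may be the position information of the region associated with the overlay; they give the x and y position of the region's top-left vertex in the two-dimensional coordinates of the background VR stream content (background video or background image).
object_width and object_height may be the width and height of the region associated with the overlay; they give the region's width and height in the two-dimensional coordinates of the background VR stream content.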
在本发明实施例中,通过在一个新定义的覆盖层控制结构里描述一个球面区域,该球面区域与覆盖层(overlay)相关联,从而使得用户可以点击该球面区域来控制与之相关联的覆盖层(overlay)的显示。同时在覆盖层控制结构里定义了一个标志,用以表示默认情况下该覆盖层(overlay)是否显示。
本发明实施例通过新增的结构定义来支持有条件的覆盖层(overlay)的显示。
实施例四
在本实施例中,同实施例一,我们新定义一个覆盖层控制结构(overlay control structure),在该结构中定义一个与该覆盖层(overlay)相关联的球面区域用于表示用户可点击的背景VR视频流中的球面区域。当客户端检测到码流中出现该覆盖层(overlay)控制结构,则进一步解析该结构中定义的球面区域,从而可以在用户点击到该区域的时候触发与该区域相关联的覆盖层(overlay)的显示。额外的,在该覆盖层控制结构(overlay control structure)中,我们定义一个标志用以标志该覆盖层(overlay)显示被触发的类型,并为该标志定义一个值来表示该触发类型为用户点击触发。
在本实施例中,新定义一个覆盖层控制结构(overlay control structure)为AssociatedSphereRegionStruct,其具体语法如下:
Figure PCTCN2018125807-appb-000012
其中具体语义如下:
condition_type定义了一个标记,可以为上文中的触发类型信息,表示覆盖层(overlay)被触发显示的类型。
SphereRegionStruct(1)定义了一个与覆盖层(overlay)相关联的球面区域。
定义当condition_type值为0时,表示触发类型为用户点击触发,其它值保留。具体定义如下:
Value      Description
0          A user click on the associated spherical region shows or hides the overlay.
1…255      Reserved
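A client reading condition_type has to treat values 1 to 255 as reserved. A minimal sketch of such handling follows; the function name and the returned strings are illustrative, and the 8-bit value range is inferred from the table above.

```python
def describe_condition_type(value: int) -> str:
    """Map a condition_type value to its meaning per the table above."""
    if not 0 <= value <= 255:
        raise ValueError("condition_type is expected to fit in 8 bits")
    if value == 0:
        return "click"     # user clicks the associated region
    return "reserved"      # values 1..255 carry no defined meaning yet
```

A player that encounters a reserved value could, for example, ignore the trigger signalling and fall back to its default overlay behavior; that policy is a design choice and is not specified in the text.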
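When the spherical region defined above falls within the user's viewing range, then, depending on the client configuration or a user-interface prompt, the user can click the region to show or hide the associated overlay.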
可替代的,如实施例二所示,覆盖层控制结构AssociatedSphereRegionStruct()中定义的SphereRegionStruct(1)可用实施例二中定义的2DRegionStruct()替换,则定义覆盖层控制结构语法如下:
Figure PCTCN2018125807-appb-000013
其中具体语义如下:
condition_type定义了一个标记,可以为上文中的触发类型信息,表示覆盖层(overlay)被触发显示的类型。
2DRegionStruct()定义了一个与覆盖层(overlay)相关联的二维区域。
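object_x and object_y may be the position information of the region associated with the overlay; they give the x and y position of the region's top-left vertex in the two-dimensional coordinates of the background VR stream content (background video or background image).
object_width and object_height may be the width and height of the region associated with the overlay; they give the region's width and height in the two-dimensional coordinates of the background VR stream content.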
额外的,如实施例三所示,在本实施例定义的覆盖层控制结构AssociatedSphereRegionStruct()中还可以增加initial_status标记,用于表示默认情况下覆盖层(overlay)是否显示,具体语法和语义同实施例三。
在本发明实施例中,通过在一个新定义的覆盖层控制结构里描述一个球面区域,该球面区域与覆盖层(overlay)相关联,从而使得用户可以点击该球面区域来控制与之相关联的覆盖层(overlay)的显示。同时在覆盖层控制结构里定义了一个标志,用以表示覆盖层(overlay)被触发显示的类型。
本发明实施例通过新增的结构定义来支持有条件的覆盖层(overlay)的显示。
实施例五
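In this embodiment, as in Embodiment 1, we define a new overlay control structure. The structure defines a spherical region associated with the overlay, which represents the user-clickable spherical region in the background VR video stream. When the client detects this overlay control structure in the bitstream, it further parses the spherical region defined in the structure, so that the display of the overlay associated with the region can be triggered when the user clicks in the region. In addition, in the overlay interaction structure, we define a flag indicating that the overlay is displayed by conditional triggering.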
在本实施例中,新定义一个覆盖层控制结构(overlay control structure)为AssociatedSphereRegionStruct,其具体语法如下:
Figure PCTCN2018125807-appb-000014
其中具体语义如下:
SphereRegionStruct(1)定义了一个与覆盖层(overlay)相关联的球面区域。
In addition, a flag is defined in the overlay interaction structure to indicate that the overlay is displayed by conditional triggering.
其具体语法如下:
Figure PCTCN2018125807-appb-000015
其中具体语义如下:
conditional_switch_on_off_flag定义了一个标记,可以为上文中的条件触发标识,表示该覆盖层(overlay)是有条件触发显示的。
其中,OverlayInteraction为上文中的用于用户交互控制的覆盖层控制结构。
当上述定义的球面区域出现在用户视角范围内时,依赖于客户端的配置或者是用户界面的提示,用户可以通过点击该区域来触发与之相关联的覆盖层(overlay)的显示或关闭。
可替代的,如实施例二所示,覆盖层控制结构AssociatedSphereRegionStruct()中定义的SphereRegionStruct(1)可用实施例二中定义的2DRegionStruct()替换,则定义覆盖层控制结构语法如下:
Figure PCTCN2018125807-appb-000016
其中具体语义如下:
2DRegionStruct()定义了一个与覆盖层(overlay)相关联的二维区域。
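object_x and object_y may be the position information of the region associated with the overlay; they give the x and y position of the region's top-left vertex in the two-dimensional coordinates of the background VR stream content (background video or background image).
object_width and object_height may be the width and height of the region associated with the overlay; they give the region's width and height in the two-dimensional coordinates of the background VR stream content.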
额外的,如实施例三所示,在本实施例定义的覆盖层控制结构AssociatedSphereRegionStruct()中还可以增加initial_status标记,用于表示默认情况下覆盖层(overlay)是否显示,具体语法和语义同实施例三。
在本发明实施例中,通过在一个新定义的覆盖层控制结构里描述一个球面区域,该球面区域与覆盖层(overlay)相关联,从而使得用户可以点击该球面区域来控制与之相关联的覆盖层(overlay)的显示。同时在覆盖层交互结构(overlay interaction structure)里定义了一个标志,用以表示覆盖层(overlay)是有条件触发显示的。
本发明实施例通过新增的结构定义来支持有条件的覆盖层(overlay)的显示。
实施例六
在本实施例中,我们在MPD新增覆盖层(overlay)的关联区域描述,用于表示用户可点击的背景VR视频流中的球面区域。当客户端检测到MPD中出现该覆盖层(overlay)关联的球面区域,从而可以在用户点击到该区域的时候触发与该区域相关联的覆盖层(overlay)的显示。
在OMAF标准中已经在MPD中定义覆盖层描述字(overlay descriptor),其@schemeIdUri为"urn:mpeg:mpegI:omaf:2018:ovly",最多一个该描述字可以出现在MPD的adaptation set level,用于表示与该adaptation set所关联的覆盖层(overlay)。
在本实施例中,在MPD的覆盖层描述字(overlay descriptor)中描述其所关联的球面区域,其具体语法如下:
Figure PCTCN2018125807-appb-000017
Figure PCTCN2018125807-appb-000018
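When the spherical region defined above falls within the user's viewing range, then, depending on the client configuration or a user-interface prompt, the user can click the region to show or hide the associated overlay.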
其中,OverlayInfo.associatedSphereRegion@center_azimuth,OverlayInfo.associatedSphereRegion@center_elevation,OverlayInfo.associatedSphereRegion@center_tilt,OverlayInfo.associatedSphereRegion@azimuth_range和OverlayInfo.associatedSphereRegion@elevation_range可以为上文中的覆盖层关联的区域信息,具体为球面区域信息。具体的,OverlayInfo.associatedSphereRegion@center_azimuth,OverlayInfo.associatedSphereRegion@center_elevation,OverlayInfo.associatedSphereRegion@center_tilt可以为上文中的覆盖层关联的区域的位置信息。OverlayInfo.associatedSphereRegion@azimuth_range和OverlayInfo.associatedSphereRegion@elevation_range可以分别为所述覆盖层关联的区域的宽和所述覆盖层关联的区域的高。
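Reading these attributes on the client side can be sketched as follows. The attribute names come from the text above; the XML nesting (an OverlayInfo element containing an associatedSphereRegion element inside the descriptor), the use of an EssentialProperty element, and the sample values are assumptions for illustration, because the actual descriptor syntax is reproduced only as a figure. Element names are matched without regard to XML namespace for the same reason.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

FIELDS = ("center_azimuth", "center_elevation", "center_tilt",
          "azimuth_range", "elevation_range")

# Hypothetical descriptor fragment used only for illustration.
SAMPLE_DESCRIPTOR = (
    '<EssentialProperty schemeIdUri="urn:mpeg:mpegI:omaf:2018:ovly">'
    '<OverlayInfo>'
    '<associatedSphereRegion center_azimuth="30" center_elevation="-10" '
    'center_tilt="0" azimuth_range="40" elevation_range="20"/>'
    '</OverlayInfo>'
    '</EssentialProperty>'
)


def _local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def read_associated_sphere_region(descriptor_xml: str) -> Optional[Dict[str, float]]:
    """Return the region attributes as floats, or None if no region is signalled."""
    root = ET.fromstring(descriptor_xml)
    for element in root.iter():
        if _local_name(element.tag) == "associatedSphereRegion":
            return {name: float(element.attrib[name])
                    for name in FIELDS if name in element.attrib}
    return None
```

The three center attributes give the position of the region and the two range attributes give its extent, matching the roles described in the paragraph above.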
在本发明实施例中,在MPD的覆盖层描述字里描述一个球面区域,该球面区域与覆盖层(overlay)相关联,从而使得用户可以点击该球面区域来控制与之相关联的覆盖层(overlay)的显示。
本发明实施例通过新增的结构定义来支持有条件的覆盖层(overlay)的显示。
实施例七
在本实施例中,我们在MPD新增覆盖层(overlay)的关联区域描述,用于表示用户可点击的背景VR视频流中的二维区域。当客户端检测到MPD中出现该覆盖层(overlay)关联的二维区域,从而可以在用户点击到该区域的时候触发与该区域相关联的覆盖层(overlay)的显示。
在OMAF标准中已经在MPD中定义覆盖层描述字(overlay descriptor),其@schemeIdUri为"urn:mpeg:mpegI:omaf:2018:ovly",最多一个该描述字可以出现在MPD的adaptation set level,用于表示与该adaptation set所关联的覆盖层(overlay)。
在本实施例中,在MPD的覆盖层描述字(overlay descriptor)中描述其所关联的二维区域,其具体语法如下:
Figure PCTCN2018125807-appb-000019
Figure PCTCN2018125807-appb-000020
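When the two-dimensional region defined above falls within the user's viewing range, then, depending on the client configuration or a user-interface prompt, the user can click the region to show or hide the associated overlay.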
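Here, OverlayInfo.associated2DRegion@object_x, OverlayInfo.associated2DRegion@object_y, OverlayInfo.associated2DRegion@object_width, and OverlayInfo.associated2DRegion@object_height are the region information associated with the overlay described above, specifically planar region information.
Specifically, OverlayInfo.associated2DRegion@object_x and OverlayInfo.associated2DRegion@object_y may be the position information of the region associated with the overlay; they give the x and y position of the region's top-left vertex in the two-dimensional coordinates of the background VR stream content (background video or background image). OverlayInfo.associated2DRegion@object_width and OverlayInfo.associated2DRegion@object_height may be the width and height of the region associated with the overlay; they give the region's width and height in the two-dimensional coordinates of the background VR stream content.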
在本发明实施例中,在MPD的覆盖层描述字里描述一个二维区域,该二维区域与覆盖层(overlay)相关联,从而使得用户可以点击该二维区域来控制与之相关联的覆盖层(overlay)的显示。
本发明实施例通过新增的结构定义来支持有条件的覆盖层(overlay)的显示。
实施例八
在本实施例中,我们在MPD新增覆盖层(overlay)的默认状态描述。
在OMAF标准中已经在MPD中定义覆盖层描述字(overlay descriptor),其@schemeIdUri为"urn:mpeg:mpegI:omaf:2018:ovly",最多一个该描述字可以出现在MPD的adaptation set level,用于表示与该adaptation set所关联的覆盖层(overlay)。
在本实施例中,在MPD的覆盖层描述字(overlay descriptor)中描述覆盖层在默认状态下是否显示的标记,其具体语法如下:
Figure PCTCN2018125807-appb-000021
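In this embodiment of the present invention, a flag is described in the overlay descriptor of the MPD; the flag indicates whether the overlay is displayed in the default state. OverlayInfo@initial_status is the initial status flag described above.
This embodiment of the present invention supports conditional display of an overlay through the newly added structure definition.
Embodiment 9
In this embodiment, a description of the display trigger type of an overlay is added to the MPD.
The OMAF standard already defines an overlay descriptor in the MPD, whose @schemeIdUri is "urn:mpeg:mpegI:omaf:2018:ovly". At most one such descriptor may be present at the adaptation set level of the MPD, and it indicates the overlay associated with that adaptation set.
In this embodiment, a flag indicating the display trigger type of the overlay is described in the overlay descriptor of the MPD. The syntax is as follows: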
Figure PCTCN2018125807-appb-000022
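In this embodiment of the present invention, a flag is described in the overlay descriptor of the MPD; the flag indicates the trigger type for displaying the overlay. OverlayInfo@condition_type is the trigger type information described above.
This embodiment of the present invention supports conditional display of an overlay through the newly added structure definition.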
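The role of OverlayInfo@condition_type can be sketched as a small dispatch step: an input event counts as a trigger operation only if it matches the signalled trigger type and occurs inside the associated region. The numeric values below (0 for a click, 1 for the user's gaze) are assumptions made for the example, because the value table appears only in the figure; the names `TriggerType` and `is_trigger` are likewise hypothetical.

```python
from enum import IntEnum


class TriggerType(IntEnum):
    """Assumed encoding of condition_type; the real values are not given here."""
    CLICK = 0
    GAZE = 1


# Event kind that each trigger type accepts.
_EXPECTED_EVENT = {TriggerType.CLICK: "click", TriggerType.GAZE: "gaze"}


def is_trigger(condition_type: int, event_kind: str, inside_region: bool) -> bool:
    """Return True if the event is a trigger operation for the overlay."""
    expected = _EXPECTED_EVENT[TriggerType(condition_type)]
    return inside_region and event_kind == expected
```

An unknown `condition_type` raises `ValueError` from the enum constructor, which lets a client reject descriptors it does not understand instead of guessing.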
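Embodiment 10
In this embodiment, a description of an overlay is added to the MMT protocol.
In this embodiment, an overlay descriptor is added to the MMT protocol, and the sphere region or two-dimensional region associated with the overlay is described under that descriptor. The syntax of the overlay descriptor is as follows: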
Figure PCTCN2018125807-appb-000023
Figure PCTCN2018125807-appb-000024
The syntax of AssociatedSphereRegionStruct() is as follows:
Figure PCTCN2018125807-appb-000025
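The semantics are as follows:
SphereRegionStruct(1) defines a sphere region associated with the overlay. It may include the region information associated with the overlay, which may specifically be sphere region information.
Alternatively, as in Embodiment 2, a two-dimensional region may be defined in AssociatedSphereRegionStruct(). The syntax is defined as follows: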
Figure PCTCN2018125807-appb-000026
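The semantics are as follows:
2DRegionStruct() defines a two-dimensional region associated with the overlay. It may include the region information associated with the overlay, which may specifically be plane region information, as follows:
object_x and object_y may be the position information of the region associated with the overlay; they indicate the x and y position of the top-left vertex of the region in the two-dimensional coordinates of the background VR stream content (background video or background image).
object_width and object_height may be the width and the height of the region associated with the overlay; they indicate the width and height of the region in the two-dimensional coordinates of the background VR stream content.
When the sphere region or two-dimensional region defined above appears within the user's viewport, the user can, depending on the client configuration or a prompt in the user interface, click the region to trigger the display or closing of the associated overlay.
In this embodiment of the present invention, a region associated with an overlay is described in the MMT protocol, so that the user can click the region to control the display of the associated overlay.
This embodiment of the present invention supports conditional display of an overlay through the newly added structure definition.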
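Because the MMT descriptor syntax is shown only in the figures, the following sketch uses a purely hypothetical wire layout to illustrate how the four fields of 2DRegionStruct() could be serialised and read back: four big-endian unsigned 16-bit integers in the order object_x, object_y, object_width, object_height. The field widths, byte order and function names are assumptions, not the layout defined by this application or by MMT.

```python
import struct

# Hypothetical layout: four big-endian unsigned 16-bit fields (8 bytes in total).
_REGION_2D = struct.Struct(">HHHH")


def pack_2d_region(object_x: int, object_y: int,
                   object_width: int, object_height: int) -> bytes:
    """Serialise the region fields in the assumed layout."""
    return _REGION_2D.pack(object_x, object_y, object_width, object_height)


def unpack_2d_region(payload: bytes) -> dict:
    """Read the region fields back from the start of a descriptor payload."""
    x, y, width, height = _REGION_2D.unpack_from(payload)
    return {"object_x": x, "object_y": y,
            "object_width": width, "object_height": height}
```

A real implementation would follow the bit widths given in the descriptor syntax table and would also handle the sphere-region variant.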
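An embodiment of the present invention provides a client. The client may be the client described above, or may include some of the elements or modules of the client described above. The client may include an obtaining module and a display module. The operations performed by the modules of the client may be implemented in software, in which case the modules reside as software modules in the memory of the client and are invoked and executed by the processor. The operations performed by the modules of the client may also be implemented by a hardware chip.
It can be understood that, for further implementation details of the operations performed by the modules of the client in this embodiment, reference may be made to the foregoing method embodiments and the related description in the summary. Details are not repeated here.
An embodiment of the present invention provides a server. The server may be the server described above, or may include some of the elements or modules of the server described above. The server may include a determining module and a sending module. The operations performed by the modules of the server may be implemented in software, in which case the modules reside as software modules in the memory of the server and are invoked and executed by the processor. The operations performed by the modules of the server may also be implemented by a hardware chip.
It can be understood that, for further implementation details of the operations performed by the modules of the server in this embodiment, reference may be made to the foregoing method embodiments and the related description in the summary. Details are not repeated here.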
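FIG. 6 is a schematic diagram of the hardware structure of an apparatus (electronic apparatus) for processing media data according to an embodiment of this application. The apparatus 600 shown in FIG. 6 may be regarded as a computer device. It may serve as an implementation of the client or the server in the embodiments of this application, or as an implementation of the method for transmitting media data in the embodiments of this application. The apparatus 600 may include a processor 610, a memory 620, an input/output interface 630 and a bus 650, and may further include a communication interface 640. The processor 610, the memory 620, the input/output interface 630 and the communication interface 640 are communicatively connected to one another through the bus 650.
The processor 610 may be a general-purpose central processing unit (Central Processing Unit, CPU), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is configured to execute related programs so as to implement the functions to be performed by the modules of the client or the server in the embodiments of this application, or to perform the method for transmitting media data in the method embodiments of this application. The processor 610 may be an integrated circuit chip with signal processing capability. In an implementation process, the steps of the foregoing methods may be completed by integrated logic circuits of hardware in the processor 610 or by instructions in the form of software. The processor 610 may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may reside in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 620. The processor 610 reads the information in the memory 620 and, in combination with its hardware, completes the functions to be performed by the modules that the client or the server in the embodiments of this application may include, or performs the method for transmitting media data in the method embodiments of this application.
The memory 620 may be a read-only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM). The memory 620 may store an operating system and other application programs. When the functions to be performed by the modules that the client or the server in the embodiments of this application may include, or the method for transmitting media data in the method embodiments of this application, are implemented in software or firmware, the program code for implementing the technical solutions provided in the embodiments of this application is stored in the memory 620, and the processor 610 executes the operations to be performed by those modules or performs the method for transmitting media data provided in the method embodiments of this application.
The input/output interface 630 is configured to receive input data and information and to output data such as operation results.
The communication interface 640 uses a transceiver apparatus, for example but not limited to a transceiver, to implement communication between the apparatus 600 and other devices or communication networks. It may serve as the obtaining module or the sending module of the processing apparatus.
The bus 650 may include a path for transferring information between the components of the apparatus 600 (for example, the processor 610, the memory 620, the input/output interface 630 and the communication interface 640).
It should be noted that, although the apparatus 600 shown in FIG. 6 shows only the processor 610, the memory 620, the input/output interface 630, the communication interface 640 and the bus 650, a person skilled in the art should understand that, in a specific implementation, the apparatus 600 may further include other components necessary for normal operation, for example a display for displaying the video data to be played. A person skilled in the art should also understand that, depending on specific needs, the apparatus 600 may further include hardware components that implement other additional functions. In addition, a person skilled in the art should understand that the apparatus 600 may include only the components necessary to implement the embodiments of this application, and need not include all of the components shown in FIG. 6.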
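Referring to FIG. 7, a client 700 is provided. The client 700 may be an implementation of the various apparatuses described above and may include an obtaining module 701 and a display module 702, where:
The obtaining module 701 may be configured to obtain an overlay and region information associated with the overlay, where the region information associated with the overlay indicates the region associated with the overlay.
The obtaining module 701 may be the communication interface 640, the input/output interface 630, or the receiving apparatus described above.
The display module 702 may be configured to display the overlay upon detecting a trigger operation for the region associated with the overlay.
The display module 702 may be the display or the display apparatus described above.
The trigger operation for the region associated with the overlay may include a click operation within the region associated with the overlay, or a trigger operation in which the user's line of sight falls within the region associated with the overlay.
The region information associated with the overlay may be located in an overlay control structure.
The region information associated with the overlay may include position information of the region associated with the overlay.
The region information associated with the overlay may include the width and the height of the region associated with the overlay.
The region information associated with the overlay may be plane region information or sphere region information.
In some feasible implementations, the obtaining module 701 may be further configured to obtain trigger type information. The trigger operation for the region associated with the overlay may include a trigger operation, indicated by the trigger type information, for the region associated with the overlay. The trigger type information may be located in an overlay control structure.
The trigger type information may be located in a media presentation description (MPD). In some feasible implementations, the trigger type information may be attribute information of the overlay descriptor in the MPD.
In some feasible implementations, the obtaining module 701 may be further configured to obtain a conditional trigger flag, and the client 700 may further include a detection module configured to detect, when the value of the conditional trigger flag is a first preset value, whether there is a trigger operation for the region associated with the overlay. The conditional trigger flag may be located in an overlay control structure used for user interaction control.
The region information associated with the overlay may be located in a media presentation description (MPD). In some feasible implementations, the region information associated with the overlay may be attribute information of the overlay descriptor in the MPD.
It can be understood that the functions of the modules of the client 700 in this embodiment may be implemented according to the methods in the foregoing method embodiments. For the specific implementation process, reference may be made to the related description of the foregoing method embodiments. Details are not repeated here.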
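The interplay of the obtaining, detection and display modules of client 700 can be sketched as follows. The sketch assumes that the first preset value of the conditional trigger flag is 1 and models the region test as a callable supplied by the caller; `OverlayClient`, `FIRST_PRESET_VALUE` and `on_click` are hypothetical names, and real modules would work on decoded media instead of booleans.

```python
from dataclasses import dataclass
from typing import Callable

FIRST_PRESET_VALUE = 1  # assumed value; the text only calls it "first preset value"


@dataclass
class OverlayClient:
    """Simplified stand-in for client 700."""
    region_contains: Callable[[float, float], bool]  # from the obtained region info
    condition_flag: int = FIRST_PRESET_VALUE         # obtained conditional trigger flag
    overlay_visible: bool = False

    def on_click(self, x: float, y: float) -> bool:
        """Handle a click and return whether the overlay is now displayed."""
        # Detection module: only active when the flag equals the first preset value.
        if self.condition_flag != FIRST_PRESET_VALUE:
            return self.overlay_visible
        if self.region_contains(x, y):
            # Display module: a trigger operation in the region shows the overlay.
            self.overlay_visible = True
        return self.overlay_visible
```

With a flag value other than the first preset value, clicks are ignored and the overlay state is left to whatever unconditional behaviour the client applies.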
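Referring to FIG. 8, a client 800 is provided. The client 800 may be an implementation of the various apparatuses described above and may include an obtaining module 801 and a display module 802, where:
The obtaining module 801 may be configured to obtain an overlay, a background video or background image, and region information associated with the overlay, where the region information associated with the overlay indicates the region associated with the overlay; and to obtain an initial status flag.
The obtaining module 801 may be the communication interface 640, the input/output interface 630, or the receiving apparatus described above.
The display module 802 may be configured to perform the following operations when the value of the initial status flag indicates that display of the overlay is off by default: display the background video or background image; and, upon detecting a trigger operation for the region associated with the overlay, superimpose the overlay on the background video or background image and display the superimposed video image. The displaying of the background video or background image may be performed only while no trigger operation for the region associated with the overlay has been detected.
The trigger operation for the region associated with the overlay may include a click operation within the region associated with the overlay. The display module 802 may be further configured to display, when the value of the initial status flag indicates that display of the overlay is off by default, prompt information asking whether to display the overlay by a click operation within the region associated with the overlay.
The display module 802 may be the display or the display apparatus described above.
The displaying of the prompt information asking whether to display the overlay by a click operation within the region associated with the overlay may be performed only upon detecting that at least part of the region associated with the overlay lies within the current user viewport.
The region information associated with the overlay and the initial status flag may be located in an overlay control structure.
The region information associated with the overlay and the initial status flag may be located in a media presentation description (MPD). In some feasible implementations, the region information associated with the overlay and the initial status flag may be attribute information of the overlay descriptor in the MPD.
In some feasible implementations, the obtaining module 801 may be further configured to obtain trigger type information. The trigger operation for the region associated with the overlay may include a trigger operation, indicated by the trigger type information, for the region associated with the overlay.
The trigger type information may be located in an overlay control structure.
The trigger type information may be located in a media presentation description (MPD). In some feasible implementations, the trigger type information may be attribute information of the overlay descriptor in the MPD.
The obtaining module 801 may be further configured to obtain a conditional trigger flag. The client may further include a detection module configured to detect, when the value of the conditional trigger flag is a first preset value, whether there is a trigger operation for the region associated with the overlay.
The conditional trigger flag may be located in an overlay control structure used for user interaction control.
It can be understood that the functions of the modules of the client 800 in this embodiment may be implemented according to the methods in the foregoing method embodiments. For the specific implementation process, reference may be made to the related description of the foregoing method embodiments. Details are not repeated here.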
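Referring to FIG. 9, a client 900 is provided. The client 900 may be an implementation of the various apparatuses described above and may include an obtaining module 901 and a display module 902, where:
The obtaining module 901 may be configured to obtain an overlay, a background video or background image, and region information associated with the overlay, where the region information associated with the overlay indicates the region associated with the overlay; and to obtain an initial status flag.
The obtaining module 901 may be the communication interface 640, the input/output interface 630, or the receiving apparatus described above.
The display module 902 may be configured to perform the following operations when the value of the initial status flag indicates that the overlay is displayed by default:
superimpose the overlay on the background video or background image and display the superimposed video image; and, upon detecting a trigger operation for the region associated with the overlay, display the background video or background image. The displaying of the superimposed video image may be performed only while no trigger operation for the region associated with the overlay has been detected.
The trigger operation for the region associated with the overlay may include a click operation within the region associated with the overlay. The display module 902 may be further configured to display, when the value of the initial status flag indicates that the overlay is displayed by default, prompt information asking whether to close the display of the overlay by a click operation within the region associated with the overlay.
The display module 902 may display the prompt information asking whether to close the display of the overlay by a click operation within the region associated with the overlay only upon detecting that at least part of the region associated with the overlay lies within the current user viewport.
The display module 902 may be the display or the display apparatus described above.
The region information associated with the overlay and the initial status flag may be located in an overlay control structure.
The region information associated with the overlay and the initial status flag may be located in a media presentation description (MPD). In some feasible implementations, the region information associated with the overlay and the initial status flag may be attribute information of the overlay descriptor in the MPD.
In some feasible implementations, the obtaining module 901 may be further configured to obtain trigger type information. The trigger operation for the region associated with the overlay may include a trigger operation, indicated by the trigger type information, for the region associated with the overlay.
The trigger type information may be located in an overlay control structure.
The trigger type information may be located in a media presentation description (MPD). In some feasible implementations, the trigger type information may be attribute information of the overlay descriptor in the MPD.
In some feasible implementations, the obtaining module 901 may be further configured to obtain a conditional trigger flag, and the client may further include a detection module configured to detect, when the value of the conditional trigger flag is a first preset value, whether there is a trigger operation for the region associated with the overlay.
The conditional trigger flag may be located in an overlay control structure used for user interaction control.
It can be understood that the functions of the modules of the client 900 in this embodiment may be implemented according to the methods in the foregoing method embodiments. For the specific implementation process, reference may be made to the related description of the foregoing method embodiments. Details are not repeated here.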
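Clients 800 and 900 differ only in the default state signalled by the initial status flag: the former starts with the overlay hidden and shows it on a trigger, the latter starts with the overlay shown and hides it on a trigger. A minimal sketch of that behaviour follows. The flag encoding (1 for shown by default, 0 for hidden by default) is an assumption, frames are represented by strings for brevity, and `OverlayPresenter` is a hypothetical name.

```python
from dataclasses import dataclass

SHOWN_BY_DEFAULT = 1   # assumed encoding of the initial status flag
HIDDEN_BY_DEFAULT = 0  # assumed encoding of the initial status flag


@dataclass
class OverlayPresenter:
    """Display logic shared by the client 800 and client 900 variants."""
    initial_status: int
    triggered: bool = False

    def on_trigger(self) -> None:
        """Record a trigger operation detected in the associated region."""
        self.triggered = True

    @property
    def overlay_visible(self) -> bool:
        shown_initially = self.initial_status == SHOWN_BY_DEFAULT
        # A trigger inverts the default: it shows a hidden overlay
        # and hides an overlay that was displayed by default.
        return shown_initially != self.triggered

    def render(self, background: str, overlay: str) -> str:
        """Return the background alone or the background with the overlay on top."""
        return f"{background}+{overlay}" if self.overlay_visible else background
```

This sketch handles a single trigger; a client that lets the user toggle repeatedly would flip `triggered` on every trigger operation instead of setting it once.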
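Referring to FIG. 10, a server 1000 is provided. The server 1000 may be an implementation of the various apparatuses described above and may include a determining module 1001 and a sending module 1002, where:
The determining module 1001 may be configured to determine region information associated with an overlay, where the region information associated with the overlay indicates the region associated with the overlay.
The sending module 1002 may be configured to send the region information associated with the overlay to a client.
The sending module 1002 may be the sending and transmission apparatus, the communication interface 640, or the input/output interface 630 described above.
The region information associated with the overlay may be located in an overlay control structure.
The region information associated with the overlay may include position information of the region associated with the overlay.
The region information associated with the overlay may include the width and the height of the region associated with the overlay.
The region information associated with the overlay may be plane region information or sphere region information.
The sending module may be further configured to send trigger type information to the client, where the trigger type information indicates the trigger type of the trigger operation used to trigger display or closing of the overlay.
The trigger type information may be located in an overlay control structure.
The trigger type information may be located in a media presentation description (MPD). In some feasible implementations, the trigger type information may be attribute information of the overlay descriptor in the MPD.
In some feasible implementations, the sending module 1002 may be further configured to send a conditional trigger flag to the client. A value of the conditional trigger flag equal to a first preset value indicates that displaying or closing the overlay is controlled by the trigger operation used to trigger display or closing of the overlay.
A value of the conditional trigger flag equal to a second preset value may indicate that displaying or closing the overlay is not controlled by the trigger operation used to trigger display or closing of the overlay.
The conditional trigger flag may be located in an overlay control structure used for user interaction control.
The region information associated with the overlay may be located in a media presentation description (MPD). In some feasible implementations, the region information associated with the overlay may be attribute information of the overlay descriptor in the MPD.
In some feasible implementations, the sending module 1002 may be further configured to send an initial status flag to the client, where the initial status flag indicates that the overlay is in a displayed state in the initial state, or indicates that the overlay is in a display-closed state in the initial state.
The initial status flag may be located in an overlay control structure.
The initial status flag may be located in a media presentation description (MPD). In some feasible implementations, the initial status flag may be attribute information of the overlay descriptor in the MPD.
It can be understood that the functions of the modules of the server 1000 in this embodiment may be implemented according to the methods in the foregoing method embodiments. For the specific implementation process, reference may be made to the related description of the foregoing method embodiments. Details are not repeated here.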
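On the server side, the determining and sending steps amount to writing the region information, the initial status flag and the trigger type information into the signalling sent to the client. The sketch below builds an overlay descriptor as an XML string. The attribute names and the scheme URI come from the embodiments above, while the `SupplementalProperty` carrier element, the element nesting and the function name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

OVERLAY_SCHEME = "urn:mpeg:mpegI:omaf:2018:ovly"  # scheme URI quoted in the text


def build_overlay_descriptor(region: dict, initial_status: int,
                             condition_type: int) -> str:
    """Serialise the overlay signalling as a descriptor (assumed XML layout)."""
    prop = ET.Element("SupplementalProperty", {"schemeIdUri": OVERLAY_SCHEME})
    info = ET.SubElement(prop, "OverlayInfo", {
        "initial_status": str(initial_status),
        "condition_type": str(condition_type),
    })
    # One attribute per region field, e.g. object_x, object_y, object_width, object_height.
    ET.SubElement(info, "associated2DRegion",
                  {name: str(value) for name, value in region.items()})
    return ET.tostring(prop, encoding="unicode")
```

The resulting string would be placed inside the adaptation set that carries the overlay; how the MPD is assembled and delivered is outside the scope of this sketch.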
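All or some of the modules of the foregoing devices may also be software modules that are read by a processor to perform the related methods, or may be units in a chip. This is not limited here.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses and units described above, reference may be made to the corresponding processes in the foregoing method embodiments. Details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
When the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and may include several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium may include any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.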

Claims (48)

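  1. A method for processing media data, comprising:
    obtaining an overlay and region information associated with the overlay, wherein the region information associated with the overlay indicates a region associated with the overlay; and
    displaying the overlay upon detecting a trigger operation for the region associated with the overlay.
  2. The method according to claim 1, wherein the trigger operation for the region associated with the overlay comprises: a click operation within the region associated with the overlay, or a trigger operation in which the user's line of sight falls within the region associated with the overlay.
  3. The method according to claim 1 or 2, wherein the region information associated with the overlay is located in an overlay control structure.
  4. The method according to any one of claims 1 to 3, wherein the region information associated with the overlay comprises position information of the region associated with the overlay.
  5. The method according to any one of claims 1 to 4, wherein the region information associated with the overlay comprises the width of the region associated with the overlay and the height of the region associated with the overlay.
  6. The method according to any one of claims 1 to 5, wherein the region information associated with the overlay is plane region information or sphere region information.
  7. The method according to any one of claims 1 to 6, wherein the method further comprises: obtaining trigger type information; and
    the trigger operation for the region associated with the overlay comprises a trigger operation, indicated by the trigger type information, for the region associated with the overlay.
  8. The method according to claim 7, wherein the trigger type information is located in an overlay control structure.
  9. The method according to claim 7, wherein the trigger type information is located in a media presentation description (MPD).
  10. The method according to claim 9, wherein the trigger type information is attribute information of an overlay descriptor in the MPD.
  11. The method according to any one of claims 1 to 10, wherein the method further comprises: obtaining a conditional trigger flag, and, when the value of the conditional trigger flag is a first preset value, detecting whether there is a trigger operation for the region associated with the overlay.
  12. The method according to claim 11, wherein the conditional trigger flag is located in an overlay control structure used for user interaction control.
  13. The method according to any one of claims 1 to 12, wherein the region information associated with the overlay is located in a media presentation description (MPD).
  14. The method according to claim 13, wherein the region information associated with the overlay is attribute information of an overlay descriptor in the MPD.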
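  15. A method for processing media data, comprising:
    obtaining an overlay, a background video or background image, and region information associated with the overlay, wherein the region information associated with the overlay indicates a region associated with the overlay;
    obtaining an initial status flag; and
    when the value of the initial status flag indicates that display of the overlay is off by default, performing the following operations:
    displaying the background video or background image; and
    upon detecting a trigger operation for the region associated with the overlay, superimposing the overlay on the background video or background image and displaying the superimposed video image.
  16. The method according to claim 15, wherein the region information associated with the overlay and the initial status flag are located in an overlay control structure.
  17. The method according to claim 15, wherein the region information associated with the overlay and the initial status flag are located in a media presentation description (MPD).
  18. The method according to claim 17, wherein the region information associated with the overlay and the initial status flag are attribute information of an overlay descriptor in the MPD.
  19. The method according to any one of claims 15 to 18, wherein the method further comprises: obtaining trigger type information; and
    the trigger operation for the region associated with the overlay comprises a trigger operation, indicated by the trigger type information, for the region associated with the overlay.
  20. The method according to claim 19, wherein the trigger type information is located in an overlay control structure.
  21. The method according to claim 19, wherein the trigger type information is located in a media presentation description (MPD).
  22. The method according to claim 21, wherein the trigger type information is attribute information of an overlay descriptor in the MPD.
  23. The method according to any one of claims 15 to 22, wherein the method further comprises: obtaining a conditional trigger flag, and, when the value of the conditional trigger flag is a first preset value, detecting whether there is a trigger operation for the region associated with the overlay.
  24. The method according to claim 23, wherein the conditional trigger flag is located in an overlay control structure used for user interaction control.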
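  25. A client, comprising:
    an obtaining module, configured to obtain an overlay and region information associated with the overlay, wherein the region information associated with the overlay indicates a region associated with the overlay; and
    a display module, configured to display the overlay upon detecting a trigger operation for the region associated with the overlay.
  26. The client according to claim 25, wherein the trigger operation for the region associated with the overlay comprises: a click operation within the region associated with the overlay, or a trigger operation in which the user's line of sight falls within the region associated with the overlay.
  27. The client according to claim 25 or 26, wherein the region information associated with the overlay is located in an overlay control structure.
  28. The client according to any one of claims 25 to 27, wherein the region information associated with the overlay comprises position information of the region associated with the overlay.
  29. The client according to any one of claims 25 to 28, wherein the region information associated with the overlay comprises the width of the region associated with the overlay and the height of the region associated with the overlay.
  30. The client according to any one of claims 25 to 29, wherein the region information associated with the overlay is plane region information or sphere region information.
  31. The client according to any one of claims 25 to 30, wherein the obtaining module is further configured to obtain trigger type information; and
    the trigger operation for the region associated with the overlay comprises a trigger operation, indicated by the trigger type information, for the region associated with the overlay.
  32. The client according to claim 31, wherein the trigger type information is located in an overlay control structure.
  33. The client according to claim 31, wherein the trigger type information is located in a media presentation description (MPD).
  34. The client according to claim 33, wherein the trigger type information is attribute information of an overlay descriptor in the MPD.
  35. The client according to any one of claims 25 to 34, wherein the obtaining module is further configured to obtain a conditional trigger flag; and the client further comprises a detection module, configured to detect, when the value of the conditional trigger flag is a first preset value, whether there is a trigger operation for the region associated with the overlay.
  36. The client according to claim 35, wherein the conditional trigger flag is located in an overlay control structure used for user interaction control.
  37. The client according to any one of claims 25 to 36, wherein the region information associated with the overlay is located in a media presentation description (MPD).
  38. The client according to claim 37, wherein the region information associated with the overlay is attribute information of an overlay descriptor in the MPD.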
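  39. A client for processing media data, comprising:
    an obtaining module, configured to obtain an overlay, a background video or background image, and region information associated with the overlay, wherein the region information associated with the overlay indicates a region associated with the overlay; and to obtain an initial status flag; and
    a display module, configured to perform the following operations when the value of the initial status flag indicates that display of the overlay is off by default:
    displaying the background video or background image; and
    upon detecting a trigger operation for the region associated with the overlay, superimposing the overlay on the background video or background image and displaying the superimposed video image.
  40. The client according to claim 39, wherein the region information associated with the overlay and the initial status flag are located in an overlay control structure.
  41. The client according to claim 39, wherein the region information associated with the overlay and the initial status flag are located in a media presentation description (MPD).
  42. The client according to claim 41, wherein the region information associated with the overlay and the initial status flag are attribute information of an overlay descriptor in the MPD.
  43. The client according to any one of claims 39 to 42, wherein the obtaining module is further configured to obtain trigger type information; and
    the trigger operation for the region associated with the overlay comprises a trigger operation, indicated by the trigger type information, for the region associated with the overlay.
  44. The client according to claim 43, wherein the trigger type information is located in an overlay control structure.
  45. The client according to claim 43, wherein the trigger type information is located in a media presentation description (MPD).
  46. The client according to claim 45, wherein the trigger type information is attribute information of an overlay descriptor in the MPD.
  47. The client according to any one of claims 39 to 46, wherein the obtaining module is further configured to obtain a conditional trigger flag; and
    the client further comprises a detection module, configured to detect, when the value of the conditional trigger flag is a first preset value, whether there is a trigger operation for the region associated with the overlay.
  48. The client according to claim 47, wherein the conditional trigger flag is located in an overlay control structure used for user interaction control.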
PCT/CN2018/125807 2018-09-27 2018-12-29 Method for Processing Media Data, Client, and Server WO2020062700A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202310199909.3A CN116248947A (zh) 2018-09-27 2018-12-29 Method for Processing Media Data, Client, and Server
CN201880098171.9A CN112771878B (zh) 2018-09-27 2018-12-29 Method for Processing Media Data, Client, and Server
EP18935304.8A EP3846481A4 (en) 2018-09-27 2018-12-29 MULTIMEDIA, CLIENT AND SERVER DATA PROCESSING PROCESS
US17/214,056 US20210218908A1 (en) 2018-09-27 2021-03-26 Method for Processing Media Data, Client, and Server

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862737892P 2018-09-27 2018-09-27
US62/737,892 2018-09-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/214,056 Continuation US20210218908A1 (en) 2018-09-27 2021-03-26 Method for Processing Media Data, Client, and Server

Publications (1)

Publication Number Publication Date
WO2020062700A1 true WO2020062700A1 (zh) 2020-04-02

Family

ID=69952744

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/125807 WO2020062700A1 (zh) 2018-09-27 2018-12-29 Method for Processing Media Data, Client, and Server

Country Status (4)

Country Link
US (1) US20210218908A1 (zh)
EP (1) EP3846481A4 (zh)
CN (2) CN116248947A (zh)
WO (1) WO2020062700A1 (zh)

Also Published As

Publication number Publication date
CN112771878A (zh) 2021-05-07
US20210218908A1 (en) 2021-07-15
CN112771878B (zh) 2023-01-13
EP3846481A1 (en) 2021-07-07
CN116248947A (zh) 2023-06-09
EP3846481A4 (en) 2021-11-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18935304

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018935304

Country of ref document: EP

Effective date: 20210330