WO2020063850A1 - Method for processing media data and terminal and server - Google Patents

Publication number
WO2020063850A1
Authority
WO
WIPO (PCT)
Prior art keywords
overlay
information
group
overlays
operation function
Prior art date
Application number
PCT/CN2019/108514
Other languages
French (fr)
Chinese (zh)
Inventor
宋翼
范宇群
邸佩云
王业奎
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2020063850A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/485 End-user interface for client configuration

Definitions

  • the embodiments of the present application relate to the technical field of streaming media transmission, and in particular, to a method, terminal, and server for processing media data.
  • The ISO/IEC 23090-2 standard specification is also called the OMAF (Omnidirectional Media Format) standard specification.
  • This specification defines a media application format that can implement omnidirectional media presentation in applications. Omnidirectional media mainly refers to panoramic video (360-degree video) and related audio.
  • The OMAF specification first specifies a list of projection methods that can be used to convert spherical video into two-dimensional video. Second, it specifies how to use the ISO base media file format (ISOBMFF) to store omnidirectional media and the metadata associated with that media. Third, it specifies how to encapsulate omnidirectional media data and transmit it in a streaming media system, for example through Dynamic Adaptive Streaming over HTTP (DASH), the dynamic adaptive streaming transmission defined in the ISO/IEC 23009-1 standard.
  • The basic data structure of the overlay defines some basic properties of the overlay structure (for example, the number of overlays, ID numbers, control symbols, and control structures). The specific function of each control structure is defined in the semantics of the control symbol syntax element overlay_control_flag. After the terminal parses the overlay, it can determine how to handle the overlay based on these syntax elements.
  • The embodiments of the present application provide a method, a terminal, and a server for processing media data, so as to reduce the operational complexity caused by requiring users to perform the same operation on each overlay to achieve a given purpose, make operations on overlays more efficient, and improve the user's subjective experience.
  • An embodiment of the present application provides a method for processing media data, including: a terminal receives at least two overlays corresponding to media data, where each overlay corresponds to first information, or to second information and third information.
  • The first information includes the group identification information of the overlay.
  • The second information is used to indicate the operation function corresponding to the overlay.
  • The third information is used to indicate the group identification information of the overlay, or the identification information of other overlays that belong to the same group as the overlay.
  • When the overlay corresponds to the first information, the terminal processes the at least two overlays according to the first information of the at least two overlays. When the overlay corresponds to the second information and the third information, the terminal processes the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
  • An embodiment of the present application provides a method for processing media data.
  • A terminal can process one or more overlays that have the same group identification information by using the first information corresponding to each of the at least two overlays.
  • Compared with a scheme in which each overlay must be processed one by one, this reduces the complexity of operations and improves the user's subjective experience.
  • The terminal may perform the same processing on the overlays belonging to the same group according to the first information of the at least two overlays. For example, it may display all overlays in the same group at group granularity, close all overlays in the same group at group granularity, or scale all overlays in the same group at group granularity.
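The group-granularity processing described above can be sketched as follows. This is a minimal illustration, not the terminal logic of the application: the `Overlay` class, its `group_id` field (standing in for the first information), and the `show` and `close` helpers are all hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Overlay:
    overlay_id: int
    group_id: int          # hypothetical stand-in for the "first information"
    visible: bool = False


def group_overlays(overlays):
    """Bucket overlays by their group identification information."""
    groups = defaultdict(list)
    for ov in overlays:
        groups[ov.group_id].append(ov)
    return dict(groups)


def apply_to_group(groups, group_id, operation):
    """Apply one operation function to every overlay in a group."""
    for ov in groups.get(group_id, []):
        operation(ov)


def show(ov):
    ov.visible = True


def close(ov):
    ov.visible = False


overlays = [Overlay(1, group_id=10), Overlay(2, group_id=10), Overlay(3, group_id=20)]
groups = group_overlays(overlays)

# One trigger on group 10 displays both of its overlays; group 20 is untouched.
apply_to_group(groups, 10, show)
```

A single call replaces one user action per overlay, which is the reduction in operation complexity that the embodiment aims for.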
  • The terminal processing the at least two overlays according to the first information of the at least two overlays includes: the terminal displays at least one group, together with information indicating the operation function corresponding to each group and the overlays belonging to each group. The at least one group is determined by the first information corresponding to each of the at least two overlays; the operation function corresponding to a group is determined by the overlay structure included in the overlays in that group.
  • The terminal processing the at least two overlays according to the second information and the third information corresponding to the at least two overlays includes: the terminal displays at least one group, together with information indicating the operation function corresponding to each group and the overlays belonging to each group. The at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay-related area control structure included in the overlays in that group.
  • The terminal may display, on the display interface, at least one group and information indicating the operation function corresponding to each group. It may also display information indicating the overlays in each group, so that the user can easily see the operation function corresponding to each group and which overlays each group contains.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, includes: when any group is triggered, all overlays in that group respond to the operation function of that group.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, includes: when any overlay in any group is triggered, the other overlays belonging to the same group as that overlay also respond to the operation function of that group.
  • the operation function is display.
  • the method provided in the embodiment of the present application further includes: when any one of the at least one group is triggered, the overlay in any one of the groups is displayed.
  • The terminal can display all overlays belonging to the same group at group granularity based on the trigger operation.
  • Display is taken here as an example of the operation function of the overlay. In practice, the operation function of the overlay may also be size scaling, position change, and the like.
  • The operation function is close.
  • The method provided in the embodiment of the present application further includes: when any overlay is triggered, that overlay displayed on the terminal and the other overlays that belong to the same group as that overlay are closed.
  • The terminal can close all overlays belonging to the same group at group granularity based on the trigger operation.
  • Compared with a scheme in which each overlay must be closed one by one, this reduces the operation complexity.
  • the overlay also corresponds to fourth information.
  • the fourth information is used to indicate that, when the first operation function is performed on the overlay, all the overlays in the group to which the overlay belongs respond to the first operation function.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, further includes: the terminal processes the at least two overlays according to the first information and the fourth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the fourth information of the at least two overlays.
  • In this way, the terminal can determine that the overlays are to be processed as a group operation.
  • each group in the at least one group further corresponds to indication information used to indicate a group operation.
  • the terminal may further display, according to the fourth information, instruction information indicating the group operation corresponding to each group.
  • the overlay also corresponds to the fifth information.
  • The fifth information is used to indicate that, when the first operation function is performed on the overlay, either all the overlays in the group to which the overlay belongs respond to the first operation function, or only the overlay responds to the first operation function.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, further includes: the terminal processes the at least two overlays according to the first information and the fifth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the fifth information of the at least two overlays.
  • In this way, the terminal can determine that the overlay can be processed either as a group operation or as an individual operation.
  • each group in the at least one group further corresponds to instruction information for indicating a group operation and instruction information for indicating an independent operation. This is convenient for choosing whether to handle overlays in a group operation or in a single operation.
  • the terminal may further display, according to the fifth information, the instruction information indicating the group operation and the instruction information indicating the separate operation corresponding to each group.
  • the terminal may determine whether to operate alone or in groups according to the fourth information and the fifth information corresponding to the overlay.
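The choice between group operation and separate operation described above can be sketched as a small target-resolution function. The field names `group_only` (for the fourth information) and `group_or_single` (for the fifth information), and the `prefer_group` parameter modelling the user's choice, are hypothetical. The fallback for an overlay that carries neither flag is also an assumption, since the text does not specify it.

```python
from dataclasses import dataclass


@dataclass
class Overlay:
    overlay_id: int
    group_id: int
    group_only: bool = False       # hypothetical stand-in for the fourth information
    group_or_single: bool = False  # hypothetical stand-in for the fifth information


def resolve_targets(overlays, triggered_id, prefer_group=True):
    """Return the ids of the overlays that should respond to an operation."""
    triggered = next(ov for ov in overlays if ov.overlay_id == triggered_id)
    same_group = [ov.overlay_id for ov in overlays
                  if ov.group_id == triggered.group_id]
    if triggered.group_only:
        # Fourth information: the whole group always responds.
        return same_group
    if triggered.group_or_single:
        # Fifth information: both modes are available, the user picks one.
        return same_group if prefer_group else [triggered_id]
    # Assumption: without either flag, only the triggered overlay responds.
    return [triggered_id]


overlays = [
    Overlay(1, group_id=10, group_only=True),
    Overlay(2, group_id=10, group_only=True),
    Overlay(3, group_id=20, group_or_single=True),
    Overlay(4, group_id=20, group_or_single=True),
]
```

For example, `resolve_targets(overlays, 3, prefer_group=False)` selects only overlay 3, while the default call selects the whole of group 20.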
  • the overlay when the overlay corresponds to the first information, the overlay also corresponds to the sixth information.
  • the sixth information is used to indicate an operation function corresponding to the overlay.
  • The terminal processing the at least two overlays according to the first information of the at least two overlays includes: the terminal processes the at least two overlays according to the first information and the sixth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the sixth information of the at least two overlays.
  • After the terminal parses the sixth information, it can determine the operation function of each overlay based on the operation function indicated by the sixth information.
  • The operation function of the group may also be determined based on the sixth information.
  • the second information and the third information are carried in an overlay file format.
  • the second information and the third information are carried in the OMAF file format of the overlay.
  • The file format includes an overlay structure, an overlay-related area control structure, and an overlay group box located in the overlay structure. The second information is located in the overlay-related area control structure, and the third information is located in the overlay group box. It should be understood that, in this case, the terminal can obtain the overlay-related area control structure and the third information by parsing the overlay structure. It then determines the operation function of the overlay according to the overlay-related area control structure, and obtains the group identification information of the overlay according to the third information in order to determine the group to which the overlay belongs.
  • the third information is located in an overlay control structure included in the overlay structure.
  • the first information is carried in an overlay file format. It should be understood that the first information is located in an overlay group box included in the overlay. At this time, the overlay structure may not carry an overlay-related area control structure. It should be understood that when the first information is carried in the file format, after receiving the overlay, the terminal may obtain the group identification information of the overlay by analyzing the overlay group box of the overlay, and then determine the group to which the overlay belongs according to the group identification information.
  • The third information is carried in the supplemental enhancement information (SEI) of the overlay code stream corresponding to the overlay, and the second information is carried in an overlay-related area control structure of the overlay.
  • the overlay structure of the overlay at this time includes an overlay-related area control structure.
  • the operation function corresponding to each overlay can be indicated by the overlay-related area control structure.
  • the overlay-associated area control structure may indicate that the operation function is displaying or closing.
  • the first information is carried in an SEI of an overlay code stream corresponding to the overlay.
  • the overlay structure of the overlay may not include an overlay-related area control structure.
  • the overlay structure may include an overlay control symbol for indicating an operation function corresponding to the overlay.
  • the SEI payload type is used to indicate that the SEI carries the group identification information of the overlay.
  • When the terminal processes the overlay and parses the SEI, it can determine from the payload type that the SEI carries the group identification information of the overlay, and then further parse the SEI to obtain the group identification information of the overlay.
  • The SEI payload type is also used to indicate the attribute of the group.
  • The group attribute indicated by the SEI payload type is a common display group or a common interaction group. For example, if the group is a common interaction group, an interactive operation may be performed on the group. If it is a common display group, the overlays in the group can be displayed or closed as a group operation.
  • The third information is carried in a Media Presentation Description (MPD) of a media data stream including the overlay, and the second information is carried in an overlay-related area control structure of the overlay.
  • the overlay structure of the overlay at this time includes an overlay-related area control structure.
  • the operation function corresponding to each overlay can be indicated by the overlay-related area control structure.
  • the overlay-associated area control structure may indicate that the operation function is displaying or closing.
  • the first information is carried in an MPD of a media data stream including an overlay.
  • the overlay's overlay structure may not include an overlay-related area control structure.
  • the overlay structure may include an overlay control symbol for indicating an operation function corresponding to the overlay.
  • The first information or the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • the server when the first information or the third information is carried in the MPD, the server also needs to send the MPD corresponding to the media data including at least two overlays before sending the code stream to the terminal.
  • the MPD includes first information or third information of each of at least two overlays. This embodiment of the present application does not limit this.
  • the code stream here includes information of at least two overlays. The information of each of the at least two overlays is the information defined in the overlay structure corresponding to the overlay.
  • the group identification information of the overlay includes at least one group identification information. It should be understood that when the overlay corresponds to at least two group identification information, one overlay may belong to at least two groups.
  • The overlay corresponds to multiple groups. When the overlay is triggered, the overlay responds to the operation function corresponding to each of the multiple groups, or all overlays in the multiple groups respond to the operation function corresponding to their respective group.
  • For example, the overlay belongs to a first group and a second group, and different groups correspond to different operation functions. When the overlay is triggered, all overlays in the first group respond to the operation function corresponding to the first group, and all overlays in the second group respond to the operation function corresponding to the second group.
  • the fifth information carried in the overlay is used to indicate that group operations and separate operations are available.
  • When the group operation is selected, the terminal processes the at least two overlays according to the group operation.
  • When the separate operation is selected, the terminal processes the at least two overlays according to the separate operation. For the specific process, refer to the corresponding description above; details are not repeated here.
  • the overlay in the first group when the overlay in the first group is triggered, the overlay responds to the operation function corresponding to the first group, and the overlay in the second group responds to the operation corresponding to the second group.
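The multi-group case above, in which one overlay carries several pieces of group identification information, can be sketched as follows. The `Overlay` class, the `group_ids` field, the `group_ops` mapping, and the operation names are hypothetical. The sketch implements the variant in which every overlay of each affected group responds to that group's own operation function.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Overlay:
    overlay_id: int
    group_ids: Tuple[int, ...]  # an overlay may belong to several groups


def trigger(overlays, group_ops, triggered_id):
    """List (overlay_id, operation) pairs produced by triggering one overlay."""
    triggered = next(ov for ov in overlays if ov.overlay_id == triggered_id)
    responses = []
    for gid in triggered.group_ids:
        for ov in overlays:
            if gid in ov.group_ids:
                # Each member responds with the operation of the shared group.
                responses.append((ov.overlay_id, group_ops[gid]))
    return responses


overlays = [Overlay(1, (1, 2)), Overlay(2, (1,)), Overlay(3, (2,))]
group_ops = {1: "display", 2: "scale"}  # hypothetical operation functions

# Overlay 1 is in both groups, so triggering it fans out to overlays 2 and 3.
responses = trigger(overlays, group_ops, 1)
```

Triggering overlay 2 instead only affects the first group, because overlay 2 carries a single group identifier.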
  • The file format of the overlay further includes an overlay group box, and the overlay group box carries name information of the overlay group; the method further includes: the terminal displays the group name indicated by the name information of the overlay group.
  • An embodiment of the present application provides a method for processing media data, including: the server obtains the media data; the server processes the media data to obtain at least two overlays corresponding to the media data, where each overlay corresponds to first information, or to second information and third information. The first information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay; the second information is used to indicate the operation function corresponding to the overlay; the third information is used to indicate the group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay. The overlays in the same group correspond to the same operation function. The server sends the at least two overlays to the terminal.
  • the processing of the media data by the server may refer to encoding the media data and encapsulating the media data stream obtained after the encoding.
  • After the server obtains the media data, it can process the media data so that the at least two overlays obtained after processing correspond to group identification information, or to group identification information and operation functions.
  • the terminal when the terminal obtains one or more overlays, it can process at least two overlays based on the first information, or process at least two overlays based on the second information and the third information. Since the first information and the third information both indicate group identification information, the terminal can process at least two overlays with the group granularity.
  • the second information and the third information are carried in an overlay file format.
  • the server may encode the media data and use the video file format (for example, the OMAF standard file format) to encapsulate the media data stream obtained after the encoding. Therefore, the second information and the third information are carried in a file format obtained by encapsulating the overlay.
  • the file format includes an overlay structure, an overlay-related area control structure located in the overlay structure, and an overlay group box.
  • the third information is located in the overlay group box, and the second information is located in the overlay association area control structure.
  • the third information is located in an overlay control structure included in the overlay structure.
  • the first information is carried in an overlay file format.
  • The server may encode the media data and use the video file format (for example, the OMAF standard file format) to encapsulate the media data stream obtained after the encoding, so that the first information is carried in the file format obtained after the overlay is encapsulated.
  • the file format includes an overlay group box, and the first information is located in the overlay group box. It should be understood that the file format at this time also includes an overlay structure.
  • The third information is carried in the supplemental enhancement information (SEI) of the overlay code stream corresponding to the overlay, and the second information is carried in the overlay-related area control structure of the overlay.
  • The server may encode the media data to obtain a media data stream, and then encode the one or more overlays included in the media data to obtain an overlay code stream corresponding to each overlay.
  • The server carries the third information in the SEI included in the overlay code stream corresponding to the overlay.
  • The encoded media data stream and the overlay code stream corresponding to each overlay are encapsulated in a video file format (for example, the OMAF standard file format), so that the second information is carried in the file format obtained by encapsulating the overlay.
  • the first information is carried in an SEI of an overlay code stream corresponding to the overlay.
  • The server may encode the media data to obtain a media data stream, and then encode the one or more overlays included in the media data to obtain an overlay code stream corresponding to each overlay.
  • The server carries the third information in the SEI included in the overlay code stream corresponding to the overlay.
  • The encoded media data stream and the overlay code stream corresponding to each overlay are encapsulated.
  • The encapsulated overlay has an overlay structure.
  • The payload type of the SEI is used to indicate that the SEI carries the group identification information of the overlay. It should be understood that the payload type of the SEI may also be used to indicate the attributes of the group.
  • The third information is carried in a media presentation description (MPD) corresponding to the media data stream containing the overlay, and the second information is carried in an overlay-related area control structure of the overlay.
  • The server may encapsulate the overlay based on Dynamic Adaptive Streaming over HTTP (DASH) to obtain the MPD.
  • The third information is then carried in the MPD.
  • The third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • The first information is carried in a media presentation description (MPD) corresponding to a media data stream containing an overlay.
  • The encapsulated overlay has an overlay structure.
  • The first information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • The overlay also corresponds to fourth information, and the fourth information is used to indicate that, when the first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function. It should be understood that the terminal can thus determine that the overlays are to be operated on as a group.
  • The overlay also corresponds to fifth information, and the fifth information is used to indicate that, when the first operation function is performed on the overlay, either all overlays in the group to which the overlay belongs respond to the first operation function, or the overlay responds to the first operation function. It should be understood that, in this way, the terminal can determine that the overlay supports a group operation or an independent operation.
  • the overlay when the overlay corresponds to the first information, the overlay also corresponds to the sixth information, and the sixth information is used to indicate an operation function corresponding to the overlay.
  • the file format of the overlay further includes: an overlay group box, and the overlay group box carries name information of the overlay group.
  • An embodiment of the present application provides a terminal, and the terminal includes a module for executing the method in any one of the foregoing implementation manners of the first aspect.
  • A terminal is a device capable of presenting media data (e.g., video images) and/or one or more overlays to a user.
  • an embodiment of the present application provides a server, and the server includes a module for executing a method in any one of the foregoing implementation manners of the second aspect.
  • the server is a device capable of storing media data and processing one or more overlays corresponding to the media data.
  • The server may provide video images and the processed one or more overlays to the terminal, so that the terminal can present the media data and the one or more overlays to the user.
  • A terminal is provided, including a non-volatile memory and a processor coupled to each other, where the processor is configured to call program code stored in the memory to execute some or all of the steps of the method in any implementation manner of the first aspect.
  • A server is provided, including a non-volatile memory and a processor coupled to each other, where the processor is configured to call program code stored in the memory to execute some or all of the steps of the method in any implementation manner of the second aspect.
  • A computer-readable storage medium is provided that stores program code, where the program code includes instructions for performing some or all of the steps of the method in any implementation manner of the first aspect.
  • A computer-readable storage medium is provided that stores program code, where the program code includes instructions for performing some or all of the steps of the method in any implementation manner of the second aspect.
  • A computer program product is provided, and when the computer program product runs on a computer, the computer is caused to execute instructions of some or all steps of the method in any one of the implementation manners of the first aspect.
  • A computer program product is provided, and when the computer program product runs on a computer, the computer is caused to execute instructions of some or all steps of the method in any one of the implementation manners of the second aspect.
  • FIG. 1 is a schematic diagram of a communication system according to an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a communication device according to an embodiment of the present application.
  • FIG. 3 is a first schematic flowchart of a media data processing method according to an embodiment of the present application.
  • FIG. 4 is a second schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a display interface according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another display interface according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of a device for processing media data according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of another apparatus for processing media data according to an embodiment of the present application.
  • Panoramic video: also known as 360-degree panoramic video, it is composed of a series of panoramic pictures whose content covers the entire sphere surface in three-dimensional space. It is video shot in all directions over 360 degrees using a 3D camera. When watching the video, the user can freely adjust the viewing direction up, down, left, and right.
  • MPD: media presentation description.
  • Track: defined in the standard ISO/IEC 14496-12 as a "timed sequence of related samples (q.v.) in an ISO media file", that is, a time-ordered sequence of related samples in an ISO media file. For media data, a track corresponds to a sequence of images or sampled audio; for hint tracks, a track corresponds to a streaming channel.
  • A track is a series of timed samples encapsulated according to ISOBMFF, for example a video track.
  • The samples of a video track are the code streams generated by the video encoder for each frame; this data is encapsulated according to the ISOBMFF specification to produce samples.
  • Sample: for example, an individual frame of video, a series of video frames in decoding order, or a compressed section of audio in decoding order; in hint tracks, a sample defines the formation of one or more streaming packets.
  • Box: defined in the ISO/IEC 14496-12 standard as an "object-oriented building block defined by a unique type identifier and length".
  • ISOBMFF files are made up of multiple boxes, and boxes can contain other boxes.
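  • The nesting of boxes can be illustrated with a minimal sketch. It assumes the standard ISOBMFF box header (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the container"). The set `CONTAINER_TYPES` and the helper names `iter_boxes`, `list_box_paths`, and `make_box` are illustrative assumptions, not part of the application.

```python
import struct
from typing import Iterator, List, Optional, Tuple

# Box types that this sketch treats as pure containers of child boxes (assumption).
CONTAINER_TYPES = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}


def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[bytes, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:      # a 64-bit size follows the type field
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:    # the box extends to the end of its container
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type, pos + header, pos + size
        pos += size


def list_box_paths(data: bytes, start: int = 0, end: Optional[int] = None,
                   prefix: str = "") -> List[str]:
    """Return slash-separated paths of all boxes, descending into containers."""
    paths = []
    for box_type, p_start, p_end in iter_boxes(data, start, end):
        path = prefix + box_type.decode("ascii")
        paths.append(path)
        if box_type in CONTAINER_TYPES:
            paths.extend(list_box_paths(data, p_start, p_end, path + "/"))
    return paths


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A toy file: an 'ftyp' box followed by moov > trak > mdia.
sample = make_box(b"ftyp", b"isom") + make_box(
    b"moov", make_box(b"trak", make_box(b"mdia")))
print(list_box_paths(sample))
```

  • The recursion mirrors the statement above: a container box's payload is itself a sequence of boxes, so the same reader is applied to it.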
  • SEI: supplemental enhancement information, a type of network abstraction layer unit (NALU) defined in the video coding standards (H.264, H.265).
  • NALU: network abstraction layer unit.
  • Overlay: media content superimposed on the background video (specifically, an additional layer of rendered video or picture superimposed on a certain area of the background video picture), as the term is used in the OMAF standard.
  • the overlay can also be information such as the name and age of an element displayed on the background video.
  • Background video (background visual media): video on which an overlay can be superimposed. OMAF gives the following definition and explanation: "piece of visual media, which is superimposed." That is, it is the piece of visual media on which the overlay is superimposed.
  • multiple means two or more.
  • “And/or” describes the association relationship of related objects and indicates that three relationships are possible. For example, A and/or B can represent: A exists alone, both A and B exist, or B exists alone, where A and B can be singular or plural.
  • the character “/” generally indicates that the related objects are an "or” relationship.
  • “At least one of the following” or similar expressions refers to any combination of these items, including any combination of single or plural items.
  • For example, at least one of a, b, or c can represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c can be single or multiple.
  • Words such as “first” and “second” are used to distinguish between identical or similar items having substantially the same functions and effects. Those skilled in the art can understand that the words “first” and “second” do not limit quantity or execution order, and that items labeled “first” and “second” are not necessarily different.
  • FIG. 1 shows a schematic diagram of a communication system provided by an embodiment of the present application.
  • the communication system includes a server 100 and at least one terminal 200 that communicates with the server 100.
  • the server 100 may be a media server having a function of processing panoramic video.
  • the terminal 200 may be a device having a function of playing a panoramic video.
  • the terminal 200 may be a network-connectable electronic device such as VR glasses, a mobile phone, a tablet, a television, or a computer.
  • the terminal 200 receives the data sent by the media server, decapsulates the code stream, and then decodes and displays it.
  • the server 100 includes a pre-encoding processor 1001, a video encoder 1002, a code stream encapsulation device 1003, and a sending and transmission device 1004.
  • the pre-encoding processor 1001 performs pre-processing on the panoramic video, such as image stitching, format conversion, etc., to convert the original panoramic video into a video that can be compression-encoded.
  • the video encoder 1002 performs compression encoding or transcoding on the panoramic video content obtained from the pre-encoding processor 1001 and outputs the encoded video bitstream.
  • the bitstream encapsulation device 1003 encapsulates the encoded bitstream data into a transportable file and transmits it to the terminal or the content distribution network through the network.
  • the server 100 may select the content to be transmitted for signal transmission according to the information (such as a user perspective) fed back by the terminal 200.
  • the terminal 200 includes a receiving device 2001, a code stream decapsulating device 2002, a video decoder 2003, and a display device 2004.
  • the receiving device 2001 is configured to receive media data sent by the server 100.
  • the code stream decapsulating device 2002 is used for decapsulating the media data received by the receiving device 2001 to obtain a video code stream and code stream information corresponding to the code stream.
  • the video decoder 2003 is used to decode a video code stream and output a video image frame for display and playback.
  • FIG. 2 is a schematic diagram of a hardware structure of an apparatus for processing media data according to an embodiment of the present application.
  • the apparatus for processing media data shown in FIG. 2 may be regarded as a computer device, and the apparatus for processing media data may be used as an implementation manner of the server 100 or the terminal 200 in the embodiment of the present application, or may be used as an embodiment of the embodiment of the present application.
  • the apparatus for processing media data includes a processor 110, a memory 120, an input / output interface 130, and a bus 150.
  • the apparatus for processing media data may further include a communication interface 140.
  • the apparatus for processing media data may further include a display 160 for displaying video data to be played, for example a background video and one or more overlays.
  • the processor 110, the memory 120, the input / output interface 130, the communication interface 140, and the display 160 implement a communication connection with each other through the bus 150.
  • the processor 110 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits that execute related programs, so as to implement the functions required by the modules in the server in the embodiment of the present application, or to execute the method for processing media data in the method embodiment of the present application.
  • the processor 110 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 110 or an instruction in the form of software.
  • the aforementioned processor 110 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The processor can implement or execute the various methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in combination with the embodiments of the present application may be directly implemented by a hardware decoding processor, or may be performed by using a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a mature storage medium such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, or an electrically erasable programmable memory, a register, and the like.
  • the storage medium is located in the memory 120, and the processor 110 reads the information in the memory 120 and, in conjunction with its hardware, completes the functions required by the modules included in the server in the embodiment of the present application, or performs the method for processing media data in the method embodiments of the present application.
  • the memory 120 may be a read-only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM).
  • the memory 120 may store an operating system and other application programs. When the functions required by the modules included in the server in the embodiment of the present application, or the method for processing media data in the method embodiment of the present application, are implemented by software or firmware, the program code for implementing the technical solution provided in the embodiment of the present application is stored in the memory 120, and the processor 110 performs the operations required by the modules included in the server 100, or executes the method for processing media data provided by the method embodiment of the present application.
  • the input / output interface 130 is used to receive input data and information, and output data such as operation results.
  • the communication interface 140 uses a transceiving device such as, but not limited to, a transceiver to implement communication between a device that processes media data and other devices or a communication network. It can be used as an obtaining module or a sending module in a device for processing media data.
  • the bus 150 may include a path for transmitting information between various components of a device that processes media data, such as the processor 110, the memory 120, the input / output interface 130, and the communication interface 140.
  • Although the apparatus for processing media data shown in FIG. 2 only shows the processor 110, the memory 120, the input / output interface 130, the communication interface 140, and the bus 150, those skilled in the art should understand that, in a specific implementation, the apparatus also includes other devices necessary for normal operation. Depending on specific needs, the apparatus for processing media data may further include hardware devices that implement other additional functions. In addition, the apparatus for processing media data may include only the components necessary to implement the embodiments of the present application, and not necessarily all the components shown in FIG. 2.
  • the apparatus for processing media data may further include one or more network cards for forming a session channel between the server 100 and the terminal 200 to transmit media services.
  • the overlay in the embodiments of the present application refers to the overlay media content superimposed on the background layer media content, and the overlay may be separately encoded as the media content or may be a part of the background layer media content. If the overlay is part of the media content of the background layer, the overlay may not be separately encoded, and the media data code stream obtained by the server after the media data is encapsulated will include the overlay information. If the overlay can be separately encoded as media content, the overlay codestream corresponding to each overlay will be obtained.
  • the media content is content displayed by playing media data.
  • In the current OMAF standard document, the basic data structure of the overlay (abbreviated as the overlay structure) and the method of carrying it have been defined, as shown in Table 1 below:
  • the overlay structure shown in Table 1 defines some basic attributes of the overlay, including the number of overlays (number, abbreviated num), identification information (for example, an ID number), overlay control flags, overlay control structures, and so on.
  • the value of the overlay control symbol syntax element overlay_control_flag can be used to indicate the function of the overlay control structure.
  • the semantics of overlay_control_flag include the overlay's associated source, hierarchical order, transparency, user operation information, flags, and priorities, as shown in Table 2:
  • an interaction control structure (the OverlayInteraction control structure) is defined. It can be understood that the OverlayInteraction control structure is one of the overlay control structures. The OverlayInteraction control structure contains the types of interaction that the user may perform on the overlay. The structure is shown in Table 3:
  • Table 4 lists only some operation functions. In practice, an overlay may also have other operation functions.
  • FIG. 3 shows a schematic flowchart of a method for processing media data according to an embodiment of the present application.
  • the method includes:
  • Step 101 The server obtains media data.
  • the foregoing media data may be a video image, for example, a panoramic video.
  • the one or more overlays corresponding to the media data may be one or more overlays displayed on the media data.
  • the overlay layer may be a video or a picture displayed on the media data.
  • the picture overlaid on it may be a name or an age.
  • Step 102 The server processes the media data to obtain at least two overlay layers corresponding to the media data.
  • the overlay layer is a video, image, or text that is used to be superimposed on a background video or a background image for display.
  • processing of media data includes operations such as preprocessing, encoding, and encapsulation of the media data.
  • An example is that the overlay corresponds to the first information.
  • the first information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • the identification information of other overlays is used to determine other overlays that belong to the same group as the overlay.
  • For example, the identification information of other overlays corresponding to overlay1 identifies overlay2 and overlay3. This means that overlay1, overlay2, and overlay3 belong to the same group.
  • the overlay corresponds to the second information and the third information.
  • the first information and the third information are used to determine a group of the overlay, respectively.
  • the second information is used to indicate an operation function corresponding to the overlay.
  • the third information is used to indicate group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • the group identification information of the overlay in the same group is the same.
  • In one case, the operation functions corresponding to overlays in the same group being the same means that all the operation functions corresponding to all overlays included in the group are identical.
  • For example, overlay1 and overlay2 belong to group 1, and the operation functions corresponding to overlay1 and overlay2 both include rotation and window resizing.
  • In another case, it means that at least one operation function is common to all overlays included in the same group.
  • For example, the operation functions corresponding to overlay1 include rotation and window resizing, and the operation functions corresponding to overlay2 include rotation.
  • In this case, overlay1 and overlay2 can also be placed in group 1.
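  • The two readings of "the same operation function" can be written as two predicates. This is a minimal sketch: the function names and the string labels "rotate" and "resize" are assumptions chosen for illustration and do not come from the OMAF syntax.

```python
from typing import Dict, Set


def all_functions_identical(ops: Dict[str, Set[str]]) -> bool:
    """Strict reading: every overlay in the group has exactly the same functions."""
    sets = list(ops.values())
    return all(s == sets[0] for s in sets)


def share_at_least_one(ops: Dict[str, Set[str]]) -> bool:
    """Loose reading: at least one function is common to every overlay."""
    sets = list(ops.values())
    return bool(set.intersection(*sets)) if sets else False


# First example above: both overlays support rotation and window resizing.
strict_group = {"overlay1": {"rotate", "resize"}, "overlay2": {"rotate", "resize"}}
# Second example above: only rotation is shared.
loose_group = {"overlay1": {"rotate", "resize"}, "overlay2": {"rotate"}}

print(all_functions_identical(strict_group), share_at_least_one(strict_group))
print(all_functions_identical(loose_group), share_at_least_one(loose_group))
```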
  • the group identification information is used to determine a group to which the overlay belongs.
  • the group identification information may be a group ID or a group name, which is not limited herein.
  • each overlay in the embodiment of the present application includes an overlay structure, and the overlay structure includes indication information for indicating an overlay operation function.
  • the operation function can be determined through the OverlayInteraction control structure, for example rotation, free depth selection, window resizing, and so on.
  • the server encodes the media data to obtain one or more overlays included in the media data stream, and then, when encapsulating the media data stream, determines at least one operation function that each overlay has. For any two or more overlays that have at least one operation function in common, the server may set their group identification information to be the same. For example, if the operation function corresponding to both overlay1 and overlay2 is rotation, the server may use the first information / third information corresponding to overlay1 and overlay2 to indicate group 1.
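  • One possible way for a server to derive group identification information from the operation functions is sketched below. It assumes one group per operation function shared by at least two overlays, so an overlay may end up in more than one group; that policy, the function name `group_by_operation`, and the string labels are assumptions of this sketch rather than something the application prescribes.

```python
from collections import defaultdict
from typing import Dict, Set


def group_by_operation(ops: Dict[str, Set[str]]) -> Dict[int, dict]:
    """Assign a group ID to each operation function shared by two or more overlays."""
    members = defaultdict(list)
    for overlay_id, functions in ops.items():
        for fn in sorted(functions):
            members[fn].append(overlay_id)

    groups = {}
    next_id = 1
    for fn in sorted(members):          # deterministic group numbering
        if len(members[fn]) >= 2:       # a group needs at least two overlays
            groups[next_id] = {"operation": fn, "overlays": members[fn]}
            next_id += 1
    return groups


ops = {
    "overlay1": {"rotate", "resize"},
    "overlay2": {"rotate"},
    "overlay3": {"depth"},
}
print(group_by_operation(ops))
```

  • With this input, only rotation is shared, so overlay1 and overlay2 receive the same group ID while overlay3 stays ungrouped.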
  • the overlay control structure also defines a control structure of an area (eg, a spherical area) associated with the overlay, which is used to indicate that when an area in a video image is triggered, the overlay display associated with the area can be triggered.
  • The following takes the overlay associated area control structure (AssociatedSphereRegionStruct) as an example of the control structure of the area associated with the overlay.
  • The syntax of AssociatedSphereRegionStruct is shown in Table 5 below:
  • SphereRegionStruct (1) in Table 5 defines a spherical area associated with the overlay.
  • the user can click the spherical area to trigger the display or closing of the overlay associated with the spherical area.
  • the above-mentioned area of the overlay may refer to an area just covered or occupied by the area of the overlay, that is, the media data in the area of the overlay belong to the overlay, and the media data in the overlay are all in the area of the overlay.
  • the above-mentioned area spatial information of the overlay may also be referred to as area spatial information of the area of the overlay.
  • the area spatial information of the overlay is used to indicate the spatial range or spatial position of the area associated with the overlay. In this way, when the user is watching a video image, the overlay associated with the area can be displayed in the video image by triggering the area.
  • the above-mentioned spatial position of the area associated with the overlay may be defined relative to a coordinate system, and the coordinate system may be a three-dimensional coordinate system or a two-dimensional coordinate system.
  • the origin of the three-dimensional coordinate system may be the center point of the panoramic video image, the point in the upper left corner of the panoramic video image, or other fixed position points in the panoramic video image.
  • the spatial position of the area associated with the overlay may also be the spatial position of the overlay within the panoramic video image area.
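  • The triggering behaviour described above can be sketched as a hit test followed by a toggle. This is a simplification: the region is treated as an azimuth/elevation rectangle in degrees, and the field names are only modelled loosely on a sphere region structure. The classes `SphereRegion` and `Overlay` and the function `handle_click` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SphereRegion:
    centre_azimuth: float    # degrees
    centre_elevation: float  # degrees
    azimuth_range: float     # full width of the region, degrees
    elevation_range: float   # full height of the region, degrees

    def contains(self, azimuth: float, elevation: float) -> bool:
        # Wrap the azimuth difference into [-180, 180) so regions may cross +/-180.
        d_az = (azimuth - self.centre_azimuth + 180.0) % 360.0 - 180.0
        d_el = elevation - self.centre_elevation
        return (abs(d_az) <= self.azimuth_range / 2
                and abs(d_el) <= self.elevation_range / 2)


@dataclass
class Overlay:
    overlay_id: int
    region: SphereRegion
    visible: bool = False


def handle_click(overlays: List[Overlay], azimuth: float, elevation: float) -> List[int]:
    """Toggle every overlay whose associated region contains the clicked point."""
    toggled = []
    for ov in overlays:
        if ov.region.contains(azimuth, elevation):
            ov.visible = not ov.visible
            toggled.append(ov.overlay_id)
    return toggled


overlays = [Overlay(1, SphereRegion(170.0, 0.0, 40.0, 20.0)),
            Overlay(2, SphereRegion(0.0, 30.0, 20.0, 20.0))]
print(handle_click(overlays, -175.0, 5.0))  # point lies inside overlay 1's region
```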
  • Scenario 2: Based on the interactions defined in the OverlayInteraction control structure, one or more overlays that have a certain type of interaction can be added to a group (this group may be called an interaction group; it should be understood that the group may also have another name). This enables the terminal to perform a certain type of defined operation function, such as an interactive operation, based on a trigger operation on the interaction group. Exemplarily, the interactive operations may be those shown in Table 4, and details are not described herein again. In this case, each of the one or more overlays corresponds to the first information.
  • step 102 in the embodiment of the present application may be specifically implemented in the following manner:
  • S1. The server encodes the media data to obtain a media data code stream corresponding to the media data.
  • S2. The server encapsulates the media data stream obtained after encoding.
  • the encapsulated media data stream includes information of one or more overlays, together with either the first information corresponding to each overlay in the one or more overlays, or the second information and the third information corresponding to each overlay.
  • Each overlay in the one or more overlays corresponds to a file format.
  • the overlay in the following embodiments may be part of the media content (ie, media data) of the background layer, and the overlay may not be separately encoded at this time. That is, when the server encodes the media data, the obtained media data stream includes one or more overlays. The server may then encapsulate the media data stream including one or more overlays. For example, make the encapsulated media data stream correspond to the file description. Or the server encapsulates the media data stream so that one or more overlays included in the media data stream have an overlay structure.
  • the overlay can also be separately encoded as media content.
  • the server encodes the media data to obtain a media data stream, and then encodes the overlay included in the media data to obtain an overlay code stream.
  • After the server encapsulates the media data stream and the overlay code stream, the encapsulated media data stream carries overlay information.
  • the overlay information may be an overlay structure.
  • the second information and the third information may be carried in a file format of an overlay that encapsulates a media data stream including the overlay.
  • the file format includes: an overlay structure, and an overlay-related area control structure and an overlay group box located in the overlay structure.
  • the third information is located in the overlay group box, and the second information is located in the overlay associated area control structure.
  • the second information may be an overlay associated area control structure.
  • the operation function indicated by the second information may be display or closing.
  • the first information may be carried in the file format of the overlay. That is, one or more overlay file formats obtained after encapsulation may not have an overlay-related area control structure.
  • the file format includes: overlay group box.
  • the first information is located in an overlay group box. It should be understood that in scenario 2, the file format may also include an overlay structure.
  • the server may encapsulate the media data stream including one or more overlays according to the OMAF standard file format.
  • the file format of each overlay in the one or more overlays has an overlay control structure.
  • the overlay structure may have an overlay-related area control structure.
  • the overlay structure may be provided with an OverlayInteraction control structure.
  • the server may add a box corresponding to the overlay control region control structure to the file format, so that the overlay file format has an overlay control structure.
  • entity groups are defined for multiple overlays. One type of group is, for example, a switching group: the multiple overlays in a switching group can be switched between one another.
  • the specific syntax for switching groups is shown in Table 6 below:
  • ref_overlay_id [i] represents identification information of other overlays that belong to the same group as an overlay.
  • the server can also define, in the overlay structure of each overlay, identification information of the other overlays that belong to the same group as that overlay, in place of the above-mentioned group identification information.
  • the identification information of the overlay is used to identify the overlay.
  • the identification information may be an ID number of the overlay.
  • overlay1 belongs to group1 and overlay2 also belongs to group1. Therefore, the identification information of group1 and the identification information of overlay2 can be defined in overlay1. In overlay2, identification information of group 1 and identification information of overlay1 can be defined.
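  • When only the identification information of other overlays is carried, a terminal can still recover the groups by merging each overlay with the overlays it references. The sketch below assumes integer overlay IDs and a mapping from each overlay ID to its list of referenced IDs (playing the role of ref_overlay_id[i]); the function name `build_groups` is hypothetical.

```python
from typing import Dict, List


def build_groups(refs: Dict[int, List[int]]) -> List[List[int]]:
    """Merge overlays into groups using their references to other overlays."""
    groups: List[set] = []
    for overlay_id, ref_ids in refs.items():
        members = {overlay_id} | set(ref_ids)
        # Absorb any existing group that shares a member with this one.
        for g in [g for g in groups if g & members]:
            members |= g
            groups.remove(g)
        groups.append(members)
    return sorted(sorted(g) for g in groups)


# overlay 1 references overlays 2 and 3, so all three form one group;
# overlays 4 and 5 reference each other and form a second group.
refs = {1: [2, 3], 2: [1, 3], 3: [1, 2], 4: [5], 5: [4]}
print(build_groups(refs))
```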
  • there may be an entity group and an overlay structure in the file format corresponding to each overlay.
  • the entity group box is EntityToGroupBox.
  • the file format in the embodiment of the present application further includes an overlay group box.
  • the overlay group box is used to indicate that, when an operation function is triggered for any one overlay in the overlay group box, all the overlays in the overlay group box respond to this operation function.
  • the overlay group box can be defined as an OverlayConditionalShownGroupBox, which means that a group of overlays can be displayed or closed together when the user targets and triggers a certain overlay.
  • an overlay group box with an OverlayInteraction control structure may be an OverlayRelationGroupBox.
  • OverlayConditionalShownGroupBox and OverlayRelationGroupBox in the overlay group box in the embodiment of the present application may also have other names, which are not limited in the embodiment of the present application.
  • the group may be named after a common display group.
  • the common display group may indicate that multiple overlays in the common display group may be displayed or closed together.
  • the first prompt information corresponding to the group is used to indicate that the group can be expanded or closed together.
  • It can be understood that this is only an example, and the name of the group may also be another name, which is not limited in the embodiment of the present application.
  • the overlay group box in the embodiment of the present application may be an OverlayConditionalShownGroupBox, which means that a group of overlays can be shown together when the user triggers display for any overlay in the group or for the group itself.
  • the third information may be carried in the OverlayConditionalShownGroupBox.
  • the ref_overlay_id [i] in the above Table 7 indicates that the overlay_id corresponding to the track or image item indicated by the i-th entity_id is an overlay that can be displayed under the trigger of the user in this group. There will be an overlay_id corresponding to ref_overlay_id [i] in the referenced i-th track or image item.
  • the ref_overlay_id [i] syntax element in the structure is also allowed to exist.
  • Example 2-1 Taking interactive groups as an example, the file format of each overlay in one or more overlays obtained by the server after processing the media data has the group identification information of the overlay.
  • the overlay group box in the embodiment of the present application may be an OverlayRelationGroupBox, which is used to form multiple overlays into an interaction group, and all overlays in the interaction group may have the same interaction operation. At this time, the first information is carried in the OverlayRelationGroupBox.
  • an interaction group specifies that the interactive operation indicated by a certain type of operation function can be performed on all overlays in that interaction group.
  • the other overlays in the OverlayRelationGroupBox also respond to the operation function corresponding to the OverlayRelationGroupBox.
  • The specific syntax is shown in Table 8:
  • When multiple overlays form an interaction group (OverlayRelationGroupBox), the interaction syntax elements included in the OverlayInteraction control structure apply to the group as a whole: if any overlay in the interaction group is triggered, the operation functions defined in the OverlayInteraction control structure are applied to every overlay in the interaction group together.
  • Take as an example the case where the OverlayRelationGroupBox defines an operation function for scaling all overlays in the OverlayRelationGroupBox.
  • overlayA, overlayB, and overlayC in the OverlayRelationGroupBox form interaction group 1, and resize_flag = 1 in the OverlayInteraction control structure corresponding to each of overlayA, overlayB, and overlayC.
  • In this case, overlayA, overlayB, and overlayC are all scaled.
  • Similarly, where the OverlayRelationGroupBox defines an operation function for changing the position of all overlays in the OverlayRelationGroupBox, change_position_flag = 1 in the OverlayInteraction control structure corresponding to each of overlayA, overlayB, and overlayC.
  • In this case, overlayA, overlayB, and overlayC all perform the position change operation.
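  • The group behaviour in this example can be sketched on the terminal side as follows. The flag names resize_flag and change_position_flag come from the text; the classes `Overlay` and `InteractionGroup`, the size and position fields, and the method names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Overlay:
    name: str
    resize_flag: int = 0            # 1 if the overlay window may be resized
    change_position_flag: int = 0   # 1 if the overlay may be moved
    size: Tuple[float, float] = (1.0, 1.0)
    position: Tuple[float, float] = (0.0, 0.0)


@dataclass
class InteractionGroup:
    group_id: int
    overlays: List[Overlay] = field(default_factory=list)

    def resize(self, factor: float) -> List[str]:
        """Scale every overlay in the group that permits resizing."""
        changed = []
        for ov in self.overlays:
            if ov.resize_flag == 1:
                ov.size = (ov.size[0] * factor, ov.size[1] * factor)
                changed.append(ov.name)
        return changed

    def move(self, dx: float, dy: float) -> List[str]:
        """Shift every overlay in the group that permits a position change."""
        changed = []
        for ov in self.overlays:
            if ov.change_position_flag == 1:
                ov.position = (ov.position[0] + dx, ov.position[1] + dy)
                changed.append(ov.name)
        return changed


group1 = InteractionGroup(1, [Overlay(n, resize_flag=1)
                              for n in ("overlayA", "overlayB", "overlayC")])
# A resize triggered on any member is applied to the whole group.
print(group1.resize(2.0))
print([ov.size for ov in group1.overlays])
```

  • Because change_position_flag is left at 0 here, a call to `group1.move(...)` would leave every overlay where it is.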
  • the overlay in the embodiment of the present application can be displayed together with the background video, and the overlay and the background video can be bound for common display.
  • the syntax structure of overlay and background video is shown in Table 9:
  • step 102 in the embodiment of the present application may be specifically implemented in the following manner:
  • S3. The server encodes the media data to obtain a media data stream, and encodes one or more overlays included in the media data to obtain an overlay code stream corresponding to each overlay, where each overlay code stream includes an SEI.
  • the server encapsulates the media data stream and the overlay code stream corresponding to each overlay to obtain a media data stream including one or more overlay information.
  • the server can also separately encapsulate the overlay code stream. Then send the encapsulated overlay code stream to the terminal.
  • the SEI payload type is used to indicate that the SEI carries overlay group identification information.
  • the third information may be carried as an indication field in the SEI of the overlay code stream.
  • the first information may be used as an SEI of an overlay code stream.
  • the SEI has an indication field for indicating the group identification information of the overlay.
  • the encapsulated overlay may also include: an overlay associated area control structure.
  • the specific encapsulation process can refer to the above S2, which will not be repeated here.
  • the third information is carried in the SEI of the overlay code stream corresponding to the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  • the SEI corresponding to scenario 1 may be named after a common group when carrying the group identification information, so that the second information need not be carried; that is, the overlay associated area control structure is not defined in the overlay structure of the overlay.
  • the first information may be carried as an indication field in the SEI of the overlay code stream corresponding to the overlay.
  • the encapsulated overlay may not have an overlay-related area control structure.
  • the operation function corresponding to each overlay in the group can be used to name the group.
  • the first information is carried in the SEI of the overlay code stream corresponding to the overlay.
  • the SEI is used to indicate group identification information of the overlay.
  • the syntax structure of the SEI is shown in Table 10:
  • the sei_payload in Table 10 defines the SEI payload information, including two parameters payloadType and payloadSize. Among them, the payloadType indicates the type of the SEI, and the payloadSize indicates the size of the SEI.
  • OLG in Table 10 is a variable, which represents the value of the payloadType of an SEI.
  • the value of OLG may be 190.
  • payloadSize indicates the payload size.
  • an overlay group can be represented as an overlay condition display group (overlay_conditional_shown_group).
  • the group identification information of the overlay in Table 10 may be replaced with overlay_conditional_shown_group_info (information).
  • overlay_conditional_shown_group_info information
  • Table 11 the syntax structure of overlay_conditional_shown_group_info
  • overlay_conditional_shown_group_id This value indicates the ID number of the group of the overlay.
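  • The SEI check described above can be sketched as follows. This is only an illustration under stated assumptions: the payloadType and payloadSize values are read with the 0xFF-extension byte coding used for SEI messages in H.264/HEVC, emulation prevention bytes are ignored, and the payload is assumed to begin with a 32-bit big-endian overlay_conditional_shown_group_id. The field width is hypothetical, because Table 11 is not reproduced here, and the function names are made up for the example.

```python
import struct

OLG = 190  # payloadType value that the text suggests for the overlay group SEI


def read_sei_value(data, pos):
    """Read one 0xFF-extended value (payloadType or payloadSize) starting at pos."""
    value = 0
    while data[pos] == 0xFF:
        value += 255
        pos += 1
    value += data[pos]
    return value, pos + 1


def parse_overlay_group_sei(data):
    """Return the group ID if the SEI message carries overlay group info, else None."""
    payload_type, pos = read_sei_value(data, 0)
    payload_size, pos = read_sei_value(data, pos)
    payload = data[pos:pos + payload_size]
    if payload_type != OLG:
        return None
    # Assumption: the payload starts with a 32-bit big-endian group ID.
    (group_id,) = struct.unpack(">I", payload[:4])
    return group_id


# Hypothetical SEI messages: one overlay group SEI with group ID 7, one unrelated SEI.
GROUP_SEI = bytes([OLG, 4]) + struct.pack(">I", 7)
OTHER_SEI = bytes([1, 0])
```

  • In this sketch, a terminal would call parse_overlay_group_sei on each SEI message and skip those for which it returns None.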
  • the overlay group can be overlay_relation_group, and overlay_relation_group_info can be used to replace the overlay group identification information in Table 10.
  • the above interactive operations may refer to a common operation on a certain type of operation function, or a common operation on all operation functions supported by the overlay, which are not limited in the embodiments of the present application.
  • step S102 in the embodiment of the present application may be specifically implemented in the following manner: S5. Encode the media data to obtain a media data stream including one or more overlays. S6. The server encapsulates a media data stream including one or more overlays, and obtains a description file corresponding to the media data stream.
  • S6 may be specifically implemented in the following manner:
  • the server may encapsulate a media data stream including one or more overlays based on a DASH transmission protocol standard to obtain a media presentation description MPD of the media data stream as a description file.
  • the overlay descriptor at the adaptation set level or the representation level of the MPD carries the group identification information of the overlay.
  • each overlay in the one or more overlays corresponds to the second information and the third information.
  • the description file includes at least third information of each overlay in one or more overlays, and the third information may be carried as an indication field in the description file of the media data stream. It should be understood that after the media data stream is encapsulated, one or more overlays included in the media data stream have an overlay associated area control structure. The specific encapsulation process can refer to the above S2, which will not be repeated here.
  • the third information is carried in an MPD corresponding to a media data stream obtained by encapsulating a media data stream including one or more overlays, and the second information is carried in an overlay-related area control structure of the overlay.
  • corresponding to scenario 1 can also be replaced in the following manner, that is, the overlay group in the description file containing the media data of the overlay is named after the operation function of the overlay. Has an overlay associated area control structure.
  • each overlay in one or more overlays corresponds to the first information.
  • the first information may be carried as an indication field in a description file containing the media data of the overlay. It can be understood that the name of the group can also be named by the operation function of the code stream.
  • the first information is carried in a media presentation description MPD of the media data obtained by encapsulating the media data including the overlay.
  • the first information or the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  • Example 5-1: Taking display or close as the operation function, a new @schemeIdUri can be defined for the overlay descriptor with the value "urn:mpeg:mpegI:omaf:2018:ocsg", which identifies the OCSG descriptor describing the common display group of the overlay. At most one OCSG descriptor is allowed to appear at the adaptation set level or the representation level.
  • the value of the OCSG descriptor is a comma-separated string.
  • the specific values and semantics are defined in Table 12 below:
  • M represents a required parameter
  • O represents an optional parameter.
  • An adaptation set that has the same overlay_relation_group_id value belongs to the same interaction group.
  • the values in the adaptation set that belong to different groups can be different.
  • Table 13 shows an example of an MPD carrying a group indicating an overlay as a common display group:
  • Each of the two common display groups shown in Table 13 includes: two overlays.
  • common display group 1 includes overlay1 and overlay2, and common display group 2 includes overlay3 and overlay4.
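  • As a rough illustration of how a terminal might read such a descriptor, the sketch below parses a hypothetical MPD fragment and collects adaptation sets by group ID. It is not Table 13: the use of SupplementalProperty as the descriptor element, the adaptation set IDs, and the assumption that the first comma-separated field of the value is the group ID are all made up for the example. The scheme URI is written without the spaces that appear in the extracted text.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
OCSG_SCHEME = "urn:mpeg:mpegI:omaf:2018:ocsg"

# Hypothetical MPD fragment with four overlay adaptation sets in two groups.
MPD_XML = f"""<MPD xmlns="{DASH_NS}">
  <Period>
    <AdaptationSet id="1">
      <SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="1"/>
    </AdaptationSet>
    <AdaptationSet id="2">
      <SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="1"/>
    </AdaptationSet>
    <AdaptationSet id="3">
      <SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="2"/>
    </AdaptationSet>
    <AdaptationSet id="4">
      <SupplementalProperty schemeIdUri="{OCSG_SCHEME}" value="2"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def common_display_groups(mpd_xml):
    """Map each group ID to the overlays whose adaptation set carries that ID."""
    groups = defaultdict(list)
    root = ET.fromstring(mpd_xml)
    for aset in root.iter(f"{{{DASH_NS}}}AdaptationSet"):
        for prop in aset.findall(f"{{{DASH_NS}}}SupplementalProperty"):
            if prop.get("schemeIdUri") == OCSG_SCHEME:
                # Assumption: the first comma-separated field is the group ID.
                group_id = int(prop.get("value").split(",")[0])
                groups[group_id].append(f"overlay{aset.get('id')}")
    return dict(groups)


GROUPS = common_display_groups(MPD_XML)
```

  • With this input, overlay1 and overlay2 fall into group 1 and overlay3 and overlay4 into group 2, mirroring the grouping described for Table 13.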
  • Example 6-1 Take the common interaction group of the group where the overlay is located as an example.
  • the server can define a new @schemeIdUri for the overlay descriptor, whose value is "urn:mpeg:mpegI:omaf:2018:ovly" and whose semantics is the overlay common interaction grouping information (OVLY) descriptor, which describes a group of overlays that jointly perform some kind of interaction, for example, moving the position of the overlay. At most one OVLY descriptor can appear at the adaptation set level or the representation level.
  • OVLY overlay common interaction grouping information
  • the value of the OVLY descriptor is a string separated by commas.
  • the specific values and semantic definitions are shown in Table 14 below:
  • adaptation sets having the same overlay_relation_group_id value belong to the same common interaction group, and the values in the adaptation sets belonging to different groups must be different.
  • Table 15 shows an example of an MPD carrying a group indicating that the overlay is a common interaction group:
  • the common interaction group 1 includes overlay1 and overlay2
  • common interaction group 2 includes overlay3 and overlay4.
  • the server may determine that one or more overlays having the same operation function belong to the same group. It can also be understood that if two or more overlays have the same operation function, they can be divided into the same group, and the group can be named after the operation function that these overlays have in common. In this case, the group can correspond to an operation option, which is used to prompt the operation functions that the overlays in the group have in common.
  • the server may carry identification information indicating group 1 in overlay1 and overlay2. It should be understood that the operation option corresponding to group 1 is used to indicate the operation function shared by overlay1 and overlay2.
  • the server may determine the respective operation function of each overlay through the respective overlay control structure of each overlay.
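  • The server-side grouping by shared operation function can be sketched as below. The function name, the operation labels, and the choice to keep only groups of two or more overlays are assumptions made for illustration; in practice the operation functions would come from each overlay's control structure.

```python
from collections import defaultdict


def group_by_operation(overlay_functions):
    """Name each group after an operation function shared by two or more overlays."""
    members = defaultdict(list)
    for overlay_id, functions in overlay_functions.items():
        for function in sorted(functions):
            members[function].append(overlay_id)
    # Assumption: a group is only formed when at least two overlays share the function.
    return {function: ids for function, ids in members.items() if len(ids) >= 2}


# Hypothetical operation functions per overlay.
OVERLAYS = {
    "overlay1": {"rotate", "scale"},
    "overlay2": {"rotate"},
    "overlay3": {"display"},
}
SERVER_GROUPS = group_by_operation(OVERLAYS)
```

  • Here only "rotate" is shared, so the single resulting group is named after it and contains overlay1 and overlay2.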
  • Step 103 The server sends one or more overlays to the terminal.
  • the server may send one or more overlays to the terminal through a transmitting device.
  • the server may directly send the processed one or more overlays to the terminal. It is also possible to send the processed one or more overlays after receiving a request message for requesting an overlay sent by the terminal.
  • the first information, the second information, and the third information are included in the overlay.
  • the first information, the second information, and the third information are included in the MPD file.
  • if the server uses the third possible implementation manner to process the media data, it should be understood that, in addition to the one or more overlays sent in S103, the server also sends the MPD corresponding to the one or more overlays to the terminal.
  • the MPD corresponding to one or more overlays includes information of each overlay.
  • when the first information or the third information is carried in the SEI, it may be carried in the SEI of the overlay code stream corresponding to the overlay. If the server sends an overlay code stream to the terminal, the terminal can display the overlay when decoding and playing the overlay code stream.
  • Step 104 The terminal receives one or more overlays sent by the server.
  • the terminal may receive one or more overlays sent by the server through a receiving device.
  • the one or more overlays sent by the server may be implemented in the following manner: the server sends the encapsulated media data stream and the one or more overlays included in the media data stream to the terminal. Or, the server sends an overlay code stream corresponding to each overlay in the encapsulated one or more overlays to the terminal.
  • the terminal also needs to receive the MPD of the media data stream that includes the one or more overlays.
  • the terminal can obtain the group identification information from the overlay group box in the file format by analyzing the file format of the overlay.
  • Step 105: When the overlay corresponds to the first information, the terminal processes at least two overlays according to the first information of the at least two overlays; or, when the overlay corresponds to the second information and the third information, the terminal processes the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
  • S105 may be implemented in the following manner: After receiving one or more overlays sent by the server, the terminal decapsulates the overlays to obtain the first information corresponding to the one or more overlays, or the second information and the third information corresponding to the one or more overlays. Then, when the terminal decodes and plays the media data, the client configuration or the user interface prompt can include an operation option corresponding to the overlays in the same group, which is used to prompt the operation functions that all overlays in the group perform in common.
  • the terminal may determine each group of each overlay according to the first information corresponding to each overlay, and then may determine all overlays belonging to the same group.
  • the server may determine the interactive operation corresponding to each overlay according to the OverlayInteraction control structure of each overlay.
  • the terminal may determine a respective group of each overlay according to the third information corresponding to each overlay. You can then determine all overlays that belong to the same group. The terminal can determine the corresponding display or close operation function of each overlay according to the AssociatedSphereRegionStruct of each overlay.
  • the terminal may determine all overlays belonging to the same group in the following manner: The terminal divides the overlays with the same group identification information into the same group according to the group identification information corresponding to each overlay.
  • the terminal may determine that there are two groups, that is, group 1 and group 2.
  • the terminal may determine all overlays belonging to the same group in the following manner: according to the identification information of the other overlays indicated for any overlay, the terminal divides that overlay and the indicated other overlays into the same group.
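  • When grouping is signalled through references to other overlays rather than through a group ID, the terminal has to merge overlays that point at each other. A minimal sketch using a union-find structure is shown below. Treating references as transitive (if A references B and B references C, all three share a group) is an assumption of this example, as are the function name and the integer overlay IDs.

```python
def groups_from_references(ref_overlay_ids):
    """ref_overlay_ids maps an overlay ID to the IDs of the other overlays it references."""
    parent = {}

    def find(x):
        # Register unseen IDs, then walk up to the representative with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for overlay_id, refs in ref_overlay_ids.items():
        find(overlay_id)
        for ref in refs:
            parent[find(ref)] = find(overlay_id)

    groups = {}
    for overlay_id in list(parent):
        groups.setdefault(find(overlay_id), set()).add(overlay_id)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical references: overlays 1 and 2 reference each other, as do 3 and 4.
REFS = {1: [2], 2: [1], 3: [4], 4: [3]}
REF_GROUPS = groups_from_references(REFS)
```

  • The result contains two groups, one with overlays 1 and 2 and one with overlays 3 and 4.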
  • An embodiment of the present application provides a method for processing media data.
  • a terminal can process one or more overlays having the same group identification information by using first information corresponding to each overlay in at least two overlays.
  • compared with processing each overlay one by one, this can reduce the complexity of operations and improve the user's subjective experience.
  • the method provided in this embodiment of the present application further includes:
  • Step 106 The terminal displays at least one group, and information indicating an operation function corresponding to each group in the at least one group and an overlay belonging to each group.
  • At least one group is determined by first information corresponding to each overlay in at least two overlays; an operation function corresponding to one group is determined by an overlay structure included in the overlay in each group.
  • At least one group is determined by third information corresponding to each overlay in the at least two overlays, and an operation function corresponding to one group is determined by the overlay associated area control structure included in the overlays in the group.
  • the terminal may also decode and play the received media data stream to display the media data.
  • the at least one group may be displayed overlaid on the media data.
  • Step 107 When any group in at least one group is triggered, all overlays belonging to any one group respond to the operation function corresponding to the group. Or when any overlay is triggered, any overlay and other overlays belonging to the same group as any overlay respond to the operation function of any overlay being triggered.
  • the operation function corresponding to a group is determined by the operation function shared by all overlays in the group.
  • a group can correspond to multiple operating functions.
  • all overlays in any group respond to the triggered operation functions of that group. It should be understood that if multiple operation functions corresponding to the group are triggered, all overlays in the group respond to the multiple operation functions. If any one of the multiple operation functions corresponding to the group is triggered, all overlays in the group respond to that triggered operation function.
  • overlay1 and overlay2 belong to group 1, where the operation functions corresponding to overlay1 and overlay2 are rotation and size scaling.
  • the operation function corresponding to group 1 is also rotation and size scaling. If both rotation and size scaling are triggered, overlay1 and overlay2 respond to the rotation and size scaling operations. If the triggered operation function is rotation, overlay1 and overlay2 respond to the rotation operation.
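  • The rule that a group's operation functions are those shared by all of its overlays, and that every overlay in the group responds to whichever of them is triggered, can be sketched as follows. The function names and string labels are hypothetical.

```python
def group_functions(overlay_functions):
    """Operation functions of a group: those shared by all of its overlays."""
    sets = list(overlay_functions.values())
    return set.intersection(*sets) if sets else set()


def trigger_group(overlay_functions, triggered):
    """Return, per overlay, the triggered functions that the whole group responds to."""
    applied = set(triggered) & group_functions(overlay_functions)
    return {overlay_id: sorted(applied) for overlay_id in overlay_functions}


# Group 1 from the example: both overlays support rotation and size scaling.
GROUP_1 = {
    "overlay1": {"rotation", "scaling"},
    "overlay2": {"rotation", "scaling"},
}
```

  • Calling trigger_group(GROUP_1, {"rotation"}) makes both overlays respond to rotation only, while triggering both functions makes both overlays respond to rotation and size scaling.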
  • the terminal may assign an operation option to each group, and the operation option is used to prompt an operation function that all overlays in the group can respond to.
  • each group may be assigned an operation option 1 for instructing execution of all operation functions.
  • when the operation option 1 is triggered, if the group has multiple operation functions, all overlays in the group respond to the multiple operation functions.
  • the user may be prompted for all overlays included in the group and the operation functions common to all overlays in the group. For example, when the mouse is on the operation option, but the click operation is not triggered, the user can also be prompted for all overlays included in the group and the operation functions common to all overlays in the group.
  • step 105 may be specifically implemented in the following manner:
  • the terminal parses each overlay to obtain the respective AssociatedSphereRegionStruct of each overlay.
  • the server determines whether each overlay's respective operation function is displayed or closed according to the AssociatedSphereRegionStruct.
  • the terminal parses the entity group in the media data stream containing one or more overlays, and obtains the OverlayConditionalShownGroupBox.
  • the terminal can further obtain ref_overlay_id. Therefore, the terminal can determine whether the operation function that can be performed on the one overlay and other overlays in the same common display group as the overlay is display or close.
  • step 106 may be specifically implemented in the following manner (1-1):
  • the client configuration or the user interface prompt may include operation options for triggering the display or closing of the common display group.
  • the media data may be displayed on the client or user interface of the terminal.
  • Step 107 may be specifically implemented in the following manner (1-2) or (1-3):
  • Method (1-2) When any group is triggered, the terminal displays all overlays in the any group. That is, all overlays in the group are displayed on the display interface.
  • the terminal closes all overlays in the first group. That is, all overlays in the first group are canceled.
  • the operation option may be displayed on the display interface in the form of icons or text.
  • if the operation option is displayed in the form of an icon, when the user's touch operation or click operation is located on or near the icon, text prompting the function corresponding to the operation option can be displayed on the display interface.
  • an operation option for triggering display or closing the display of the overlays in the common display group may be included in the client configuration or the user interface prompt of the terminal.
  • group 1 in FIG. 5 corresponds to operation option 1
  • group 2 corresponds to operation option 2.
  • the case where the operation options are displayed as text is taken as an example.
  • the terminal controls all overlays in the first group to perform the operation function.
  • the common display is taken as an example
  • the first group is taken as the group 1 shown in FIG. 5 as an example.
  • the terminal displays all overlays in group 1 on the display interface.
  • a touch operation is used as an example for the operation option 1 of the group 1 being triggered.
  • the display of group 1 includes: name 1 corresponding to media content 1, name 2 corresponding to media content 2, and name 3 corresponding to media content 3.
  • the terminal closes all overlays included in group 1 and displays them on the display interface in the form of group 1. That is, in response to the operation function of closing the display, the display interface at this time may be as shown in FIG. 5.
  • the terminal can simultaneously display the one or more overlays on the background video in response to the user's trigger operation.
  • the one or more overlays can be closed at the same time in response to the user's trigger operation.
  • Example 2-2 corresponds to Example 2-1.
  • step 105 can be specifically implemented in the following manner:
  • the terminal parses the overlay structure of each overlay to obtain the first information carried in the overlay structure of each overlay, and then determines the group identification information of each overlay according to the first information.
  • the terminal can parse the overlay structure of each overlay to obtain the operation function of each overlay.
  • the terminal may determine other overlays that belong to the same group as the overlay.
  • the terminal parses the entity group in the media data stream that contains one or more overlays.
  • the terminal may obtain the OverlayRelationGroupBox from the entity group, thereby obtaining the respective group identification information of each overlay, and / or ref_overlay_id.
  • the terminal can also determine the operation function of the overlay according to the overlay structure, and can then determine the identification information of all overlays belonging to the same group, as well as all the syntax elements in the overlay's OverlayInteraction control structure.
  • step 106 may be specifically implemented in the following manner (2-1):
  • the terminal configuration or the user interface prompt may include operation options corresponding to a common interaction group for a corresponding interaction operation.
  • the terminal determines all overlays in the first group according to the ref_overlay_id or the identification information of all overlays corresponding to the overlay. The terminal then performs a common operation function on all overlays in the first group.
  • the operation function corresponding to the OverlayRelationGroupBox semantics is size scaling.
  • all overlays in the first group respond to the size scaling operation. If all overlays in the first group are displayed on the display interface and the size scaling function corresponding to any overlay in the first group is triggered, the triggered overlay responds to the size scaling operation, and the other, untriggered overlays in the group also respond to the size scaling operation.
  • Example 3-2 corresponds to Example 3-1.
  • step 105 can be specifically implemented in the following manner:
  • the terminal decodes the NALU of one or more overlays to obtain the SEI contained in each overlay stream in the one or more overlays.
  • if the SEI payload type is the value represented by OLG, it means that the SEI carries a common display group message.
  • the terminal continues to decode the SEI to obtain the overlay_conditional_shown_group_id, or the terminal continues to decode the SEI to obtain the ref_overlay_id corresponding to each overlay.
  • the terminal can determine all overlays belonging to the same common display group according to the overlay_conditional_shown_group_id or ref_overlay_id corresponding to each overlay.
  • the terminal locates and parses the AssociatedSphereRegionStruct in the overlay control structure of each overlay, and learns whether the overlay is triggered to be displayed or closed by the user.
  • Example 4-2 corresponds to Example 4-1.
  • step 105 may be specifically implemented in the following manner:
  • the terminal decodes the NALU of one or more overlays to obtain the SEI contained in each overlay stream in the one or more overlays.
  • if the SEI payload type is the value represented by OLG, it means that the SEI carries a common interaction group message.
  • the terminal continues to decode the SEI to obtain the overlay_relation_group_id, which indicates the ID number of the common interaction group of the overlay, or the terminal continues to decode the SEI to obtain the ref_overlay_id.
  • the terminal may determine all overlays belonging to the same common interaction group according to the obtained overlay_relation_group_id corresponding to each overlay or ref_overlay_id corresponding to each overlay.
  • Example 5-2 corresponds to Example 5-1.
  • step 105 can be specifically implemented in the following manner:
  • the terminal obtains the MPD corresponding to the one or more overlays, parses the MPD corresponding to each overlay to obtain the adaptation set level attributes, and obtains the overlay descriptor and its attribute value. Then, the identification information of the group to which each overlay belongs or the identification information of other overlays belonging to the same group can be obtained.
  • when the terminal decapsulates each overlay, it can learn that the operation function of the overlay is to trigger the display or close the display by parsing the AssociatedSphereRegionStruct in the overlay structure of each overlay.
  • the terminal can determine its operation function for the display or shutdown according to the group operation function of the overlay.
  • Example 6-2 corresponds to Example 6-1.
  • step 105 may be specifically implemented in the following manner:
  • the terminal obtains the MPD corresponding to the one or more overlays, parses the MPD corresponding to each overlay to obtain the adaptation set level attributes, and obtains the overlay descriptor and its attribute value. Then, the identification information of the group to which each overlay belongs or the identification information of other overlays belonging to the same group can be obtained. The terminal also determines the interactive operation that each overlay has according to the overlay structure in each overlay.
  • the terminal may divide the overlays with the same group identification information into the same group. Or, the terminal divides each overlay and the other overlays indicated for it into the same group, according to the identification information carried in each overlay and the identification information of other overlays that belong to the same group.
  • the embodiment of the present application can define conditions for performing a group operation on a group where an overlay is located.
  • one or more overlays in the embodiment of the present application further include fourth information, where the fourth information is used to indicate that, in the case where the overlay responds to the first operation function, all overlays in the group to which the overlay belongs respond to the first operation function (i.e., group operation).
  • the first operation function is any one of a plurality of operation functions of the overlay. It should be noted that the first operation function is any operation function that is triggered among a plurality of operation functions corresponding to the overlay.
  • the group operation refers to: in the case of performing an operation function on any overlay in a group, all overlays in the group to which the overlay belongs perform the operation function triggered by the overlay.
  • overlay1 and overlay2 are located in group 1.
  • overlay1 and overlay2 jointly respond to the operation function of overlay1 being triggered.
  • step 105 in the embodiment of the present application may be specifically implemented in the following manner: the terminal processes the at least two overlays according to the first information and the fourth information of the at least two overlays. Specifically, if the overlay corresponds to the fourth information, step 106 further includes that the terminal may include an operation option for performing a group operation on the group to which the overlay belongs in a client configuration or a user interface prompt.
  • step 107 For a specific implementation of step 107 at this time, reference may be made to the foregoing description, that is, the terminal uses the group as a granularity to perform triggered operation functions on all overlays in the group, and details are not described herein again. That is, any overlay is triggered, and all overlays in the group corresponding to any overlay respond to the operation function of any overlay being triggered.
  • one or more overlays in the embodiments of the present application further include: fifth information.
  • the fifth information is used to indicate that, in the case where the overlay responds to the first operation function, either all overlays in the group to which the overlay belongs respond to the first operation function (group operation), or only the overlay responds to the first operation function (individual operation).
  • step 105 may be specifically implemented in the following manner: the terminal processes the at least two overlays according to the first information and the fifth information of the at least two overlays.
  • the corresponding step 106 further specifically includes: the terminal displaying an operation option corresponding to the at least one group to perform a group operation and an operation option to perform an individual operation.
  • the corresponding step 107 further includes: if an individual operation is triggered, when any overlay is triggered, any overlay responds to the triggered operation function. If a group operation is triggered, when any overlay is triggered, any overlay and other overlays belonging to the same group as the overlay respond to the triggered operation function.
  • all overlays in the same group have fourth information.
  • all overlays in the same group have fifth information.
  • the individual operation refers to: in the case of performing an operation function on any overlay in a group, the any overlay responds to the triggered operation function. For example, when overlay1 is triggered, other overlays in group 1 do not respond to the operation function corresponding to overlay1, and only overlay1 responds to the operation function of overlay1.
  • the fourth information or the fifth information may be carried in an MPD of a media data stream, an SEI of an overlay code stream, or a file format.
  • if the overlay carries the fourth information, the terminal may determine that the group to which the overlay belongs only has permission at the granularity of group operations. If the overlay carries the fifth information, the terminal may determine that the group to which the overlay belongs has permission at the granularity of both group operations and individual operations. The granularity of operation depends on the user's choice.
  • Example 2-3 combined with the above example 2-1, for example, the fourth information or the fifth information is located in the entity group defined in the overlay structure.
  • the following takes the case where the fourth information and the fifth information are a condition type (condition_type) as an example.
  • condition_type is used to indicate a condition for the user to perform a certain type of common operation on the group.
  • condition_type may have different values, and different condition_type values indicate that the groups of the overlay have different permissions.
  • condition_type value is 0, which means that it has group operation authority. That is, when any overlay in the group is triggered, other overlays in the group also respond to the operation function of any overlay.
  • the condition_type value is 1, indicating that the group has group operation and individual operation permissions. If a group operation is triggered, when any overlay in the group is operated, the other overlays in the group also respond to the triggered operation function of that overlay. If an individual operation is triggered, when any overlay in the group is operated, only that overlay responds to the triggered operation function.
  • condition_type corresponding to all overlays in the same group are the same.
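  • The two condition_type values described above can be summarised in a small decision function. This is a sketch only: the function name, the constant names, and the boolean flag standing in for the user's choice between group and individual operation are hypothetical.

```python
GROUP_ONLY = 0            # condition_type 0: only group operation is permitted
GROUP_OR_INDIVIDUAL = 1   # condition_type 1: group or individual operation


def responders(group, triggered_overlay, condition_type, group_operation_selected=True):
    """Return the overlays that respond when triggered_overlay is operated."""
    if triggered_overlay not in group:
        raise ValueError("triggered overlay is not a member of the group")
    if condition_type == GROUP_ONLY:
        return list(group)
    if condition_type == GROUP_OR_INDIVIDUAL:
        # The user's choice decides the granularity of the operation.
        return list(group) if group_operation_selected else [triggered_overlay]
    raise ValueError("unknown condition_type")


GROUP = ["overlay1", "overlay2"]
```

  • For example, with condition_type 1 and the individual operation selected, responders(GROUP, "overlay1", 1, False) returns only overlay1, whereas with condition_type 0 both overlays always respond.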
  • the terminal can obtain the fourth information or the fifth information by parsing to the entity group in the overlay structure.
  • condition_type 1
  • the client configuration or the user interface prompt may include an operation option for performing overlay group operations or an operation option for performing individual operations.
  • condition_type 1
  • if the operation option of the group operation is not triggered, only the triggered overlay responds to the interactive operation.
  • if any overlay in the group triggers the operation function corresponding to the OverlayRelationGroupBox semantics, then, according to all ref_overlay_id values in the group, all overlays in the group jointly respond to the interaction of the triggered overlay.
  • Example 4-3 combined with the above example 4-1, for example, the fourth information or the fifth information is located in the overlay code stream SEI. That is, the condition_type is defined in the SEI of the overlay stream corresponding to each overlay in one or more overlays, and the syntax structure is shown in Table 17:
  • step 105 the fourth information or the fifth information is located in the overlay code stream SEI corresponding to each overlay or the description file or file format of the media data including one or more overlays
  • step 107 the specific implementation of step 105 to step 107
  • Example 6-3: In combination with the above Example 6-1, the fourth information or the fifth information is, for example, located in the MPD. Specifically, the fourth information and the fifth information may be located in the overlay descriptor of the overlay together with the group identification information.
  • overlay_relation_group_id and condition_type are defined in the MPD corresponding to one or more overlays, and the syntax structure is shown in Table 18:
  • condition_type for the definition and value of condition_type, reference may be made to the description in Table 16 above, which is not repeated here.
  • Table 19 shows the condition_type and syntax structure carried in the MPD file:
  • Group 1 includes overlay1 and overlay2
  • Group 2 includes overlay 3 and overlay 4.
  • the group identification information of the overlay in this embodiment of the present application includes one or more group identification information of the overlay. That is, the overlay structure, description file, or SEI of the overlay can be used to indicate that the overlay corresponds to multiple groups. At this time, the first information is also used to indicate the number of groups corresponding to the overlay.
  • Example A when the information indicating that the overlay corresponds to multiple groups exists in the overlay structure, the overlay structure where each overlay is located is shown in Table 20:
  • overlay_relation_group_number indicates the number of groups to which the overlay belongs.
  • overlay_relation_group_id [i] represents the ID number of the i-th group in which the overlay is located.
  • the client configuration or the user interface prompt may set operation options for common operations on the same group overlay, and different groups have different operation options.
  • when an overlay in any one group is triggered, all overlays in that group jointly respond to the operation function triggered on that overlay. If an overlay is in multiple groups, the overlay responds in turn to the user operations for each of the groups in which it is located.
  • the overlay structure of each overlay may also have a syntax element of condition_type, as shown in Table 21 below:
  • Example B When the information indicating that multiple groups correspond to the overlay exists in the SEI of the overlay stream, the SEI syntax structure corresponding to each overlay stream is shown in Table 22:
  • the SEI syntax shown in Table 22 may also have a syntax element of condition_type, as shown in Table 23 below:
  • the case in which the information indicating that the overlay corresponds to multiple groups is located in the description file is taken as an example.
  • the information of multiple groups is located in the OVLY description word, and the OVLY description word corresponding to each overlay is shown in Table 24:
  • the adaptation sets having the same overlay_relation_group_id value belong to the same group, and the same overlay can belong to multiple different groups.
  • the number of groups to which the overlay belongs is specified by overlay_relation_group_number, and overlay_relation_group_id indicates the ID number of a group to which the overlay belongs.
  • condition_type is the condition type under which the overlays in the group identified by overlay_relation_group_id interact together.
  • Table 25 shows a specific example of syntax elements with multiple group identification information and condition_type in the MPD:
  • the multiple groups correspond to different operation options.
  • the overlay has fourth information
  • the overlay is located in the second group and the third group as an example.
  • the terminal performs the operation function corresponding to the second group on all overlays in the second group, and performs the operation function corresponding to the third group on all overlays in the third group. That is, all overlays in the second group respond to the operation function corresponding to the second group, and all overlays in the third group respond to the operation function corresponding to the third group.
  • the terminal performs the operation function corresponding to the second group on all overlays in the second group (that is, all overlays in the second group respond to the operation function corresponding to the second group), and performs the operation function corresponding to the third group on all overlays in the third group (that is, all overlays in the third group respond to the operation function corresponding to the third group). If a separate operation is triggered, when the overlay in the second group is triggered, the terminal performs the operation function corresponding to the second group on that overlay, and performs the operation function corresponding to the third group on the overlay in the third group.
  • when each overlay in the one or more overlays in the embodiment of the present application corresponds to first information, the first information further includes first indication information, and the operation function corresponding to the group in which the overlay is located is determined by the operation type indicated by the first indication information.
  • the first indication information is used to indicate a type of an interaction operation (overlay_interaction_type). That is, it is used to indicate the specific operation function corresponding to the overlay.
  • Table 26 shows an example of the first indication information carried in the SEI syntax.
  • overlay_relation_group_info(payloadSize) { overlay_relation_group_id; overlay_interaction_type }
  • overlay_interaction_type indicates the types of common interaction operations that the group of the current overlay can perform.
  • One representation is to indicate by bit, as shown in Table 27 below:
  • when a bit of the overlay_interaction_type has a value, it means that the overlay can perform the interaction operation corresponding to that bit in group operation mode.
  • Table 27 only exemplifies the types of partial interaction operations, and the types of common interaction operations of the overlay may not be limited to those shown in Table 27 above.
  • the terminal determines the type of interaction operation that can be performed in a group operation based on the value of overlay_interaction_type. For example, when the terminal determines that the overlay_interaction_type values corresponding to all overlays in the same group have bit index 6 in Table 24 set, the terminal determines that the operation function of the group is joint rotation. If the terminal determines that one or more overlays belong to multiple groups, the terminal provides, for each group, an operation option corresponding to the type of interactive operation indicated by the first indication information. When the operation option corresponding to any one group is triggered, the terminal operates all overlays in that group according to the operation function indicated by the first indication information.
  • the terminal performs NALU decoding on one or more overlay code streams to obtain the SEI contained in the overlay code stream.
  • when the SEI payload type is the value represented by OLG, it indicates that the SEI is an overlay common interaction group message.
  • the terminal continues decoding to obtain, from the SEI, the overlay_relation_group_id, the identification information of other overlays that belong to the same group, and the overlay_interaction_type value, that is, the ID number of the common interaction group of the overlay and the type of group interaction.
  • after the terminal decodes this part, it obtains the group identification information of all overlays and determines which overlays belong to the same group.
  • the terminal can also determine the type of interaction that can be performed according to the overlay_interaction_type value.
  • the client configuration or user interface prompt can set user interaction option information for overlays with the same ID number as a group, and specify the type of group operation that can be performed according to the overlay_interaction_type value. Different interaction options are given for overlay groups with different ID numbers. When the user clicks or activates the option corresponding to the ID number, the overlay of the group corresponding to the ID number can be operated simultaneously according to the operation type of the specified group.
  • overlay_interaction_type may also be carried in the MPD.
  • for the specific process, refer to the description at Table 27, which is not repeated here.
  • the server may add common interaction group description information to a file format corresponding to each overlay.
  • the difference from Example 1-1 is that in this embodiment, the group identification information of the overlay is located in the overlay control structure. That is, the third information is located in an overlay control structure included in the overlay structure.
  • the server encodes one or more overlay media data to obtain one or more overlay media data streams, and then encapsulates the one or more overlay media data streams.
  • each overlay has an overlay control structure
  • each overlay has an overlay control structure, which may further include an overlay control symbol, OverlayGroup, used to indicate the group identification information of the overlay.
  • the Overlay control symbols are shown in Table 28:
  • overlay_relation_group_id represents the ID number of the group to which the overlay belongs.
  • the group identification information of the overlays belonging to the same group is the same, and the group identification information of the overlays belonging to different groups is different.
  • the terminal obtains the overlay control structure syntax element overlay_control_flag after decapsulating one or more overlays, thereby obtaining the OverlayGroup indicated by the tenth bit in Table 29, and then parses the OverlayGroup structure information to obtain the group identification information of the overlay. After the parsing is completed, the terminal obtains the group information of all overlays.
  • the embodiment of the present application adds the group identification information of the overlay to the file format, the SEI of the overlay code stream, or the MPD, so that the terminal can display, in groups, the one or more overlays that have the same group identification information. The user can then perform the operation function corresponding to a group on the overlays of that group at the same time, which reduces the steps needed when the user performs the same operation on one or more overlays and improves the user's subjective experience of watching VR video.
  • the second information and the third information in the embodiment of the present application may be carried in an overlay file format that encapsulates a media data stream including an overlay, and the file format includes an entity group box (EntityToGroupBox) and an overlay structure, wherein the overlay structure has an overlay associated area control structure.
  • the second information is located in the overlay association area control structure, and the third information is located in the EntityToGroupBox.
  • the group_id in Table 30 indicates the unique ID number of the group, which is different from any other EntityToGroupBox structure ID number.
  • num_entities_in_group represents the number of entities in the current group, and entity_id corresponds to the ID number of an entity in the file format.
  • the file format in the embodiment of the present application may also have the following table 31
  • ocsg represents a type of grouping_type, which is used to indicate that the group type is a common display group.
  • the terminal can parse the EntityToGroupBox to obtain the OverlayConditionalShownGroupBox, which means that, when the user triggers display for any overlay in the group or for the group, the overlays in the group can be displayed together or closed together.
  • the ID number of the overlay contained in the current overlay group is represented by the entity_id in the current EntityToGroupBox.
  • the OverlayConditionalShownGroupBox can also contain information about the name of the overlay group.
  • the syntax structure is shown in Table 32 below:
  • overlay_group_label is a UTF-8 encoded string of unlimited length that describes the overlay group. It can be null.
  • overlay_group_label is used to give group description information, and it can be displayed on the user's display interface as group information.
  • the first information in the embodiment of the present application may be carried in an overlay file format that encapsulates a media data stream including an overlay, and the file format includes an EntityToGroupBox and an overlay structure, where The overlay structure includes control symbols.
  • the first information is in EntityToGroupBox.
  • the file format in the embodiment of the present application may also have the syntax shown in Table 32:
  • ovrg also represents a type of grouping_type (group type), which is used to indicate that the group type is a common interaction group.
  • the overlay structure can adopt the above description, which is not limited here.
  • the terminal can parse the EntityToGroupBox to obtain the OverlayRelationGroupBox, which means that interactive operations can be performed on any overlay or on all overlays in the group, and the operation function of the interactive operation is determined based on the overlay structure of each overlay.
  • the ID number of the overlay contained in the current overlay group is represented by the entity_id in the current EntityToGroupBox.
  • the OverlayRelationGroupBox can also contain information about the name of the overlay group.
  • the syntax structure is shown in Table 34 below:
  • overlay_group_label is used to give group name information, which can be displayed on the user's display interface as group information.
  • the name information of the overlay group can be carried in the overlay group box.
  • the terminal can resolve the name information of the overlay group carried by the overlay group box, and display the group name indicated by the name information of the overlay group.
  • the overlay group box is defined as an OverlayRelationGroupBox
  • the name information of the overlay group is located in the OverlayRelationGroupBox.
  • the overlay group box is defined as an OverlayConditionalShownGroupBox
  • the name information of the overlay group is located in the OverlayConditionalShownGroupBox.
  • each network element such as a device for processing media data
  • each network element includes a hardware structure and / or a software module corresponding to each function.
  • this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is performed by hardware or computer software-driven hardware depends on the specific application of the technical solution and design constraints. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit. It should be noted that the division of the units in the embodiments of the present application is schematic, and is only a logical function division. There may be another division manner in actual implementation.
  • FIG. 7 is a schematic block diagram of an apparatus for processing media data according to an embodiment of the present application.
  • the apparatus for processing media data may be a terminal or a chip applied to the terminal.
  • the apparatus 500 for processing media data shown in FIG. 7 includes an obtaining module 510 and a processing module 520.
  • the obtaining module 510 and the processing module 520 in the apparatus 500 for processing media data may perform various steps performed by the terminal in the methods shown in FIG. 3 and FIG. 4.
  • the specific functions of the obtaining module 510 and the processing module 520 are as follows:
  • the obtaining module 510 is configured to receive at least two overlay layers corresponding to the media data.
  • each overlay corresponds to first information
  • the first information includes group identification information of the overlay
  • the overlay corresponds to second information and third information
  • the second information is used to indicate the operation function corresponding to the overlay
  • the third information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • the processing module 520 is configured to process the at least two overlays according to the first information of the at least two overlays when the overlay corresponds to the first information.
  • alternatively, the processing module 520 is configured to process the at least two overlays according to the second information and the third information corresponding to the at least two overlays when the overlay corresponds to the second information and the third information.
  • when the apparatus 500 for processing media data executes the method shown in FIG. 4, the apparatus 500 for processing media data further includes a display module 530.
  • the specific functions of the display module 530 and the processing module 520 are as follows:
  • the display module 530 is configured to display at least one group, and information indicating an operation function corresponding to each group in the at least one group and an overlay belonging to each group.
  • At least one group is determined by first information corresponding to each overlay in at least two overlays; an operation function corresponding to one group is determined by an overlay structure included in the overlay in each group.
  • At least one group is determined by third information corresponding to each overlay in the at least two overlays, and an operation function corresponding to one group is determined by the overlay associated area control structure included in the overlay in the group.
  • the processing module 520 is configured to, when any one of the at least one group is triggered, process all overlays belonging to that group in response to the operation function corresponding to the group; or, when any overlay is triggered, process that overlay and the other overlays that belong to the same group in response to the operation function triggered on that overlay.
  • FIG. 8 is a schematic block diagram of an apparatus 600 for processing media data according to an embodiment of the present application.
  • the apparatus 600 for processing media data may be a server or a chip used in a server.
  • the apparatus 600 for processing media data shown in FIG. 8 includes an obtaining module 610, a processing module 620, and a sending module 630.
  • the obtaining module 610, the processing module 620, and the sending module 630 in the apparatus 600 for processing media data may perform each step performed by the server in the methods shown in FIG. 3 and FIG. 4.
  • the specific functions of the obtaining module 610, the processing module 620, and the sending module 630 are as follows:
  • the obtaining module 610 is configured to obtain media data.
  • the processing module 620 is configured to process the media data to obtain at least two overlay layers corresponding to the media data.
  • the sending module 630 is configured to send one or more overlays to the terminal.
  • the input / output interface 130 is configured to acquire media data.
  • the processor 110 is configured to process media data to obtain at least two overlay layers corresponding to the media data.
  • the input / output interface 130 is also used to send one or more overlays to the terminal.
  • the input / output interface 130 is configured to receive at least two overlay layers corresponding to the media data.
  • each overlay corresponds to first information
  • the first information includes group identification information of the overlay
  • the overlay corresponds to second information and third information
  • the second information is used to indicate the operation function corresponding to the overlay
  • the third information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
  • the processor 110 is configured to process the at least two overlays according to the first information of the at least two overlays when the overlay corresponds to the first information. Alternatively, the processor 110 is configured to process the at least two overlays according to the second information and the third information corresponding to the at least two overlays when the overlay corresponds to the second information and the third information.
  • the display 160 is configured to display at least one group, and information used to indicate an operation function corresponding to each group in the at least one group and an overlay belonging to each group.
  • At least one group is determined by first information corresponding to each overlay in at least two overlays; an operation function corresponding to one group is determined by an overlay structure included in the overlay in each group.
  • At least one group is determined by third information corresponding to each overlay in the at least two overlays, and an operation function corresponding to one group is determined by the overlay association area control structure included in the overlay in the group.
  • the display 160 is configured to, when any one of the groups is triggered, process all overlays belonging to that group in response to the operation function corresponding to the group; or, when any overlay is triggered, process that overlay and the other overlays that belong to the same group in response to the operation function triggered on that overlay.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memories (ROMs), random access memories (RAMs), magnetic disks, optical discs, and other media that can store program code.
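
The condition_type semantics and the multi-group membership described in the list above can be illustrated with a minimal Python sketch. The `Overlay` class, the `responders` function, and the `group_mode` flag are hypothetical names introduced only for illustration and are not part of any syntax defined in this application. The text above only states the meaning of condition_type equal to 1; treating every other value as a group-only operation is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Overlay:
    overlay_id: int
    # IDs of every group the overlay belongs to (overlay_relation_group_id[i]).
    group_ids: list = field(default_factory=list)
    # 1 means the group permits both group and individual operation.
    condition_type: int = 1


def responders(triggered, overlays, group_id, group_mode):
    """Return the IDs of the overlays that respond when `triggered` is operated.

    `group_mode` is True when the user selected the group operation option
    and False when the user selected the individual operation option.
    """
    if group_id not in triggered.group_ids:
        raise ValueError("triggered overlay is not in the requested group")
    if triggered.condition_type == 1 and not group_mode:
        # Individual operation: only the triggered overlay responds.
        return [triggered.overlay_id]
    # Group operation: every overlay carrying this group ID responds.
    # Assumption: values of condition_type other than 1 also take this path.
    return [o.overlay_id for o in overlays if group_id in o.group_ids]


# Overlay 2 belongs to two groups, so the result depends on the chosen group.
overlays = [Overlay(1, [1]), Overlay(2, [1, 2]), Overlay(3, [2])]
group_one = responders(overlays[1], overlays, group_id=1, group_mode=True)
group_two = responders(overlays[1], overlays, group_id=2, group_mode=True)
alone = responders(overlays[1], overlays, group_id=1, group_mode=False)
```

Passing the group ID explicitly keeps the multi-group case unambiguous: the same overlay can drive different sets of responders depending on which group's operation option the user selected.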

Abstract

Disclosed are a method for processing media data, a terminal, and a server, relating to the technical field of streaming media transmission. They reduce the operation complexity that arises when a user must perform the same operation on each overlay separately to achieve a given purpose, making the user's operations on overlays more efficient and improving the user's subjective experience. The method comprises: a terminal acquiring at least two overlays corresponding to media data, wherein the overlay corresponds to first information, or the overlay corresponds to second information and third information; the first information comprises group identification information, the second information is used for indicating an operation function of the overlay, the third information is used for indicating the group identification information, and the overlays in the same group correspond to the same operation function; and the terminal processing the at least two overlays according to the first information of the at least two overlays.

Description

Method, terminal and server for processing media data
This application claims priority to U.S. provisional patent application No. 62/737,900, filed with the U.S. Patent Office on September 27, 2018 and entitled "Method, terminal and server for processing media data", which is incorporated herein by reference in its entirety.
Technical field
The embodiments of the present application relate to the technical field of streaming media transmission, and in particular, to a method, a terminal, and a server for processing media data.
Background
The ISO/IEC 23090-2 standard specification, also called the OMAF (Omnidirectional Media Format) standard specification, defines a media application format that enables the presentation of omnidirectional media in applications. Omnidirectional media mainly refers to panoramic video (360-degree video) and the related audio. The OMAF specification first specifies a list of projection methods that can be used to convert spherical video into two-dimensional video. It then specifies how to use the ISO base media file format (ISOBMFF) to store omnidirectional media and the metadata associated with that media, and how to encapsulate and transmit omnidirectional media data in a streaming media system, for example through Dynamic Adaptive Streaming over HTTP (DASH) as specified in the ISO/IEC 23009-1 standard, where HTTP is the Hypertext Transfer Protocol.
Application scenarios have been proposed in which additional overlays (for example, pictures or videos) can be superimposed on certain areas of a virtual reality (VR) video. The current OMAF standard already defines the basic data structure of the overlay, which defines some basic attributes of the overlay structure (for example, the number of overlays, the ID number, the control symbols, and the control structures). The semantics of the control symbol syntax element overlay_control_flag define the specific function of each structure. After the terminal parses an overlay, it can determine how to process the overlay according to these syntax elements.
However, the above syntax elements all describe the operations and processing that can be performed on a single overlay. When the user needs to perform the same operation on one or more overlays, the current method requires operating on the overlays one by one, which increases the complexity of the operation and degrades the user's experience when watching the video.
Summary of the Invention
The embodiments of the present application provide a method, a terminal, and a server for processing media data, so as to reduce the operation complexity that arises when a user must perform the same operation on each overlay separately to achieve a given purpose, make the user's operations on overlays more efficient, and improve the user's subjective experience.
To achieve the foregoing objective, the embodiments of the present application provide the following technical solutions:
In a first aspect, an embodiment of the present application provides a method for processing media data, including: a terminal receives at least two overlays corresponding to media data, where the overlay corresponds to first information, or the overlay corresponds to second information and third information. The first information includes group identification information of the overlay; the second information is used to indicate the operation function corresponding to the overlay; and the third information is used to indicate the group identification information of the overlay or the identification information of other overlays that belong to the same group as the overlay. When the overlay corresponds to the first information, the terminal processes the at least two overlays according to the first information of the at least two overlays; or, when the overlay corresponds to the second information and the third information, the terminal processes the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
An embodiment of the present application provides a method for processing media data in which the terminal, according to the first information corresponding to each of the at least two overlays, can process together the one or more overlays that have the same group identification information. Compared with the prior art, in which applying the same processing to one or more overlays requires processing each overlay one by one, this reduces the complexity of the operation and improves the user's subjective experience.
It should be understood that the terminal may perform the same processing on the overlays that belong to the same group according to the first information of the at least two overlays. For example, the terminal may show all overlays in the same group at group granularity, close all overlays in the same group at group granularity, or scale all overlays in the same group at group granularity.
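Processing at group granularity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `group_overlays`, `apply_to_group`, the overlay names, and the use of a scale factor as per-overlay state are all hypothetical and do not come from the application.

```python
from collections import defaultdict


def group_overlays(first_info):
    """Map each group ID to the overlay IDs that carry it.

    `first_info` maps an overlay ID to its group identification information.
    """
    groups = defaultdict(list)
    for overlay_id, group_id in first_info.items():
        groups[group_id].append(overlay_id)
    return dict(groups)


def apply_to_group(groups, group_id, state, operation):
    """Apply one operation to every overlay in a group; return the affected IDs."""
    for overlay_id in groups[group_id]:
        state[overlay_id] = operation(state[overlay_id])
    return groups[group_id]


# Hypothetical data: two groups of two overlays, each starting at scale 1.0.
first_info = {"overlay1": 1, "overlay2": 1, "overlay3": 2, "overlay4": 2}
scale = {overlay_id: 1.0 for overlay_id in first_info}
groups = group_overlays(first_info)
# One user action doubles the size of every overlay in group 1.
affected = apply_to_group(groups, 1, scale, lambda s: s * 2.0)
```

A single call replaces the per-overlay loop that the user would otherwise have to drive manually, which is the reduction in operation complexity the text describes.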
In a possible implementation, the terminal processing the at least two overlays according to the first information of the at least two overlays includes: the terminal displays at least one group, together with information indicating the operation function corresponding to each of the at least one group and the overlays that belong to each group. The at least one group is determined by the first information corresponding to each of the at least two overlays; the operation function corresponding to a group is determined by the overlay structure included in the overlays in that group.
In a possible implementation, the terminal processing the at least two overlays according to the second information and the third information corresponding to the at least two overlays includes: the terminal displays at least one group, together with information indicating the operation function corresponding to each of the at least one group and the overlays that belong to each group. The at least one group is determined by the third information corresponding to each of the at least two overlays; the operation function corresponding to a group is determined by the overlay-associated region control structure included in the overlays in that group.
It should be understood that the terminal may display, on a display interface, at least one group and information indicating the operation function corresponding to each group. In addition, it may display information indicating the overlays in each group, so that the user can learn which operation function each group corresponds to and which overlays each group contains.
In a possible implementation, the terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, includes: when any group is triggered, all overlays in that group respond to the operation function of that group.
In a possible implementation, the terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, includes: when any overlay in any group is triggered, the other overlays that belong to the same group as that overlay also respond to the operation function of that group.
In a possible implementation, the operation function is display, and the method provided in this embodiment further includes: when any one of the at least one group is triggered, the overlays in that group are displayed. In this way, the terminal can, based on a trigger operation, display all overlays that belong to the same group at group granularity. It should be understood that display is used here only as an example of an overlay's operation function; in practice, the operation function of an overlay may also be size scaling, position change, and the like.
In a possible implementation, the operation function is close, and the method provided in this embodiment further includes: when any overlay is triggered, that overlay as displayed on the terminal and the other overlays that belong to the same group as that overlay are closed. In this way, when multiple overlays are displayed on the terminal, the terminal can, based on a trigger operation, close all overlays that belong to the same group at group granularity. Compared with the prior art, in which each overlay must be closed one by one when multiple overlays are displayed on the terminal, this reduces operation complexity.
In a possible implementation, the overlay further corresponds to fourth information, where the fourth information indicates that, when a first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function. The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, further includes: the terminal processes the at least two overlays according to the first information and the fourth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the fourth information of the at least two overlays. From the fourth information, the terminal can determine that the overlay is to be processed as a group operation.
In a possible implementation, each of the at least one group further corresponds to indication information that indicates a group operation.
It should be understood that, when displaying the at least one group, the terminal may further display, according to the fourth information, the indication information that indicates a group operation for each group.
In a possible implementation, when any overlay performs the first operation function, the other overlays that belong to the same group as that overlay also respond to the first operation function.
In a possible implementation, the overlay further corresponds to fifth information, where the fifth information indicates that, when a first operation function is performed on the overlay, either all overlays in the group to which the overlay belongs respond to the first operation function, or only the overlay performs the first operation function. The terminal processing the at least two overlays according to the first information of the at least two overlays, or according to the second information and the third information corresponding to the at least two overlays, further includes: the terminal processes the at least two overlays according to the first information and the fifth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the fifth information of the at least two overlays. From the fifth information, the terminal can determine that the overlay supports both group operation and individual operation.
In a possible implementation, each of the at least one group further corresponds to indication information that indicates a group operation and indication information that indicates an individual operation. This makes it easy for the user to choose whether the overlays are processed by group operation or by individual operation.
It should be understood that, when displaying the at least one group, the terminal may further display, according to the fifth information, the indication information that indicates a group operation and the indication information that indicates an individual operation for each group.
It should be noted that, regardless of whether the terminal processes the at least two overlays according to the first information or according to the second information and the third information, the terminal can determine whether to use individual operation or group operation according to the fourth information or the fifth information corresponding to the overlay.
It should be understood that, when individual operation is triggered and any overlay in a group performs the first operation function, the other overlays in the same group do not respond to the first operation function. When group operation is triggered and any overlay in a group performs the first operation function, the other overlays in the same group also respond to the first operation function. This makes it possible to select group operation or individual operation based on the user's instruction.
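The choice between group operation and individual operation can be sketched as follows. This is only an illustration under stated assumptions: the `Mode` enum, the `individual_allowed` flag (a hypothetical boolean derived from the fifth information), and the function names are not defined by the application.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    GROUP = "group"
    INDIVIDUAL = "individual"


@dataclass
class Overlay:
    overlay_id: int
    group_id: int
    individual_allowed: bool = False  # hypothetical flag derived from the "fifth information"
    closed: bool = False


def resolve_targets(overlays, triggered, requested):
    """Return the overlays that respond when `triggered` performs an operation."""
    if requested is Mode.INDIVIDUAL and triggered.individual_allowed:
        return [triggered]
    # Group operation: every overlay in the same group responds.
    return [ov for ov in overlays if ov.group_id == triggered.group_id]


def close(overlays, triggered, requested):
    for ov in resolve_targets(overlays, triggered, requested):
        ov.closed = True


overlays = [
    Overlay(1, 10, individual_allowed=True),
    Overlay(2, 10, individual_allowed=True),
    Overlay(3, 20),
]
close(overlays, overlays[0], Mode.INDIVIDUAL)
individual_result = [ov.closed for ov in overlays]  # [True, False, False]
close(overlays, overlays[1], Mode.GROUP)
group_result = [ov.closed for ov in overlays]       # [True, True, False]
```

In this sketch an overlay that does not allow individual operation falls back to group operation, which is one possible reading of an overlay that carries only the fourth information.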
In a possible implementation, when the overlay corresponds to the first information, the overlay further corresponds to sixth information, where the sixth information indicates the operation function corresponding to the overlay. The terminal processing the at least two overlays according to the first information of the at least two overlays includes: the terminal processes the at least two overlays according to the first information and the sixth information of the at least two overlays; or the terminal processes the at least two overlays according to the second information, the third information, and the sixth information of the at least two overlays. In this way, when the terminal parses the sixth information, it can determine the operation function of each overlay based on the operation function that the sixth information indicates. When displaying the at least one group, the terminal can also determine the operation function of a group based on the sixth information.
In a possible implementation, the second information and the third information are carried in the file format of the overlay.
It should be understood that, when the overlay is encapsulated in the OMAF standard file format, the second information and the third information are carried in the OMAF file format of the overlay.
In a possible implementation, the file format includes an overlay structure, and an overlay-associated region control structure and an overlay group box that are located in the overlay structure. The second information is located in the overlay-associated region control structure, and the third information is located in the overlay group box. It should be understood that, in this case, the terminal can parse the overlay structure to obtain the overlay-associated region control structure and the third information, determine the operation function of the overlay from the overlay-associated region control structure, and obtain the group identification information of the overlay from the third information, thereby determining the group to which the overlay belongs.
In another implementation, the third information is located in an overlay control structure included in the overlay structure.
In a possible implementation, the first information is carried in the file format of the overlay. It should be understood that the first information is located in an overlay group box included in the overlay. In this case, the overlay structure need not carry an overlay-associated region control structure. It should be understood that, when the first information is carried in the file format, the terminal can, after receiving the overlay, parse the overlay group box of the overlay to obtain the group identification information of the overlay, and then determine the group to which the overlay belongs according to the group identification information.
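To make the idea of reading the group identification information from an overlay group box concrete, the following Python sketch packs and parses a box with the usual ISO base media file format header (a 32-bit size followed by a four-character type). Everything else is an assumption for illustration: the four-character code `ovgr`, the 32-bit group identifier, and the null-terminated group name are hypothetical and are not the syntax defined by this application or by OMAF.

```python
import struct

OVERLAY_GROUP_BOX_TYPE = b"ovgr"  # hypothetical four-character code


def pack_group_box(group_id, group_name=""):
    """Serialize a hypothetical overlay group box: size, type, id, name."""
    payload = struct.pack(">I", group_id) + group_name.encode("utf-8") + b"\x00"
    return struct.pack(">I4s", 8 + len(payload), OVERLAY_GROUP_BOX_TYPE) + payload


def parse_group_box(data):
    """Return (group_id, group_name) from a hypothetical overlay group box."""
    if len(data) < 13:  # 8-byte header + 4-byte id + 1-byte terminator
        raise ValueError("buffer too short for an overlay group box")
    size, box_type = struct.unpack_from(">I4s", data, 0)
    if box_type != OVERLAY_GROUP_BOX_TYPE or size < 13 or size > len(data):
        raise ValueError("not a well-formed overlay group box")
    (group_id,) = struct.unpack_from(">I", data, 8)
    name = data[12:size].split(b"\x00", 1)[0].decode("utf-8")
    return group_id, name


box = pack_group_box(10, "subtitles")
group_id, group_name = parse_group_box(box)
```

A terminal following this sketch would read `group_id` to decide which group the overlay joins; a real parser would follow the box syntax given in the applicable file format specification instead.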
In a possible implementation, the third information is carried in supplemental enhancement information (SEI) of the overlay bitstream corresponding to the overlay, and the second information is carried in the overlay-associated region control structure of the overlay.
It should be understood that, in this case, the overlay structure of the overlay includes the overlay-associated region control structure, and the operation function corresponding to each overlay can be indicated by that structure. For example, the overlay-associated region control structure may indicate that the operation function is display or close.
In a possible implementation, the first information is carried in the SEI of the overlay bitstream corresponding to the overlay. It should be understood that, in this case, the overlay structure of the overlay need not include an overlay-associated region control structure. Optionally, the overlay structure may include an overlay control symbol that indicates the operation function corresponding to the overlay.
It should be understood that the payload type of the SEI indicates that the SEI carries the group identification information of the overlay. In this way, when the terminal processes the overlay and parses the SEI, it can determine from the payload type that the SEI carries the group identification information of the overlay, and then parse the SEI further to obtain that group identification information.
It should be understood that, when the overlay structure of the overlay does not include an overlay-associated region control structure, the payload type of the SEI is also used to indicate an attribute of the group. For example, the group attribute indicated by the SEI payload type is a co-presentation group or a co-interaction group. If the group is a co-interaction group, an interactive operation can be performed on the group. If the group is a co-presentation group, the overlays in the group can be displayed or closed as a group operation.
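The role of the SEI payload type can be sketched as a simple dispatch. The payload type values, the attribute names, and the payload layout (a 32-bit big-endian group identifier) below are hypothetical and chosen only for illustration; the application does not specify them, and no real codec assigns these numbers for this purpose.

```python
from dataclasses import dataclass

# Hypothetical payload type values; not taken from any codec specification.
PAYLOAD_TYPE_SHOWN_TOGETHER = 200       # group whose overlays are displayed/closed together
PAYLOAD_TYPE_INTERACTED_TOGETHER = 201  # group whose overlays are interacted with together

GROUP_ATTRIBUTE_BY_PAYLOAD_TYPE = {
    PAYLOAD_TYPE_SHOWN_TOGETHER: "presentation",
    PAYLOAD_TYPE_INTERACTED_TOGETHER: "interaction",
}


@dataclass
class SeiMessage:
    payload_type: int
    payload: bytes


def parse_group_sei(sei):
    """Return (group_attribute, group_id), or None if the SEI is unrelated."""
    attribute = GROUP_ATTRIBUTE_BY_PAYLOAD_TYPE.get(sei.payload_type)
    if attribute is None:
        return None  # payload type does not signal overlay group information
    if len(sei.payload) < 4:
        raise ValueError("payload too short for a group identifier")
    # Assumed layout: the first four bytes are a big-endian group identifier.
    group_id = int.from_bytes(sei.payload[:4], "big")
    return attribute, group_id


result = parse_group_sei(SeiMessage(PAYLOAD_TYPE_SHOWN_TOGETHER, (7).to_bytes(4, "big")))
```

The payload type is checked first, so SEI messages that carry other data are skipped without being parsed further.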
In a possible implementation, the third information is carried in a media presentation description (MPD) of the media data stream that contains the overlay, and the second information is carried in the overlay-associated region control structure of the overlay. It should be understood that, in this case, the overlay structure of the overlay includes the overlay-associated region control structure, and the operation function corresponding to each overlay can be indicated by that structure. For example, the overlay-associated region control structure may indicate that the operation function is display or close.
In a possible implementation, the first information is carried in the MPD of the media data stream that contains the overlay. It should be understood that the overlay structure of the overlay need not include an overlay-associated region control structure. Optionally, the overlay structure may include an overlay control symbol that indicates the operation function corresponding to the overlay.
In a possible implementation, the first information or the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
It should be understood that, when the first information or the third information is carried in the MPD, the server also needs to send, before sending the bitstream to the terminal, the MPD corresponding to the media data that includes the at least two overlays. The MPD includes the first information or the third information of each of the at least two overlays. This is not limited in the embodiments of this application. The bitstream here includes information about the at least two overlays, where the information about each overlay is the information defined in the overlay structure corresponding to that overlay.
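As an illustration of reading group identification information from an MPD, the sketch below uses Python's standard `xml.etree.ElementTree` to collect adaptation sets by group. The `SupplementalProperty` descriptor with the scheme URI `urn:example:overlay:group` and the use of `@value` as the group identifier are assumptions for this example; the actual overlay descriptor syntax is not given in this section.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DASH_NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}
OVERLAY_GROUP_SCHEME = "urn:example:overlay:group"  # hypothetical scheme URI

MPD_XML = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet id="1">
      <SupplementalProperty schemeIdUri="urn:example:overlay:group" value="10"/>
    </AdaptationSet>
    <AdaptationSet id="2">
      <SupplementalProperty schemeIdUri="urn:example:overlay:group" value="10"/>
    </AdaptationSet>
    <AdaptationSet id="3">
      <SupplementalProperty schemeIdUri="urn:example:overlay:group" value="20"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def groups_from_mpd(xml_text):
    """Map each group identifier to the adaptation set ids that carry it."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for aset in root.iterfind(".//mpd:AdaptationSet", DASH_NS):
        for prop in aset.iterfind("mpd:SupplementalProperty", DASH_NS):
            if prop.get("schemeIdUri") == OVERLAY_GROUP_SCHEME:
                groups[prop.get("value")].append(aset.get("id"))
    return dict(groups)


mpd_groups = groups_from_mpd(MPD_XML)  # {"10": ["1", "2"], "20": ["3"]}
```

Because the MPD arrives before the bitstream, a terminal could build this grouping before it receives any overlay data.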
In a possible implementation, the group identification information of the overlay includes at least one piece of group identification information. It should be understood that, when the overlay corresponds to at least two pieces of group identification information, one overlay can belong to at least two groups.
In a possible implementation, the overlay corresponds to multiple groups. When the overlay is triggered, the overlay responds to the operation function corresponding to each of the multiple groups, or all overlays in the multiple groups respond to the operation function corresponding to their own group.
In a possible implementation, the overlay belongs to a first group and a second group, and different groups correspond to different operation functions. When the overlay in the first group is triggered, all overlays in the first group respond to the operation function corresponding to the first group, and all overlays in the second group respond to the operation function corresponding to the second group. It should be understood that, if the fifth information carried in the overlay indicates that both group operation and individual operation are available, then when group operation is triggered the terminal processes the at least two overlays by group operation, and when individual operation is triggered the terminal processes the at least two overlays by individual operation. For the specific process, refer to the corresponding description above; details are not repeated here.
In a possible implementation, when the overlay in the first group is triggered, the overlay responds to the operation function corresponding to the first group, and the same overlay, as a member of the second group, responds to the operation function corresponding to the second group.
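The two behaviours for an overlay that belongs to several groups can be sketched as follows. The `group_ids` tuple, the `group_ops` mapping from group identifier to an operation name, and the `propagate` switch are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Overlay:
    overlay_id: int
    group_ids: tuple                      # an overlay may carry several group identifiers
    applied: list = field(default_factory=list)


def trigger(overlays, triggered, group_ops, propagate):
    """Apply each group's operation when `triggered` is activated.

    propagate=True:  every member of each of the triggered overlay's groups responds.
    propagate=False: only the triggered overlay responds, once per group it belongs to.
    """
    for gid in triggered.group_ids:
        operation = group_ops[gid]
        if propagate:
            members = [ov for ov in overlays if gid in ov.group_ids]
        else:
            members = [triggered]
        for ov in members:
            ov.applied.append(operation)


a = Overlay(1, (1, 2))
b = Overlay(2, (1,))
c = Overlay(3, (2,))
group_ops = {1: "show", 2: "scale"}
trigger([a, b, c], a, group_ops, propagate=True)
# a.applied == ["show", "scale"], b.applied == ["show"], c.applied == ["scale"]
```

Calling `trigger` with `propagate=False` instead leaves `b` and `c` untouched while `a` still receives both operations.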
In a possible implementation, the file format of the overlay further includes an overlay group box, where the overlay group box carries name information of the group of the overlay; the method further includes: the terminal displays the group name indicated by the name information of the group of the overlay.
In a second aspect, an embodiment of this application provides a method for processing media data, including: a server obtains media data; the server processes the media data to obtain at least two overlays corresponding to the media data, where each overlay corresponds to first information, or each overlay corresponds to second information and third information; the first information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay, the second information indicates an operation function corresponding to the overlay, and the third information indicates the group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay; overlays in the same group correspond to the same operation function; and the server sends the at least two overlays to a terminal.
The server processing the media data may refer to encoding the media data and encapsulating the media data stream obtained after encoding.
In this embodiment, after obtaining the media data, the server can process it so that the one or more overlays obtained after processing correspond to group identification information, or so that the at least two overlays obtained after processing correspond to group identification information and an operation function. In this way, when the terminal obtains the one or more overlays, it can process the at least two overlays based on the first information, or based on the second information and the third information. Because the first information and the third information both indicate group identification information, the terminal can process the at least two overlays at group granularity.
In a possible implementation, the second information and the third information are carried in the file format of the overlay. It should be understood that the server may encode the media data and encapsulate the resulting media data stream in a video file format (for example, the OMAF standard file format), so that the second information and the third information are carried in the file format obtained by encapsulating the overlay.
In a possible implementation, the file format includes an overlay structure, an overlay-associated region control structure located in the overlay structure, and an overlay group box. The third information is located in the overlay group box, and the second information is located in the overlay-associated region control structure.
In another implementation, the third information is located in an overlay control structure included in the overlay structure.
In a possible implementation, the first information is carried in the file format of the overlay. It should be understood that the server may encode the media data and encapsulate the resulting media data stream in a video file format (for example, the OMAF standard file format), so that the first information is carried in the file format obtained by encapsulating the overlay.
In a possible implementation, the file format includes an overlay group box, and the first information is located in the overlay group box. It should be understood that, in this case, the file format also includes an overlay structure.
In a possible implementation, the third information is carried in the supplemental enhancement information (SEI) of the overlay bitstream corresponding to the overlay, and the second information is carried in the overlay-associated region control structure of the overlay. It should be understood that the server may encode the media data to obtain a media data stream, and then encode the one or more overlays included in the media data to obtain the overlay bitstream corresponding to each overlay. During encoding, the server carries the third information in the SEI included in the overlay bitstream corresponding to the overlay. The server then encapsulates the encoded media data stream and the overlay bitstream corresponding to each overlay in a video file format (for example, the OMAF standard file format), so that the second information is carried in the file format obtained by encapsulating the overlay.
In a possible implementation, the first information is carried in the SEI of the overlay bitstream corresponding to the overlay. It should be understood that the server may encode the media data to obtain a media data stream, and then encode the one or more overlays included in the media data to obtain the overlay bitstream corresponding to each overlay. During encoding, the server carries the third information in the SEI included in the overlay bitstream corresponding to the overlay. The server then encapsulates the encoded media data stream and the overlay bitstream corresponding to each overlay; in this case, the encapsulated overlay has an overlay structure.
In a possible implementation, the payload type of the SEI indicates that the SEI carries the group identification information of the overlay. It should be understood that the payload type of the SEI may also be used to indicate an attribute of the group.
In a possible implementation, the third information is carried in the media presentation description (MPD) corresponding to the media data stream that contains the overlay, and the second information is carried in the overlay-associated region control structure of the overlay. It should be understood that the server may encapsulate the overlay based on Dynamic Adaptive Streaming over HTTP (DASH) to obtain the MPD, and then carry the third information in the MPD. For the process of carrying the second information in the overlay-associated region control structure of the overlay, refer to the foregoing description; details are not repeated here.
In a possible implementation, the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
In a possible implementation, the first information is carried in the MPD corresponding to the media data stream that contains the overlay. For the specific encapsulation process, refer to the foregoing process in which the third information is carried in the MPD; details are not repeated here. In this case, the encapsulated overlay has an overlay structure.
In a possible implementation, the first information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
In a possible implementation, the overlay further corresponds to fourth information, where the fourth information indicates that, when a first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function. It should be understood that, in this way, the terminal can determine that the overlay is to be operated on as a group.
In a possible implementation, the overlay further corresponds to fifth information, where the fifth information indicates that, when a first operation function is performed on the overlay, either all overlays in the group to which the overlay belongs respond to the first operation function, or only the overlay responds to the first operation function. It should be understood that, in this way, the terminal can determine that the overlay supports both group operation and individual operation.
In a possible implementation, when the overlay corresponds to the first information, the overlay further corresponds to sixth information, and the sixth information indicates an operation function corresponding to the overlay.
In a possible implementation, the file format of the overlay further includes an overlay group box, and the overlay group box carries name information of the group to which the overlay belongs.
It should be understood that the foregoing definitions and explanations of the corresponding content in the implementations of the first aspect apply equally to the implementations of the second aspect and to the implementations of the third and fourth aspects described below.
According to a third aspect, an embodiment of the present application provides a terminal, and the terminal includes modules for performing the method in any one of the implementations of the first aspect.
It should be understood that a terminal is a device capable of presenting media data (for example, video images) and/or one or more overlays to a user.
According to a fourth aspect, an embodiment of the present application provides a server, and the server includes modules for performing the method in any one of the implementations of the second aspect.
It should be understood that a server is a device capable of storing media data and processing one or more overlays corresponding to the media data. The server may provide the video images and the processed one or more overlays to a terminal, so that the terminal can present the media data and the one or more overlays to a user.
According to a fifth aspect, a terminal is provided, including a non-volatile memory and a processor that are coupled to each other, where the processor is configured to call program code stored in the memory to perform some or all of the steps of the method in any one of the implementations of the first aspect.
According to a sixth aspect, a server is provided, including a non-volatile memory and a processor that are coupled to each other, where the processor is configured to call program code stored in the memory to perform some or all of the steps of the method in any one of the implementations of the second aspect.
According to a seventh aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores program code, and the program code includes instructions for performing some or all of the steps of the method in any one of the implementations of the first aspect.
According to an eighth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores program code, and the program code includes instructions for performing some or all of the steps of the method in any one of the implementations of the second aspect.
According to a ninth aspect, a computer program product is provided. When the computer program product runs on a computer, the computer is caused to perform some or all of the steps of the method in any one of the implementations of the first aspect.
According to a tenth aspect, a computer program product is provided. When the computer program product runs on a computer, the computer is caused to perform some or all of the steps of the method in any one of the implementations of the second aspect.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of a communication system according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a communication device according to an embodiment of the present application;
FIG. 3 is a first schematic flowchart of a media data processing method according to an embodiment of the present application;
FIG. 4 is a second schematic flowchart of a media data processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a display interface according to an embodiment of the present application;
FIG. 6 is a schematic diagram of another display interface according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an apparatus for processing media data according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of another apparatus for processing media data according to an embodiment of the present application.
DETAILED DESCRIPTION
Before the embodiments of the present application are described, the terms used in the embodiments are explained:
1) Panoramic video: also called 360-degree panoramic video, it consists of a series of panoramic pictures whose content covers the entire spherical surface in three-dimensional space. It is video captured in all directions (360 degrees) with a 3D camera, and while watching it the user can freely adjust the viewing direction up, down, left, and right.
2) Media presentation description (MPD): a document specified in ISO/IEC 23009-1 that contains the metadata a client uses to construct HTTP-URLs. An MPD contains one or more period elements; each period contains one or more adaptation sets; each adaptation set contains one or more representations; and each representation contains one or more segments. Based on the information in the MPD, the client selects a representation and constructs the HTTP-URLs of its segments.
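The period / adaptation set / representation hierarchy described above can be illustrated with a short sketch that walks a small MPD using only the Python standard library. The MPD text, the `id` values, and the function name `list_representations` are made up for this example; only the element names and the namespace `urn:mpeg:dash:schema:mpd:2011` come from the DASH standard.

```python
import xml.etree.ElementTree as ET

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}

# Illustrative MPD with one period, two adaptation sets, three representations.
MPD_XML = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period id="p0">
    <AdaptationSet id="1" mimeType="video/mp4">
      <Representation id="bg-video" bandwidth="5000000"/>
      <Representation id="bg-video-low" bandwidth="1000000"/>
    </AdaptationSet>
    <AdaptationSet id="2" mimeType="video/mp4">
      <Representation id="overlay-1" bandwidth="300000"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def list_representations(mpd_text):
    """Return (period id, adaptation set id, representation id) tuples in document order."""
    root = ET.fromstring(mpd_text)
    result = []
    for period in root.findall("dash:Period", NS):
        for aset in period.findall("dash:AdaptationSet", NS):
            for rep in aset.findall("dash:Representation", NS):
                result.append((period.get("id"), aset.get("id"), rep.get("id")))
    return result
```

A client would typically pick one representation per adaptation set from such a listing, for example by comparing the `bandwidth` attribute against the available throughput.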
3) Track: defined in ISO/IEC 14496-12 as: "timed sequence of related samples (q.v.) in an ISO base media file. NOTE: For media data, a track corresponds to a sequence of images or sampled audio; for hint tracks, a track corresponds to a streaming channel."
Explanation: a track is a series of timed samples encapsulated according to ISOBMFF. For a video track, for example, the video samples are the bitstream produced by the video encoder for each encoded frame, and all video samples are encapsulated according to the ISOBMFF specification to produce the samples of the track.
4) Sample: the data associated with a timestamp. ISO/IEC 14496-12 gives the following definition and notes:
"all the data associated with a single timestamp"
NOTE 1: No two samples within a track can share the same time-stamp.
NOTE 2: In non-hint tracks, a sample is, for example, an individual frame of video, a series of video frames in decoding order, or a compressed section of audio in decoding order; in hint tracks, a sample defines the formation of one or more streaming packets.
5) Box: defined in ISO/IEC 14496-12 as: "object-oriented building block defined by a unique type identifier and length. NOTE: Called 'atom' in some specifications, including the first definition of MP4."
An ISOBMFF file consists of multiple boxes, and a box can contain other boxes.
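To make the notion of a box defined by a type identifier and a length concrete, the following sketch iterates over ISOBMFF boxes in a byte buffer. It relies on the standard box header layout (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning the box extends to the end of its container). The helper names `iter_boxes` and `make_box` are hypothetical, and the payloads in the sample buffer are placeholders rather than valid box contents.

```python
import struct


def iter_boxes(data, offset=0, end=None):
    """Yield (type, payload_start, box_end) for each box in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # The box extends to the end of the enclosing container.
            size = end - offset
        if size < header or offset + size > end:
            raise ValueError("malformed box at offset %d" % offset)
        yield box_type.decode("ascii"), offset + header, offset + size
        offset += size


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size field (placeholder payload, for testing only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A top-level 'ftyp' box followed by a 'moov' container holding one 'trak' box.
sample = make_box(b"ftyp", b"isom") + make_box(b"moov", make_box(b"trak"))
top = list(iter_boxes(sample))
moov = top[1]
children = list(iter_boxes(sample, moov[1], moov[2]))
```

Because a container box's payload is itself a sequence of boxes, the same function is reused on the payload range of `moov` to list its children.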
6) SEI: supplemental enhancement information, a type of network abstraction layer unit (NALU) defined in the video coding standards H.264 and H.265.
7) Overlay: media content superimposed on a background video (specifically, an additional layer of video or picture rendered on top of a region of the background video picture). OMAF gives the following definition:
"piece of visual media rendered over omnidirectional video or image item or over a viewport".
For example, an overlay may also be information such as the name and age of an object shown in the background video.
8) Background visual media: the video on which an overlay can be superimposed. OMAF gives the following definition: "piece of visual media on which an overlay is superimposed".
In the present application, the terms "of", "relevant", and "corresponding" may sometimes be used interchangeably. It should be noted that, where the difference between them is not emphasized, they express the same meaning.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, an illustration, or an explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as preferred over, or more advantageous than, other embodiments or designs. Rather, such words are intended to present the relevant concept in a concrete manner.
In the present application, "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate that A exists alone, that both A and B exist, or that B exists alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following" or a similar expression refers to any combination of the listed items, including any combination of a single item or a plurality of items. For example, at least one of a, b, or c may indicate: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c may be singular or plural. In addition, to describe the technical solutions of the embodiments clearly, the terms "first", "second", and the like are used to distinguish between identical or similar items whose functions and effects are substantially the same. A person skilled in the art will understand that the terms "first", "second", and the like do not limit quantity or order of execution, and do not require the items to be different.
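The enumeration of "at least one of a, b, or c" can be checked mechanically. The following sketch, in which the function name `at_least_one_of` is hypothetical, lists every non-empty combination of the given items.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of items, shortest combinations first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


options = at_least_one_of(["a", "b", "c"])
```

For three items this yields the seven cases listed in the text: three single items, three pairs, and the full set.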
FIG. 1 is a schematic diagram of a communication system according to an embodiment of the present application. The communication system includes a server 100 and at least one terminal 200 that communicates with the server 100.
The server 100 may be a media server capable of processing panoramic video. The terminal 200 may be a device capable of playing panoramic video. For example, the terminal 200 may be a network-connectable electronic device such as VR glasses, a mobile phone, a tablet, a television, or a computer. The terminal 200 receives the data sent by the media server, decapsulates the bitstream, and decodes and displays it.
The server 100 includes a pre-encoding processor 1001, a video encoder 1002, a bitstream encapsulation device 1003, and a transmission device 1004.
The pre-encoding processor 1001 pre-processes the panoramic video, for example by image stitching and format conversion, to convert the original panoramic video into video that can be compression-encoded. The video encoder 1002 compression-encodes or transcodes the panoramic video content produced by the pre-encoding processor 1001 and outputs the encoded video bitstream. The bitstream encapsulation device 1003 encapsulates the encoded bitstream data into a transmittable file, which is transmitted over the network to the terminal or to a content delivery network. In addition, the server 100 may select the content to be transmitted according to information fed back by the terminal 200 (for example, the user's viewing angle).
The terminal 200 includes a receiving device 2001, a bitstream decapsulation device 2002, a video decoder 2003, and a display device 2004.
The receiving device 2001 is configured to receive the media data sent by the server 100. The bitstream decapsulation device 2002 is configured to decapsulate the media data received by the receiving device 2001 to obtain a video bitstream and the bitstream information corresponding to that bitstream. The video decoder 2003 is configured to decode the video bitstream and output video image frames for display and playback.
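The division of work between the server 100 and the terminal 200 can be summarized in a toy round trip. This is only a sketch under strong simplifications: `zlib` stands in for the video codec, a four-byte `FILE` prefix stands in for file encapsulation, and the function names `server_side` and `terminal_side` are hypothetical.

```python
import zlib

MAGIC = b"FILE"  # hypothetical stand-in for a real container format


def server_side(raw_video: bytes) -> bytes:
    preprocessed = raw_video                  # 1001: stitching / format conversion omitted
    bitstream = zlib.compress(preprocessed)   # 1002: stand-in for video encoding
    packaged = MAGIC + bitstream              # 1003: stand-in for file encapsulation
    return packaged                           # 1004: handed to the network


def terminal_side(received: bytes) -> bytes:
    if received[:4] != MAGIC:                 # 2001: media data as received
        raise ValueError("unexpected container format")
    bitstream = received[4:]                  # 2002: decapsulation
    frames = zlib.decompress(bitstream)       # 2003: stand-in for video decoding
    return frames                             # 2004: would be passed to the display


round_trip = terminal_side(server_side(b"panoramic frames"))
```

Each terminal stage undoes the corresponding server stage, which is why the two sides must agree on both the container format and the codec.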
FIG. 2 is a schematic diagram of the hardware structure of an apparatus for processing media data according to an embodiment of the present application. The apparatus shown in FIG. 2 may be regarded as a computer device. It may serve as an implementation of the server 100 or the terminal 200 in the embodiments of the present application, or as an implementation of the method for processing media data in the embodiments of the present application. The apparatus includes a processor 110, a memory 120, an input/output interface 130, and a bus 150. Optionally, the apparatus may further include a communication interface 140. It should be noted that, when the apparatus serves as the terminal 200, it may further include a display 160 for displaying the video data to be played, for example, a background video and one or more overlays.
The processor 110, the memory 120, the input/output interface 130, the communication interface 140, and the display 160 are communicatively connected to each other through the bus 150.
The processor 110 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the functions to be performed by the modules in the server of the embodiments of the present application, or to perform the method for processing media data in the method embodiments of the present application. The processor 110 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing method may be completed by integrated logic circuits of hardware in the processor 110 or by instructions in the form of software. The processor 110 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 120. The processor 110 reads the information in the memory 120 and, in combination with its hardware, completes the functions to be performed by the modules included in the server of the embodiments of the present application, or performs the method for processing media data in the method embodiments of the present application.
The memory 120 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 120 may store an operating system and other application programs. When the functions to be performed by the modules included in the server of the embodiments of the present application, or the method for processing media data in the method embodiments of the present application, are implemented by software or firmware, the program code for implementing the technical solutions provided in the embodiments of the present application is stored in the memory 120, and the processor 110 performs the operations required of the modules included in the server 100 or performs the method for processing media data provided in the method embodiments of the present application.
The input/output interface 130 is configured to receive input data and information and to output data such as operation results.
The communication interface 140 uses a transceiving apparatus, such as but not limited to a transceiver, to implement communication between the apparatus for processing media data and other devices or a communication network. It may serve as the obtaining module or the sending module of the apparatus for processing media data.
The bus 150 may include a path for transferring information between the components of the apparatus for processing media data (for example, the processor 110, the memory 120, the input/output interface 130, and the communication interface 140).
It should be noted that, although the apparatus for processing media data shown in FIG. 2 shows only the processor 110, the memory 120, the input/output interface 130, the communication interface 140, and the bus 150, a person skilled in the art will understand that, in a specific implementation, the apparatus further includes other components necessary for normal operation. A person skilled in the art will also understand that, depending on specific needs, the apparatus may further include hardware components that implement other additional functions. In addition, a person skilled in the art will understand that the apparatus may include only the components necessary to implement the embodiments of the present application, and need not include all of the components shown in FIG. 2.
For example, the apparatus for processing media data may further include one or more network interface cards for forming a session channel between the server 100 and the terminal 200 in order to transmit media services.
An overlay in the embodiments of the present application is overlay media content superimposed on background media content. An overlay may be encoded separately as media content, or it may be part of the background media content. If the overlay is part of the background media content, the overlay need not be encoded separately, and the media data bitstream obtained after the server encapsulates the media data includes the overlay information. If the overlay is encoded separately as media content, an overlay bitstream is obtained for each overlay.
Here, media content is the content displayed when media data is played.
Before the embodiments of the present application are described, the overlay structure of each overlay in the embodiments is introduced first.
The current OMAF standard document already defines the basic data structure of an overlay (the overlay structure for short) and how it is carried, as shown in Table 1 below:
Table 1
Figure PCTCN2019108514-appb-000001
The overlay structure shown in Table 1 defines some basic attributes of an overlay, including the number of overlays (num), identification information (for example, an ID), overlay control flags, and overlay control structures. The value of the syntax element overlay_control_flag indicates the function of the corresponding overlay control structure. For example, the semantics of overlay_control_flag cover the overlay's associated source, layering order, transparency, user interaction information, label, priority, and so on, as shown in Table 2:
Table 2: Semantics of the overlay control flags defined in OMAF
Figure PCTCN2019108514-appb-000002
Figure PCTCN2019108514-appb-000003
Figure PCTCN2019108514-appb-000004
When i is 7 and bit i of overlay_control_flag is 1, an interaction control structure (the OverlayInteraction control structure) is defined; the OverlayInteraction control structure can be understood as one kind of overlay control structure. The OverlayInteraction control structure contains the interaction types through which a user may operate the overlay. The structure is shown in Table 3:
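A sketch of the check described above, namely whether flag index 7 is set and the OverlayInteraction control structure is therefore present, might look as follows. The bit order used when unpacking bytes (most significant bit first) is an assumption made for this example, and the names `unpack_flags` and `has_interaction_struct` are hypothetical.

```python
INTERACTION_INDEX = 7  # the index i named in the text above


def unpack_flags(flag_bytes):
    """Expand bytes into a list of flag bits, assuming most-significant-bit-first order."""
    return [(byte >> (7 - bit)) & 1 for byte in flag_bytes for bit in range(8)]


def has_interaction_struct(flags):
    """Return True if flag 7 is set, i.e. an interaction control structure is signalled."""
    return len(flags) > INTERACTION_INDEX and flags[INTERACTION_INDEX] == 1


flags = unpack_flags(bytes([0b00000001, 0b00000000]))
```

A parser would read the OverlayInteraction control structure only when this check succeeds and would otherwise skip it.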
Table 3
Figure PCTCN2019108514-appb-000005
For example, the semantics of the syntax elements in Table 3 are shown in Table 4 below:
Table 4
Figure PCTCN2019108514-appb-000006
It should be understood that Table 4 lists only some operation functions; in practice, other operation functions for an overlay may also exist.
FIG. 3 is a schematic flowchart of a method for processing media data according to an embodiment of the present application. The method includes:
Step 101: The server obtains media data.
Optionally, the media data may be a video image, for example, a panoramic video. The one or more overlays corresponding to the media data may be one or more overlay layers displayed on the media data. For example, an overlay may be a video or a picture displayed on the media data; if the content of the media data shows a player, the picture overlaid on it may show the player's name or age.
Step 102: The server processes the media data to obtain at least two overlays corresponding to the media data.
An overlay is a video, an image, or text that is superimposed on a background video or a background image for display.
It should be understood that processing the media data includes operations such as pre-processing, encoding, and encapsulating the media data.
In one example, the overlay corresponds to first information.
For example, the first information includes group identification information of the overlay, or identification information of the other overlays that belong to the same group as the overlay.
For example, the identification information of the other overlays is used to determine the other overlays that belong to the same group as the overlay. For example, if the identification information of the other overlays corresponding to overlay1 identifies overlay2 and overlay3, then overlay1, overlay2, and overlay3 belong to the same group.
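As a minimal sketch, assume that overlay 1 lists overlays 2 and 3 as the other members of its group. The group can then be recovered by combining the overlay's own ID with the listed IDs. The dictionary layout and the function name `group_members` are hypothetical and only illustrate the idea.

```python
def group_members(overlay_id, other_ids):
    """Return the sorted IDs of the group formed by an overlay and its listed peers."""
    return sorted({overlay_id, *other_ids})


# Hypothetical first information: overlay ID -> IDs of the other overlays in its group.
first_information = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
groups = {oid: group_members(oid, others) for oid, others in first_information.items()}
```

Every member resolves to the same group, so the terminal can start from whichever overlay the user operates.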
In another example, the overlay corresponds to second information and third information.
The first information and the third information are each used to determine the group of the overlay.
For example, the second information indicates an operation function corresponding to the overlay. The third information indicates group identification information of the overlay, or identification information of the other overlays that belong to the same group as the overlay.
For example, overlays in the same group have the same group identification information.
It should be understood that all overlays in the same group have at least one operation function in common.
It should be understood that the statement that overlays in the same group have the same operation functions may mean that all operation functions of all overlays in the group are identical. For example, overlay1 and overlay2 belong to group 1, and the operation functions of both overlay1 and overlay2 include rotation and window resizing.
Alternatively, the statement may mean that all overlays in the group share at least one operation function. For example, the operation functions of overlay1 include rotation and window resizing, and the operation functions of overlay2 include rotation; when processing the media data, the server may still place overlay1 and overlay2 in group 1.
可选的,群组标识信息用于确定overlay所在的群组。群组标识信息可以为群组ID,或者群组名称,此处不做限定。Optionally, the group identification information is used to determine a group to which the overlay belongs. The group identification information may be a group ID or a group name, which is not limited herein.
应理解,本申请实施例中每个overlay包括overlay结构,该overlay结构中包括用于指示overlay操作功能的指示信息。例如,该操作功能可以通过OverlayInteraction控制结构确定。例如,旋转、自由选择深度、窗口的大小尺寸可以改变等。It should be understood that each overlay in the embodiment of the present application includes an overlay structure, and the overlay structure includes indication information for indicating an overlay operation function. For example, the operation function can be determined through the OverlayInteraction control structure. For example, rotation, free selection depth, window size can be changed, and so on.
具体的,服务器在对媒体数据进行编码,得到媒体数据流包括的一个或者多个overlay,然后得媒体数据流封装时可以确定每个overlay具有的至少一个操作功能,这样对于任意两个或两个以上的overlay,如果任意两个或两个以上的overlay中具有至少一个相同的操作功能,则服务器可以将其分别所在的群组标识信息设置为相同。例如,overlay1和overlay2对应的操作功能为旋转,则服务器可以将overlay1和overlay2对应的第一信息/第三信息用于指示群组1。Specifically, the server encodes the media data to obtain one or more overlays included in the media data stream, and then determines at least one operation function that each overlay has when the media data stream is encapsulated, so that for any two or two In the above overlay, if any two or more overlays have at least one of the same operation functions, the server may set the group identification information of each of them to be the same. For example, if the operation function corresponding to overlay1 and overlay2 is rotation, the server may use the first information / third information corresponding to overlay1 and overlay2 to indicate group 1.
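The grouping rule described above can be illustrated with a minimal sketch. It assumes each overlay is described only by a set of operation function names; the function name `assign_groups`, the operation names, and the sequential group IDs are hypothetical and are not taken from the application.

```python
from collections import defaultdict

def assign_groups(overlay_ops):
    """Create one group per operation function shared by two or more overlays.

    overlay_ops: dict mapping overlay name -> set of operation function names.
    Returns a dict mapping group ID -> (operation, sorted member overlay names).
    """
    by_op = defaultdict(list)
    for overlay, ops in overlay_ops.items():
        for op in ops:
            by_op[op].append(overlay)

    groups = {}
    next_id = 1
    for op in sorted(by_op):          # sorted so that group IDs are deterministic
        members = sorted(by_op[op])
        if len(members) >= 2:         # a shared function needs at least two overlays
            groups[next_id] = (op, members)
            next_id += 1
    return groups

# Hypothetical overlays: overlay1 and overlay2 share "rotate".
overlay_ops = {
    "overlay1": {"rotate", "resize"},
    "overlay2": {"rotate"},
    "overlay3": {"change_depth"},
}
groups = assign_groups(overlay_ops)
print(groups)
```

Here overlay1 and overlay2 end up in group 1 because both support rotation, while overlay3 shares no function with another overlay and stays ungrouped.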
In addition, the overlay control structure also defines a control structure for an area (for example, a spherical area) associated with the overlay, which indicates that when the area in the video image is triggered, display of the overlay associated with the area can be triggered. As an example, take the control structure of the area associated with the overlay to be the overlay associated area control structure (AssociatedSphereRegionStruct). The syntax of AssociatedSphereRegionStruct is shown in Table 5 below:
Table 5
Figure PCTCN2019108514-appb-000007
SphereRegionStruct(1) in Table 5 defines a spherical area associated with the overlay.
When the spherical area defined in Table 5 appears within the user's field of view, then depending on the client configuration or a user interface prompt, the user can click the spherical area to trigger display or closing of the overlay associated with that spherical area.
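As a rough illustration of the click-to-trigger behaviour, the sketch below tests whether a clicked direction falls inside a spherical area and toggles the associated overlay. It is a simplification under stated assumptions: the region is treated as an azimuth/elevation rectangle in degrees, and the field names (`centre_azimuth`, `azimuth_range`, and so on), the `Overlay` class, and `on_click` are hypothetical, since the syntax of Table 5 is not reproduced in the text.

```python
from dataclasses import dataclass

@dataclass
class SphereRegion:
    centre_azimuth: float    # degrees (hypothetical field)
    centre_elevation: float  # degrees (hypothetical field)
    azimuth_range: float     # full width in degrees
    elevation_range: float   # full height in degrees

@dataclass
class Overlay:
    name: str
    shown: bool = False

def contains(region, azimuth, elevation):
    """Return True if the direction (azimuth, elevation) lies inside the region."""
    # Wrap the azimuth difference into [-180, 180) so regions may cross +/-180.
    d_az = (azimuth - region.centre_azimuth + 180.0) % 360.0 - 180.0
    d_el = elevation - region.centre_elevation
    return (abs(d_az) <= region.azimuth_range / 2
            and abs(d_el) <= region.elevation_range / 2)

def on_click(overlay, region, azimuth, elevation):
    """Toggle the overlay between shown and closed when the click hits its region."""
    if contains(region, azimuth, elevation):
        overlay.shown = not overlay.shown
    return overlay.shown

region = SphereRegion(centre_azimuth=170.0, centre_elevation=0.0,
                      azimuth_range=40.0, elevation_range=20.0)
overlay = Overlay("overlay1")
print(on_click(overlay, region, -175.0, 5.0))   # click inside the region
print(on_click(overlay, region, 100.0, 0.0))    # click outside leaves the state as is
```

A real client would also take the region's tilt and shape type into account; the rectangle test is only meant to show where the trigger decision sits.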
It should be understood that the area of the overlay may refer to the area exactly covered or occupied by the overlay; that is, all media data within the area of the overlay belongs to the overlay, and all media data of the overlay lies within the area of the overlay.
The area spatial information of the overlay may also be called the area spatial information of the area of the overlay. It indicates the spatial range or spatial position of the area associated with the overlay. In this way, when the user is watching a video image, the user can trigger the area so that the overlay associated with the area is displayed in the video image.
The spatial position of the area associated with the overlay may be expressed relative to a coordinate system, which may be three-dimensional or two-dimensional. For example, when a three-dimensional coordinate system is used to represent the spatial position of the area associated with the overlay, the origin of the coordinate system may be the center point of the panoramic video image, the point at its upper left corner, or another fixed point in the panoramic video image. In addition, the spatial position of the area associated with the overlay may also be the position of the overlay within the panoramic video image area (in which case a coordinate system other than a three-dimensional coordinate system, such as a spherical coordinate system, may be used to represent the spatial position of the area associated with the overlay).
The embodiments of the present application may be applied to the following scenario 1 and scenario 2:
Scenario 1: One or more overlays need to be displayed together or closed together when triggered. The one or more overlays can then be added to a group (the group is called a common display group below as an example; it can be understood that the group may also have another name). In this case, each of the one or more overlays corresponds to the second information and the third information.
Scenario 2: Based on the interactive operations defined in the OverlayInteraction control structure of one or more overlays, the one or more overlays that have a certain type of interactive operation can be added to a group (a group used for interactive operations may be called an interaction group; it should be understood that such a group may also have another name). The terminal can then, based on a trigger operation on the interaction group, cause all overlays in the interaction group to perform a certain type of defined operation function, for example an interactive operation. Exemplarily, the interactive operations may be as shown in Table 4, and details are not described herein again. In this case, each of the one or more overlays corresponds to the first information.
In a first possible implementation, step 102 in the embodiments of the present application may be implemented as follows: S1. The server encodes the media data to obtain a media data code stream corresponding to the media data. S2. The server encapsulates the media data code stream obtained by encoding. The encapsulated media data code stream includes information of one or more overlays, together with either the first information corresponding to each of the one or more overlays, or the second information and third information corresponding to each overlay. Each of the one or more overlays corresponds to a file format.
It should be understood that in the following embodiments the overlay may be part of the background-layer media content (that is, the media data), in which case the overlay need not be encoded separately. That is, when the server encodes the media data, the resulting media data stream includes one or more overlays. The server may then encapsulate the media data stream that includes the one or more overlays, for example so that the encapsulated media data stream corresponds to a file description, or so that the one or more overlays included in the media data stream have an overlay structure.
In the following embodiments the overlay may also be encoded separately as media content. In this case the server encodes the media data to obtain a media data stream, and then encodes the overlays included in the media data to obtain overlay code streams. When the server encapsulates the media data stream and the overlay code streams, the encapsulated media data stream carries overlay information, which may be an overlay structure.
For scenario 1, the second information and the third information may be carried in the file format of the overlay obtained by encapsulating the media data stream that includes the overlay.
The file format includes an overlay structure, and an overlay associated area control structure and an overlay group box located in the overlay structure. The third information is located in the overlay group box, and the second information is located in the overlay associated area control structure. In this case, the second information can be understood to be the overlay associated area control structure itself. For the operation function of the overlay associated area control structure, refer to the description above. Exemplarily, the operation function indicated by the second information may be display or closing.
For scenario 2, it is sufficient to carry the first information in the file format of the overlay. That is, the file format of the one or more overlays obtained after encapsulation need not have an overlay associated area control structure.
The file format includes an overlay group box, and the first information is located in the overlay group box. It should be understood that in scenario 2 the file format may also include an overlay structure.
Exemplarily, the server may encapsulate the media data stream that includes one or more overlays according to the OMAF standard file format. In the encapsulated file, the file format of each of the one or more overlays has an overlay control structure. In addition, for scenario 1 the overlay structure may include an overlay associated area control structure, and for scenario 2 the overlay structure may include an OverlayInteraction control structure.
Specifically, during encapsulation the server may add the box corresponding to the overlay associated area control structure to the file format, so that the file format of the overlay has an overlay control structure.
In addition, OMAF defines entity groups for multiple overlays. It defines a group whose members can be selectively switched (for example, a switching group); the overlays in such a switching group can be switched with one another. The syntax of the switching group is shown in Table 6 below:
Table 6
Figure PCTCN2019108514-appb-000008
Here, ref_overlay_id[i] represents the identification information of another overlay that belongs to the same group as a given overlay.
In addition, for scenario 2, the server may, during encapsulation, define in the overlay structure of each overlay the identification information of the other overlays that belong to the same group as that overlay, as a replacement for the group identification information described above.
The identification information of an overlay is used to identify the overlay, and may be the ID number of the overlay.
It should be noted that if a group contains only one overlay, identification information of other overlays need not be defined in that overlay.
For example, overlay1 and overlay2 both belong to group 1. Therefore, the identification information of group 1 and the identification information of overlay2 can be defined in overlay1, and the identification information of group 1 and the identification information of overlay1 can be defined in overlay2.
By defining the group to which an overlay belongs, when the operation function corresponding to an overlay is performed on any overlay in the group, all overlays in the group respond to that operation function.
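The overlay1/overlay2 example can be sketched as plain records on the terminal side. The dictionary layout and the helper names `group_members` and `trigger_show` are hypothetical; only `group_id` and the list of referenced overlay IDs mirror what the text says each overlay may carry.

```python
# Each overlay carries its group ID and the IDs of the other overlays in that group.
overlays = {
    "overlay1": {"group_id": 1, "ref_overlay_ids": ["overlay2"], "shown": False},
    "overlay2": {"group_id": 1, "ref_overlay_ids": ["overlay1"], "shown": False},
    "overlay3": {"group_id": 2, "ref_overlay_ids": [], "shown": False},
}

def group_members(overlays, name):
    """Return all overlays (including `name`) that share its group ID."""
    gid = overlays[name]["group_id"]
    return sorted(n for n, o in overlays.items() if o["group_id"] == gid)

def trigger_show(overlays, name):
    """Showing one overlay makes every overlay in its group respond."""
    for member in group_members(overlays, name):
        overlays[member]["shown"] = True

trigger_show(overlays, "overlay1")
print({n: o["shown"] for n, o in overlays.items()})
```

Triggering overlay1 also shows overlay2, while overlay3 in group 2 is left untouched.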
In the embodiments of the present application, the file format corresponding to each overlay may include entity groups and an overlay structure. The entity groups are represented by the entity group box EntityToGroupBox.
In the embodiments of the present application, the file format further includes an overlay group box. The overlay group box indicates that when the operation function corresponding to an overlay is performed on any overlay in the overlay group box, all overlays in the overlay group box respond to that operation function.
The overlay group box may be named after the function of the overlay control structure in the overlay group box. For example, if the overlay control structure is an overlay associated area control structure, the overlay group box may be defined as OverlayConditionalShownGroupBox, indicating that a group of overlays is displayed or closed together when the user triggers display of any one of them. As another example, for interactive operations, the overlay group box that has an OverlayInteraction control structure may be OverlayRelationGroupBox.
It should be understood that OverlayConditionalShownGroupBox and OverlayRelationGroupBox may also have other names, which is not limited in the embodiments of the present application.
Exemplarily, the overlay group box may be located in the entity groups.
Exemplarily, if all overlays in a group have an overlay associated area control structure, the group may be named a common display group. The common display group indicates that the overlays in it can be displayed or closed together. Alternatively, the group may correspond to first prompt information, which indicates that the overlays in the group can be displayed or closed together. If interactive operations are performed on all overlays in a group, the group may be named an interaction group. Alternatively, the group may correspond to second prompt information, which indicates that the overlays in the group can jointly perform the interactive operation indicated by the operation type of the group. It can be understood that these are only examples, and the group may also have another name, which is not limited in the embodiments of the present application.
Example 1-1: Taking the common display group as an example, the overlay group box in the embodiments of the present application may be OverlayConditionalShownGroupBox, indicating that a group of overlays is displayed together when the user triggers display of any overlay in the group or of the group itself. In this case, the third information may be carried in OverlayConditionalShownGroupBox.
When the user closes the group or any overlay in the group, the overlays are closed together. The syntax is shown in Table 7:
Table 7
Figure PCTCN2019108514-appb-000009
In Table 7, ref_overlay_id[i] indicates that the overlay_id corresponding to the track or image item identified by the i-th entity_id is an overlay in this group that can be displayed when triggered by the user. An overlay_id equal to ref_overlay_id[i] is present in the referenced i-th track or image item. When each track or image item identified by an entity_id in this entity group contains only one overlay, the ref_overlay_id[i] syntax element is allowed to be absent from the structure.
OverlayConditionalShownGroupBox is used to add multiple overlays to the same group.
Example 2-1: Taking the interaction group as an example, the file format of each of the one or more overlays obtained by the server after processing the media data contains the group identification information of that overlay.
The overlay group box in the embodiments of the present application may be OverlayRelationGroupBox, which combines multiple overlays into an interaction group; all overlays in the interaction group can have the same interactive operation. In this case, the first information is carried in OverlayRelationGroupBox.
It should be noted that an interaction group specifies that the interactive operation indicated by a certain type of operation function can be performed on all overlays in the interaction group. When the operation function corresponding to OverlayRelationGroupBox is performed on any overlay in OverlayRelationGroupBox, the other overlays in OverlayRelationGroupBox also respond to that operation function. The syntax is shown in Table 8:
Table 8
Figure PCTCN2019108514-appb-000010
Optionally, regarding the interaction information syntax elements contained in the OverlayInteraction control structure: when multiple overlays form an interaction group in OverlayRelationGroupBox and any overlay in the interaction group is triggered, the operation functions defined in the OverlayInteraction control structure are applied jointly to every overlay in the interaction group.
In addition, for the operation function defined by OverlayRelationGroupBox, the corresponding syntax elements of the OverlayInteraction control structure have the same value for every overlay in the interaction group.
For example, suppose OverlayRelationGroupBox defines a scaling operation function for all overlays in that OverlayRelationGroupBox.
In this case, if overlayA, overlayB, and overlayC in OverlayRelationGroupBox form interaction group 1, then resize_flag = 1 in the OverlayInteraction control structure of each of overlayA, overlayB, and overlayC. When interaction group 1 is triggered, the scaling operation is performed on overlayA, overlayB, and overlayC. Similarly, when OverlayRelationGroupBox defines a position-change operation function for all overlays in that OverlayRelationGroupBox, change_position_flag = 1 in the OverlayInteraction control structure of each of overlayA, overlayB, and overlayC. When the group containing overlayA, overlayB, and overlayC is triggered, the position-change operation is performed on overlayA, overlayB, and overlayC.
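A minimal sketch of the resize case follows, assuming each overlay is a record that exposes its `resize_flag` alongside a width and height. The record layout, the sizes, and the function `trigger_resize` are hypothetical; only the flag name and the overlay names come from the text.

```python
def trigger_resize(group, factor):
    """Scale every overlay in an interaction group by `factor`.

    The group is only valid for this operation if every member has
    resize_flag == 1, as required for an interaction group.
    """
    missing = [o["name"] for o in group if o["resize_flag"] != 1]
    if missing:
        raise ValueError(f"overlays without resize support: {missing}")
    for o in group:
        o["width"] *= factor
        o["height"] *= factor

# Interaction group 1 from the example, with made-up sizes.
interaction_group_1 = [
    {"name": "overlayA", "resize_flag": 1, "width": 100, "height": 50},
    {"name": "overlayB", "resize_flag": 1, "width": 200, "height": 100},
    {"name": "overlayC", "resize_flag": 1, "width": 40, "height": 40},
]
trigger_resize(interaction_group_1, 2)
print([(o["name"], o["width"], o["height"]) for o in interaction_group_1])
```

The same pattern would apply to a position change by checking `change_position_flag` instead.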
It should be noted that all overlays of the same interaction group perform the same operation function.
Exemplarily, an overlay in the embodiments of the present application can be displayed together with the background video; the overlay can then be bound to the background video for joint display. For example, the syntax structure for displaying an overlay together with the background video is shown in Table 9:
Table 9
Figure PCTCN2019108514-appb-000011
In a second possible implementation, step 102 in the embodiments of the present application may be implemented as follows: S3. The server encodes the media data to obtain a media data stream, and encodes the one or more overlays included in the media data to obtain an overlay code stream corresponding to each overlay, where each overlay code stream includes an SEI.
S4. The server encapsulates the media data stream and the overlay code stream corresponding to each overlay to obtain a media data stream that includes information of the one or more overlays.
It should be noted that when an overlay can be encoded separately as media content, the server may also encapsulate the overlay code stream separately and then send the encapsulated overlay code stream to the terminal.
The payload type of the SEI indicates that the SEI carries the group identification information of the overlay.
For scenario 1, the third information may be carried as an indication field in the SEI of the overlay code stream. Alternatively, the first information may be used as an SEI of the overlay code stream, in which case the SEI has an indication field that indicates the group identification information of the overlay.
It should be understood that when the server performs S4, the encapsulated overlay may also include an overlay associated area control structure. For the encapsulation process, refer to S2 above; details are not repeated here.
For example, the third information is carried in the SEI of the overlay code stream corresponding to the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
It can also be understood that in the second implementation, for scenario 1, the group identification information carried in the SEI may name the group as a common group. The second information then need not be carried; that is, the overlay associated area control structure need not be defined in the overlay structure of the overlay.
For the manner of carrying the third information, refer to the description in the first possible implementation; details are not repeated here.
For scenario 2, the first information may be carried as an indication field in the SEI of the overlay code stream corresponding to the overlay. In this case, the encapsulated overlay need not have an overlay associated area control structure. So that the terminal can learn the operation function of the overlay after receiving it, when the SEI carries the group identification information, the group may be named after the operation function corresponding to each overlay in the group.
For example, the first information is carried in the SEI of the overlay code stream corresponding to the overlay.
The SEI indicates the group identification information of the overlay. For example, the syntax structure of the SEI is shown in Table 10:
Table 10
Figure PCTCN2019108514-appb-000012
In Table 10, sei_payload defines the SEI payload information and takes two parameters, payloadType and payloadSize. payloadType indicates the type of the SEI, and payloadSize indicates the size of the SEI.
OLG in Table 10 is a variable that represents a payloadType value of an SEI. For example, the value of OLG may be 190. This description applies wherever OLG is mentioned below and is not repeated. The embodiments of the present application do not limit the specific value of OLG. payloadSize indicates the payload size.
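To make the role of payloadType and payloadSize concrete, the sketch below writes and reads an SEI message header using the byte-wise coding that AVC and HEVC use for these two values (a run of 0xFF bytes, each worth 255, followed by a final byte). The payload body is an assumption: a single byte holding a group ID is used purely for illustration, since the actual syntax of Table 10 is not reproduced in the text. NAL unit headers and emulation prevention bytes are left out.

```python
OLG = 190  # example payloadType value given in the text

def encode_value(value):
    """Code payloadType or payloadSize as 0xFF bytes plus one final byte."""
    out = bytearray()
    while value >= 255:
        out.append(0xFF)
        value -= 255
    out.append(value)
    return bytes(out)

def build_sei_message(payload_type, payload):
    """Concatenate coded payloadType, coded payloadSize, and the payload bytes."""
    return encode_value(payload_type) + encode_value(len(payload)) + payload

def parse_sei_message(data):
    """Return (payloadType, payload bytes) for one SEI message."""
    pos = 0

    def read_value():
        nonlocal pos
        value = 0
        while data[pos] == 0xFF:
            value += 255
            pos += 1
        value += data[pos]
        pos += 1
        return value

    payload_type = read_value()
    payload_size = read_value()
    return payload_type, data[pos:pos + payload_size]

# Hypothetical payload: one byte carrying group ID 1.
message = build_sei_message(OLG, bytes([1]))
payload_type, payload = parse_sei_message(message)
if payload_type == OLG:
    print("overlay group id:", payload[0])
```

A terminal would dispatch on `payload_type` in this way to decide whether an SEI message carries overlay group information.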
Example 3-1: Taking display or closing as the operation function, the group of an overlay may be expressed as an overlay conditional shown group (overlay_conditional_shown_group).
Therefore, overlay_conditional_shown_group_info may be used in place of the group identification information of the overlay in Table 10. For example, the syntax structure of overlay_conditional_shown_group_info is shown in Table 11:
Table 11
Figure PCTCN2019108514-appb-000013
overlay_conditional_shown_group_id indicates the ID number of the group to which the overlay belongs.
示例4-1,以操作功能为交互操作为例,则overlay的群组可以为overlay_relation_group,可以使用overlay_relation_group_info替换表10中的overlay的群组标识信息。Example 4-1, taking the operation function as an interactive operation, the overlay group can be overlay_relation_group, and overlay_relation_group_info can be used to replace the overlay group identification information in Table 10.
上述交互操作可以指针对某一类操作功能进行共同操作,也可以针对overlay所支持的所有操作功能进行共同操作,本申请实施例中不做限定。The above interactive operations may refer to a common operation on a certain type of operation function, or a common operation on all operation functions supported by the overlay, which are not limited in the embodiments of the present application.
第三种可能的实现方式,本申请实施例中的步骤S102具体可以通过以下方式实现:S5、对媒体数据编码得到包括一个或者多个overlay的媒体数据流。S6、服务器封装包括一个或者多个overlay的媒体数据流,得到媒体数据流对应的描述文件。A third possible implementation manner, step S102 in the embodiment of the present application may be specifically implemented in the following manner: S5. Encode the media data to obtain a media data stream including one or more overlays. S6. The server encapsulates a media data stream including one or more overlays, and obtains a description file corresponding to the media data stream.
具体的,S5的实现可以参考S1,此处不再赘述。Specifically, for implementation of S5, reference may be made to S1, and details are not described herein again.
示例性的,S6具体可以通过以下方式实现:服务器可以基于DASH传输协议标准 对包括一个或者多个overlay的媒体数据流进行封装,以得到媒体数据流的媒体呈现描述MPD作为描述文件。在MPD的adaptation set level或者representation level的overlay描述字中携带overlay所在群组标识信息。Exemplarily, S6 may be specifically implemented in the following manner: The server may encapsulate a media data stream including one or more overlays based on a DASH transmission protocol standard to obtain a media presentation description MPD of the media data stream as a description file. The overlay descriptor of the MPD's adaptation, level, or representation level carries the group identification information of the overlay.
Corresponding to scenario 1, each of the one or more overlays corresponds to second information and third information. In this case, the description file includes at least the third information of each of the one or more overlays, and the third information may be carried as an indication field in the description file of the media data stream. It should be understood that, after the media data stream is encapsulated, the one or more overlays included in the media data stream have an overlay associated area control structure. For the specific encapsulation process, refer to S2 above; details are not repeated here.
For example, the third information is carried in the MPD corresponding to the media data stream obtained by encapsulating the media data stream including the one or more overlays, and the second information is carried in the overlay associated area control structure of the overlay.
In the third possible implementation, the approach for scenario 1 may alternatively be replaced as follows: in the description file of the media data containing the overlay, the overlay group is named after the operation function of the overlay. In this case, the encapsulated overlay need not have an overlay associated area control structure.
For the manner of carrying the second information, refer to the description of scenario 1 in the second possible implementation; details are not repeated here.
Corresponding to scenario 2, each of the one or more overlays corresponds to first information. In this case, the first information may be carried as an indication field in the description file of the media data containing the overlay. It can be understood that the group may also be named after the operation function of the bitstream.
For example, the first information is carried in the media presentation description (MPD) of the media data obtained by encapsulating the media data containing the overlay.
In summary, the first information or the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
Example 5-1: taking the operation function of display or close as an example, a new @schemeIdUri may be defined for the overlay descriptor, with the value "urn:mpeg:mpegI:omaf:2018:ocsg", whose semantics are the overlay common display grouping information (OCSG) descriptor. At most one OCSG descriptor is allowed at the adaptation set level or the representation level.
The value of the OCSG descriptor is a comma-separated string. Its specific values and semantics are defined in Table 12 below:
Table 12
Figure PCTCN2019108514-appb-000014
Here, M denotes a mandatory parameter and O denotes an optional parameter. Adaptation sets with the same overlay_relation_group_id value belong to the same interaction group, and the value may differ between adaptation sets that belong to different groups.
For example, Table 13 shows an MPD that carries an indication that the group to which an overlay belongs is a common display group:
Table 13
Figure PCTCN2019108514-appb-000015
Figure PCTCN2019108514-appb-000016
Table 13 describes two common display groups, namely "urn:mpeg:mpegI:omaf:2018:ocsg" value="1" (referred to as common display group 1) and "urn:mpeg:mpegI:omaf:2018:ocsg" value="2" (referred to as common display group 2). Each of the two common display groups shown in Table 13 includes two overlays. For example, common display group 1 includes overlay1 and overlay2, and common display group 2 includes overlay3 and overlay4.
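As a rough illustration of how a client might read such a grouping from an MPD, the following Python sketch collects adaptation sets by the value of their OCSG descriptor. The MPD fragment below is hypothetical and is not a reproduction of Table 13: carrying the descriptor in a `SupplementalProperty` element, the adaptation set `id` values, and the assumption that the first comma-separated field of the descriptor value is the group identifier are all illustrative choices, not statements of the application.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

OCSG_SCHEME = "urn:mpeg:mpegI:omaf:2018:ocsg"
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical MPD fragment: adaptation sets 1-4 stand for overlay1-overlay4.
# The choice of SupplementalProperty as the carrier element is an assumption.
MPD_XML = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet id="1">
      <SupplementalProperty schemeIdUri="urn:mpeg:mpegI:omaf:2018:ocsg" value="1"/>
    </AdaptationSet>
    <AdaptationSet id="2">
      <SupplementalProperty schemeIdUri="urn:mpeg:mpegI:omaf:2018:ocsg" value="1"/>
    </AdaptationSet>
    <AdaptationSet id="3">
      <SupplementalProperty schemeIdUri="urn:mpeg:mpegI:omaf:2018:ocsg" value="2"/>
    </AdaptationSet>
    <AdaptationSet id="4">
      <SupplementalProperty schemeIdUri="urn:mpeg:mpegI:omaf:2018:ocsg" value="2"/>
    </AdaptationSet>
  </Period>
</MPD>
"""


def group_by_ocsg(mpd_xml):
    """Map each OCSG group id to the ids of the adaptation sets carrying it."""
    root = ET.fromstring(mpd_xml.strip())
    groups = defaultdict(list)
    for aset in root.iterfind(".//mpd:AdaptationSet", NS):
        for prop in aset.iterfind("mpd:SupplementalProperty", NS):
            if prop.get("schemeIdUri") == OCSG_SCHEME:
                # Assumed layout: the group id is the first comma-separated field.
                group_id = prop.get("value", "").split(",")[0].strip()
                groups[group_id].append(aset.get("id"))
    return dict(groups)


ocsg_groups = group_by_ocsg(MPD_XML)
```

With this fragment, adaptation sets 1 and 2 end up in group "1" and adaptation sets 3 and 4 in group "2", mirroring the two common display groups described above.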
Example 6-1: take the case where the group to which the overlay belongs is a common interaction group. The server may define a new @schemeIdUri for the overlay descriptor, with the value "urn:mpeg:mpegI:omaf:2018:ovly", whose semantics are the overlay common interaction grouping information (OVLY) descriptor. The descriptor indicates that a certain type of interactive operation, such as moving the position of the overlay, is performed on the group to which the overlay belongs. At most one OVLY descriptor is allowed at the adaptation set level or the representation level.
The value of the OVLY descriptor is a comma-separated string. Its specific values and semantics are defined in Table 14 below:
Table 14
Figure PCTCN2019108514-appb-000017
It should be understood that adaptation sets with the same overlay_relation_group_id value belong to the same common interaction group, and the value must differ between adaptation sets that belong to different groups.
For example, Table 15 shows an MPD that carries an indication that the group to which an overlay belongs is a common interaction group:
Table 15
Figure PCTCN2019108514-appb-000018
Figure PCTCN2019108514-appb-000019
Table 15 describes two common interaction groups, namely "urn:mpeg:mpegI:omaf:2018:ovly" value="1" (referred to as common interaction group 1) and "urn:mpeg:mpegI:omaf:2018:ovly" value="2" (referred to as common interaction group 2). Each of the two common interaction groups shown in Table 15 includes two overlays. For example, common interaction group 1 includes overlay1 and overlay2, and common interaction group 2 includes overlay3 and overlay4.
It should be understood that, after determining the operation function of each overlay, the server may determine that one or more overlays having the same operation function belong to the same group. In other words, if two or more overlays have the same operation function, those overlays may be placed in the same group, and the group may be named after the operation shared by the two or more overlays. In this case, the group may correspond to one operation option, which is used to indicate the operation function shared by the overlays in the group.
For example, if overlay1 corresponds to rotation and shrinking and overlay2 corresponds to rotation, the server may carry identification information indicating group 1 in overlay1 and overlay2. It should be understood that the operation option corresponding to group 1 is used to indicate the operation function shared by overlay1 and overlay2.
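The shared operation function of a group can be viewed as the intersection of the operation functions of its members. The following Python sketch illustrates this for the overlay1/overlay2 example; the function name and the string labels for the operations are hypothetical and chosen only for illustration.

```python
def common_operations(overlay_ops):
    """Return the operation functions shared by every overlay in the mapping."""
    op_sets = [set(ops) for ops in overlay_ops.values()]
    # An empty group has no shared operation function.
    return set.intersection(*op_sets) if op_sets else set()


# overlay1 supports rotation and shrinking, overlay2 supports rotation only.
group1_ops = {"overlay1": {"rotate", "shrink"}, "overlay2": {"rotate"}}
shared_ops = common_operations(group1_ops)
```

Here the only operation function shared by both overlays is rotation, so that is what the operation option of group 1 would indicate.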
It should be understood that the server may determine the operation function of each overlay from that overlay's own overlay control structure.
Step 103: The server sends the one or more overlays to the terminal.
For example, the server may send the one or more overlays to the terminal through a transmitting apparatus.
It should be noted that, in the embodiments of this application, after processing the one or more overlays, the server may send the processed overlays directly to the terminal. Alternatively, the server may send the processed overlays after receiving a request message from the terminal requesting the overlays.
It should be understood that, in the first and second possible implementations described above, the first information, the second information, and the third information are included in the overlay. In the third possible implementation, the first information, the second information, and the third information are included in the MPD file. When the server processes the media data using the third possible implementation, the sending of the one or more overlays by the server in S103 can be understood to also include sending the MPD corresponding to the one or more overlays to the terminal. The MPD corresponding to the one or more overlays includes the information of each overlay.
It should be noted that, in the embodiments of this application, when the first information or the third information is carried in SEI, it may be carried in the SEI of the overlay bitstream corresponding to the overlay. If the server sends an overlay bitstream to the terminal, the terminal can display the overlay when decoding and playing the overlay bitstream.
Step 104: The terminal receives the one or more overlays sent by the server.
For example, the terminal may receive, through a receiving apparatus, the one or more overlays sent by the server. It should be understood that the server may send the one or more overlays in either of the following ways: the server sends to the terminal the encapsulated media data stream and the one or more overlays included in the media data stream; or the server sends to the terminal the encapsulated overlay bitstream corresponding to each of the one or more overlays.
It should be understood that, if the server carries the first information in the MPD of the media data stream including the one or more overlays, or carries the second information and the third information in that MPD, then in step 104 the terminal also needs to receive the MPD of the media data stream including the one or more overlays.
Specifically, when the first information is located in the overlay group box, the terminal can obtain the group identification information from the overlay group box in the file format by parsing the file format of the overlay. When the third information is located in the overlay group box, the terminal can likewise obtain the group identification information from the overlay group box in the file format by parsing the file format of the overlay.
Step 105: When the overlay corresponds to the first information, the terminal processes at least two overlays according to the first information of the at least two overlays; or, when the overlay corresponds to the second information and the third information, the terminal processes the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
Specifically, S105 may be implemented as follows: after receiving the one or more overlays sent by the server, the terminal decapsulates the overlays to obtain the first information corresponding to each of the one or more overlays, or to obtain the second information and the third information corresponding to each of the one or more overlays. Then, when decoding and playing the media data, the terminal may include, in the client configuration or in a user interface prompt, one operation option for the overlays in the same group, used to indicate the operation function that all overlays in the group can perform together.
It should be understood that, when each overlay corresponds to the first information, the terminal may determine the group of each overlay according to the first information corresponding to that overlay, and may then determine all overlays belonging to the same group. In addition, for interactive operations, the server may determine the interactive operation corresponding to each overlay according to the OverlayInteraction control structure of that overlay.
When each overlay corresponds to the third information and the second information, the terminal may determine the group of each overlay according to the third information corresponding to that overlay, and may then determine all overlays belonging to the same group. The terminal may determine, according to the AssociatedSphereRegionStruct of each overlay, that the overlay corresponds to the display or close operation function.
In one example, the terminal may determine all overlays belonging to the same group as follows: according to the group identification information corresponding to each overlay, the terminal places overlays with the same group identification information in the same group.
For example, the group identification information of overlay1 is group 1, that of overlay2 is group 2, that of overlay3 is group 1, and that of overlay4 is group 2. The terminal can then determine that there are two groups, namely group 1 and group 2.
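The grouping by group identification information described above can be sketched in Python as follows. The function name and the use of integers for group identifiers are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_id(overlay_group_ids):
    """Collect overlays that carry the same group identification information."""
    groups = defaultdict(list)
    for overlay, group_id in overlay_group_ids.items():
        groups[group_id].append(overlay)
    return dict(groups)


# The assignment from the example: overlay1 and overlay3 in group 1,
# overlay2 and overlay4 in group 2.
groups_by_id = group_by_id(
    {"overlay1": 1, "overlay2": 2, "overlay3": 1, "overlay4": 2}
)
```

The result contains exactly two groups: group 1 with overlay1 and overlay3, and group 2 with overlay2 and overlay4.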
In another example, the terminal may determine all overlays belonging to the same group as follows: according to the identification information of the other overlays that corresponds to any given overlay, the terminal places that overlay, together with the other overlays indicated by that identification information, in the same group.
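When each overlay lists the identifiers of other overlays rather than a group identifier, the groups can be recovered by merging every overlay with the overlays it references. The Python sketch below uses a small union-find structure for this; the input mapping stands in for the per-overlay reference lists (for instance, ref_overlay_id values), and the overlay names and the choice of union-find are illustrative assumptions rather than something the application prescribes.

```python
def group_by_references(refs):
    """Merge each overlay with the overlays it references into one group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for overlay, others in refs.items():
        find(overlay)  # register overlays that reference nothing
        for other in others:
            union(overlay, other)

    groups = {}
    for overlay in list(parent):
        groups.setdefault(find(overlay), set()).add(overlay)
    # Sorted output makes the result independent of traversal order.
    return sorted(sorted(members) for members in groups.values())


ref_groups = group_by_references(
    {
        "overlay1": ["overlay2"],
        "overlay2": ["overlay1"],
        "overlay3": ["overlay4"],
        "overlay4": [],
    }
)
```

In this example overlay4 lists no references of its own but is still grouped with overlay3, because overlay3 references it.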
The embodiments of this application provide a method for processing media data, in which the terminal, according to the first information corresponding to each of at least two overlays, can process one or more overlays that have the same group identification information together. Compared with the prior art, in which the overlays can only be processed one by one when the same processing is applied to one or more overlays, this reduces the complexity of the operation and improves the user's subjective experience.
As a possible embodiment, as shown in FIG. 4, the method provided in the embodiments of this application further includes:
Step 106: The terminal displays at least one group, information indicating the operation function corresponding to each of the at least one group, and the overlays belonging to each group.
In one possible implementation, the at least one group is determined by the first information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay structure included in the overlays in that group.
In another possible implementation, the at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay associated area control structure included in the overlays in that group.
It should be understood that, while displaying the at least one group, the terminal may also decode and play the received media data stream to display the media data. The at least one group may be displayed overlaid on the media data.
Step 107: When any one of the at least one group is triggered, all overlays belonging to that group respond to the operation function corresponding to the group. Alternatively, when any overlay is triggered, that overlay and the other overlays belonging to the same group as that overlay respond to the operation function triggered on that overlay.
It should be understood that the operation function corresponding to a group is determined by the operation functions shared by all overlays in the group.
If two or more overlays in a group have multiple shared operation functions, the group may correspond to multiple operation functions. When the group is triggered, all overlays in the group respond to the operation function triggered for the group. It should be understood that, if all of the operation functions corresponding to the group are triggered, all overlays in the group respond to all of those operation functions. If any one of the operation functions corresponding to the group is triggered, all overlays in the group respond to that triggered operation function.
For example, overlay1 and overlay2 belong to group 1, and the operation functions corresponding to overlay1 and overlay2 are rotation and size scaling. The operation functions corresponding to group 1 are then also rotation and size scaling. If both rotation and size scaling are triggered, overlay1 and overlay2 respond to both the rotation and the size scaling operations. If the triggered operation function is rotation, overlay1 and overlay2 respond to the rotation operation.
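The behaviour in this example can be sketched in Python as follows: the operation functions of the group are taken as those shared by all of its members, and every member responds to whichever of them is triggered. The function name, the operation labels, and the decision to ignore a triggered operation that is not shared by the group are assumptions made for illustration.

```python
def respond(group_members, overlay_ops, triggered):
    """Return, per overlay, the triggered operations that the whole group performs."""
    if not group_members:
        return {}
    # Operation functions of the group: those shared by all of its overlays.
    shared = set.intersection(*(set(overlay_ops[m]) for m in group_members))
    applied = sorted(shared & set(triggered))
    return {member: applied for member in group_members}


ops = {"overlay1": {"rotate", "scale"}, "overlay2": {"rotate", "scale"}}
group1 = ["overlay1", "overlay2"]

both_triggered = respond(group1, ops, {"rotate", "scale"})
rotate_only = respond(group1, ops, {"rotate"})
```

When both operation functions are triggered, each overlay in group 1 responds to rotation and size scaling; when only rotation is triggered, each responds to rotation alone.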
For example, in the embodiments of this application, the terminal may assign one operation option to each group. The operation option is used to indicate the operation function to which all overlays in the group can respond.
It should be understood that, if a group corresponds to multiple operation functions, multiple operation options may be assigned to the group, with each operation option corresponding to one operation function. In addition, each group may be assigned an operation option 1 that indicates execution of all operation functions. When operation option 1 is triggered and the group has multiple operation functions, all overlays in the group respond to all of those operation functions.
If all overlays in the same group have multiple shared operation functions, each overlay may also be given multiple operation options. For the specific process, refer to the process described above in which multiple operation options are given to a group that corresponds to multiple operation functions; details are not repeated here.
It should be understood that, when the user has not triggered a group but the user's operation is positioned on the group, the user may be shown all overlays included in the group and the operation functions shared by all overlays in the group. For example, when the mouse pointer is over the operation option but no click has been triggered, the user may likewise be shown all overlays included in the group and the operation functions shared by all overlays in the group.
Example 1-2, corresponding to Example 1-1: as a possible implementation, step 105 may be implemented as follows:
The terminal parses each overlay to obtain the AssociatedSphereRegionStruct of each overlay. The server determines, according to the AssociatedSphereRegionStruct, that the operation function of each overlay is display or close. The terminal parses the entity group in the media data bitstream containing the one or more overlays and obtains the OverlayConditionalShownGroupBox.
Because the OverlayConditionalShownGroupBox can also be used to indicate the identification information of the other overlays that belong to the same common display group as a given overlay, the terminal can further obtain ref_overlay_id. The terminal can therefore determine that the operation function that can be performed jointly on that overlay and on the other overlays in the same common display group is display or close.
Here, as a possible implementation, step 106 may be implemented in the following manner (1-1):
Manner (1-1): when the terminal decodes the media data and plays the video on the display interface, the client configuration or a user interface prompt may include an operation option for triggering display or close for the common display group.
It should be understood that the media data may be displayed on the client or the user interface of the terminal.
Step 107 may be implemented in the following manner (1-2) or manner (1-3):
Manner (1-2): when any group is triggered, the terminal displays all overlays in that group. That is, all overlays in that group are shown on the display interface.
Manner (1-3): if multiple overlays in the first group are displayed on the display interface, then when the operation option corresponding to the first group is triggered, or when any one of the overlays is triggered, the terminal closes all overlays in the first group. That is, all overlays in the first group are no longer displayed.
It should be understood that all the overlays that are no longer displayed belong to the first group, as shown in FIG. 5.
It should be understood that the operation option may be displayed on the display interface in the form of an icon or text. When the operation option is displayed as an icon and the user's touch operation or click operation is positioned on or near the icon, text indicating the function corresponding to the operation option may be displayed on the display interface.
It should be understood that the operation option or the overlay may be triggered by a touch operation or a click operation.
For example, taking a common display group, as shown in FIG. 5, the client configuration or the user interface prompt of the terminal may include an operation option for triggering display or closing display of the overlays in the common display group. For example, in FIG. 5, group 1 corresponds to operation option 1 and group 2 corresponds to operation option 2. FIG. 5 takes the case where the operation options are displayed as text.
Specifically, if only the first group is displayed on the display interface and the overlays in the first group are not displayed, as in FIG. 5 for example, then when the operation option corresponding to the first group is triggered, the terminal performs the operation function on all overlays in the first group.
For example, taking common display, and taking the first group to be group 1 shown in FIG. 5, after operation option 1 of group 1 is triggered, the terminal shows all overlays in group 1 on the display interface. As shown in FIG. 6, taking the case where operation option 1 of group 1 is triggered by a touch operation, when operation option 1 of group 1 is triggered, the display interface shows the items included in group 1: name 1 corresponding to media content 1, name 2 corresponding to media content 2, and name 3 corresponding to media content 3.
It can be understood that, when all overlays in group 1 are shown on the display interface and the close operation function corresponding to the operation option of group 1 is triggered, the terminal closes all overlays included in group 1 and shows them on the display interface in the form of group 1. That is, the terminal responds to the close-display operation function, and the display interface may then be as shown in FIG. 5.
Specifically, when the display interface does not show the one or more overlays and a group includes the one or more overlays that can be shown, then when the operation option corresponding to the group is triggered, the terminal may respond to the user's trigger operation by showing the one or more overlays simultaneously on the background video. When the one or more overlays are shown on the display interface, then when the operation option corresponding to the group is triggered or any one of the overlays is triggered, the terminal may respond to the user's trigger operation by closing the one or more overlays simultaneously.
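The show/close behaviour of a common display group amounts to a two-state toggle: the collapsed state shows only the group entry (as in FIG. 5), and the expanded state shows every overlay of the group at once (as in FIG. 6). The Python sketch below models only this state change; the class name, its attributes, and the item labels are hypothetical, and no rendering is attempted.

```python
class DisplayGroup:
    """Toy model of a common display group that is shown or closed as a unit."""

    def __init__(self, name, overlays):
        self.name = name
        self.overlays = list(overlays)
        self.shown = False  # initially only the group entry is visible

    def visible_items(self):
        """What the display interface shows for this group in its current state."""
        return list(self.overlays) if self.shown else [self.name]

    def trigger(self):
        """Triggering the group's option (or any shown overlay) flips the state."""
        self.shown = not self.shown
        return self.visible_items()


display_group = DisplayGroup("group 1", ["name 1", "name 2", "name 3"])
collapsed = display_group.visible_items()
expanded = display_group.trigger()
collapsed_again = display_group.trigger()
```

One trigger expands group 1 into its three items, and a second trigger closes all of them together and returns to the collapsed view.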
Example 2-2, corresponding to Example 2-1: as a possible implementation, step 105 may be implemented as follows:
The terminal parses the overlay structure of each overlay to obtain the first information carried in that overlay structure, and then determines the group identification information of each overlay according to the first information. In addition, the terminal may parse the overlay structure of each overlay to obtain the operation function of that overlay.
It should be understood that, if the first information carries the identification information of the other overlays that belong to the same group as the overlay, the terminal can determine the other overlays that belong to the same group as the overlay.
For example, the terminal parses the entity group in the media data bitstream containing the one or more overlays. The terminal may obtain the OverlayRelationGroupBox from the entity group, and thereby obtain the group identification information of each overlay and/or ref_overlay_id. The terminal may also determine the operation function of the overlay according to the overlay structure, and may then determine the identification information of all overlays belonging to the same group, as well as the syntax elements in the OverlayInteraction control structure of all those overlays.
Here, as another possible implementation, step 106 may be implemented in the following manner (2-1):
Manner (2-1): when the terminal decodes and plays the media data, the client configuration or a user interface prompt may include an operation option for performing the corresponding interactive operation on the common interaction group.
For example, when the operation function corresponding to the OverlayRelationGroupBox semantics is triggered on any overlay of the first group, the terminal determines all overlays in the first group according to the ref_overlay_id corresponding to that overlay, or according to the identification information of all the overlays. The terminal then performs the shared operation function on all overlays in the first group.
For example, if the operation function corresponding to the OverlayRelationGroupBox semantics is size scaling, then when the operation option of the first group is triggered, all overlays in the first group respond to the size scaling operation. If all overlays in the first group are shown on the display interface and the size scaling function corresponding to any one overlay in the first group is triggered, then when the triggered overlay responds to the size scaling operation, the other overlays in the first group that were not triggered also respond to the size scaling operation.
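The propagation of a size scaling operation from one triggered overlay to the rest of its group can be sketched in Python as follows. The data layout (a width and height per overlay), the uniform scale factor, and the function name are assumptions made for illustration; the application does not specify how sizes are represented.

```python
def scale_group(sizes, group_members, triggered_overlay, factor):
    """Apply a scaling triggered on one overlay to every overlay in its group."""
    if triggered_overlay not in group_members:
        raise ValueError("triggered overlay is not a member of the group")
    return {
        name: (width * factor, height * factor) if name in group_members else (width, height)
        for name, (width, height) in sizes.items()
    }


sizes = {"overlay1": (100, 50), "overlay2": (40, 40), "overlay3": (10, 10)}
first_group = {"overlay1", "overlay2"}

# Scaling is triggered on overlay2 only; overlay1 follows, overlay3 does not.
scaled = scale_group(sizes, first_group, "overlay2", 2)
```

Only the members of the first group change size; overlay3, which belongs to another group, keeps its original dimensions.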
Example 3-2, corresponding to Example 3-1. As a possible implementation, step 105 may be implemented as follows:
The terminal decodes the NALUs of the one or more overlays to obtain the SEI contained in the bitstream of each overlay. An SEI payload type equal to the value that represents OLG indicates that the SEI carries a common display group message. The terminal continues decoding the SEI to obtain overlay_conditional_shown_group_id, or to obtain the ref_overlay_id of each overlay. Once this part has been decoded, the terminal can determine all overlays that belong to the same common display group from the overlay_conditional_shown_group_id or ref_overlay_id of each overlay. In addition, the terminal locates and parses the AssociatedSphereRegionStruct in the overlay control structure of each overlay and learns that the overlay is one whose display is turned on or off by a user trigger.
For the implementation of steps 106 and 107 in Example 3-2, refer to the description of steps 106 and 107 in Example 1-2; details are not repeated here.
Example 4-2, corresponding to Example 4-1. As a possible implementation, step 105 may be implemented as follows:
The terminal decodes the NALUs of the one or more overlays to obtain the SEI contained in the bitstream of each overlay. An SEI payload type equal to the value that represents OLG indicates that the SEI carries a common interaction group message. The terminal continues decoding the SEI to obtain overlay_relation_group_id, which is the ID of the overlay's common interaction group, or to obtain ref_overlay_id. Once this part has been decoded, the terminal can determine all overlays that belong to the same common interaction group from the overlay_relation_group_id or the ref_overlay_id of each overlay.
For the implementation of steps 106 and 107 in Example 4-2, refer to the description of steps 106 and 107 in Example 2-2; details are not repeated here.
Example 5-2, corresponding to Example 5-1. As a possible implementation, step 105 may be implemented as follows:
The terminal obtains the MPD corresponding to the one or more overlays, parses the MPD of each overlay to obtain the adaptation-set-level attributes, and reads the overlay descriptor and its attribute values. From these it obtains, for each overlay, either the identification information of the group to which the overlay belongs or the identification information of the other overlays in the same group. In addition, when decapsulating each overlay, the terminal can parse the AssociatedSphereRegionStruct in the overlay structure and learn that the operation function of the overlay is triggering or closing its display.
In addition, if the overlay structure contains no AssociatedSphereRegionStruct, the terminal can determine the overlay's operation function from the fact that the group operation function of the overlay is showing or closing the display.
For the implementation of steps 106 and 107 in Example 5-2, refer to the description of steps 106 and 107 in Example 1-2; details are not repeated here.
Example 6-2, corresponding to Example 6-1. As a possible implementation, step 105 may be implemented as follows:
The terminal obtains the MPD corresponding to the one or more overlays, parses the MPD of each overlay to obtain the adaptation-set-level attributes, and reads the overlay descriptor and its attribute values. From these it obtains, for each overlay, either the identification information of the group to which the overlay belongs or the identification information of the other overlays in the same group. It also determines the interaction operations of each overlay from the overlay structure of that overlay.
For the implementation of steps 106 and 107 in Example 6-2, refer to the description of steps 106 and 107 in Example 2-2; details are not repeated here.
It should be noted that, for Examples 3-2, 4-2, 5-2 and 6-2, the terminal may place overlays that have the same group identification information in the same group. Alternatively, the terminal may place overlays in the same group according to the identification information, carried in each overlay, of the other overlays that belong to the same group as that overlay.
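The two grouping strategies just described can be sketched as follows. This is a minimal illustration, not the terminal's actual implementation: the function names and the dictionary inputs are hypothetical, and the second function assumes that ref_overlay_id links are symmetric and transitive, which the text does not state explicitly.

```python
from collections import defaultdict


def group_by_id(group_ids):
    """Place overlays that share a group identifier in the same group.

    group_ids maps overlay id -> group id (a hypothetical in-memory form of
    overlay_relation_group_id or overlay_conditional_shown_group_id).
    """
    groups = defaultdict(set)
    for overlay_id, gid in group_ids.items():
        groups[gid].add(overlay_id)
    return sorted(sorted(members) for members in groups.values())


def group_by_refs(refs):
    """Place overlays in the same group by following ref_overlay_id links.

    refs maps overlay id -> list of referenced overlay ids. Links are treated
    as symmetric and transitive (an assumption of this sketch).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for overlay_id, others in refs.items():
        find(overlay_id)  # register overlays that reference nothing
        for other in others:
            parent[find(other)] = find(overlay_id)

    groups = defaultdict(set)
    for overlay_id in list(parent):
        groups[find(overlay_id)].add(overlay_id)
    return sorted(sorted(members) for members in groups.values())
```

Either function returns a list of groups, each a sorted list of overlay ids, so the later steps can treat the two signalling styles identically.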
Although multiple overlays may be in the same group, when any overlay in the group responds to its operation function, either all overlays in the group respond to that operation function, or only that overlay responds to the triggered operation function. Therefore, corresponding to scenario 2, the embodiments of this application can define the conditions under which a group operation is performed on the group to which an overlay belongs.
In a possible implementation, the one or more overlays in the embodiments of this application further include fourth information. The fourth information indicates that, when an overlay responds to a first operation function, all overlays in the group to which the overlay belongs respond to the first operation function (that is, a group operation). The first operation function is whichever of the overlay's multiple operation functions is triggered.
A group operation means that, when an operation function is performed on any overlay in a group, all overlays in that group perform the operation function triggered on that overlay. For example, overlay1 and overlay2 are in group 1; when overlay1 is triggered, overlay1 and overlay2 both respond to the operation function triggered on overlay1.
Correspondingly, step 105 in the embodiments of this application may be implemented as follows: the terminal processes the at least two overlays according to the first information and the fourth information of the at least two overlays. Specifically, if an overlay has fourth information, step 106 further includes: the client configuration or user-interface prompt of the terminal may include an operation option for performing a group operation on the group to which the overlay belongs.
In this case, for the implementation of step 107, refer to the foregoing description: the terminal performs the triggered operation function on all overlays in the group, at group granularity. That is, when any overlay is triggered, all overlays in its group respond to the operation function triggered on that overlay.
In another possible implementation, the one or more overlays in the embodiments of this application further include fifth information. The fifth information indicates that, when an overlay responds to a first operation function, either all overlays in the group to which the overlay belongs respond to the first operation function (a group operation) or only that overlay responds to the first operation function (an individual operation).
Correspondingly, step 105 may be implemented as follows: the terminal processes the at least two overlays according to the first information and the fifth information of the at least two overlays.
Correspondingly, step 106 further includes: the terminal displays, for the at least one group, an operation option for performing a group operation and an operation option for performing an individual operation.
Correspondingly, step 107 further includes: if the individual operation option is triggered, then when any overlay is triggered, only that overlay responds to the triggered operation function. If the group operation option is triggered, then when any overlay is triggered, that overlay and the other overlays in the same group respond to the triggered operation function.
It can be understood that either all overlays in the same group have the fourth information, or all overlays in the same group have the fifth information.
An individual operation means that, when an operation function is performed on any overlay in a group, only that overlay responds to the triggered operation function. For example, when overlay1 is triggered, the other overlays in group 1 do not respond to the operation function of overlay1; only overlay1 responds.
For example, the fourth information or the fifth information may be carried in the MPD of the media data stream, in the SEI of the overlay bitstream, or in the file format.
It should be understood that if an overlay carries the fourth information, the terminal can determine that the group to which the overlay belongs permits operation only at group granularity. If the overlay carries the fifth information, the terminal can determine that the group permits both group operations and individual operations; which granularity is used depends on the user's choice.
Example 2-3, building on Example 2-1: the fourth information or the fifth information is located in the entity group defined in the overlay structure. The following takes the case in which the fourth information and the fifth information are a condition type (condition_type) as an example.
For example, in the entity group of the OMAF standard file format, a new group box for overlay-based common interaction, OverlayRelationGroupBox, is defined, and condition_type is defined in its syntax structure. The syntax structure is shown in Table 16:
Table 16
Figure PCTCN2019108514-appb-000020
Here, condition_type indicates the condition under which the user performs a certain kind of common operation on the group.
It should be understood that condition_type can take different values, and different values indicate different permissions for the overlay's group. For example, a condition_type value of 0 indicates that the group has group operation permission: when any overlay in the group is triggered, the other overlays in the group also respond to that overlay's operation function. A condition_type value of 1 indicates that the group has both group operation and individual operation permission. If the group operation is triggered, then when any overlay in the group is operated, the other overlays in the group also respond to the operation function triggered on that overlay. If the individual operation is triggered, then when any overlay in the group is operated, only that overlay responds to the triggered operation function.
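The condition_type semantics above can be sketched as a small selection function. This is a hedged illustration only: the function name, the constant names, and the group_option_selected flag (standing for whether the user has triggered the group operation option) are assumptions, and only the two values 0 and 1 described in the text are handled.

```python
GROUP_ONLY = 0        # condition_type == 0: group operation permission
GROUP_OR_SINGLE = 1   # condition_type == 1: group or individual operation


def responding_overlays(group_members, triggered, condition_type,
                        group_option_selected=False):
    """Return the overlays that respond when `triggered` is operated.

    group_members: ids of all overlays in the group.
    group_option_selected: True if the user triggered the group operation
    option (only consulted when condition_type == 1).
    """
    if triggered not in group_members:
        raise ValueError("triggered overlay is not a member of the group")
    if condition_type == GROUP_ONLY:
        return list(group_members)
    if condition_type == GROUP_OR_SINGLE:
        return list(group_members) if group_option_selected else [triggered]
    raise ValueError("condition_type value not described in the text")
```

With condition_type equal to 0 the whole group always responds; with condition_type equal to 1 the result depends on which option the user triggered.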
For example, all overlays in the same group have the same condition_type value.
It should be understood that if an overlay has the fourth information or the fifth information, the terminal can obtain that information by parsing the entity group in the overlay structure.
For example, if condition_type = 1, the client configuration or user-interface prompt may include an operation option for performing a group operation on the overlays and may also include an operation option for performing an individual operation. With condition_type = 1, when the group operation option is not triggered, only the triggered overlay responds to the interaction. When the group operation option is triggered and any overlay in the group triggers the operation function corresponding to the OverlayRelationGroupBox semantics, all overlays in the group, identified from all ref_overlay_id values in the group, jointly respond to the interaction of the triggered overlay.
Example 4-3, building on Example 4-1: the fourth information or the fifth information is located in the SEI of the overlay bitstream. That is, condition_type is defined in the SEI of the overlay bitstream corresponding to each of the one or more overlays. The syntax structure is shown in Table 17:
Table 17
Figure PCTCN2019108514-appb-000021
It should be noted that, whether the fourth information or the fifth information is located in the SEI of the overlay bitstream of each overlay, in the description file of the media data that includes the one or more overlays, or in the file format, the implementation of steps 105 to 107 can follow the description in the foregoing embodiments; details are not repeated here.
Example 6-3, building on Example 6-1: the fourth information or the fifth information is located in the MPD. Specifically, the fourth information and the fifth information may be located in the overlay descriptor of the overlay together with the group identification information.
For example, overlay_relation_group_id and condition_type are defined in the MPD corresponding to the one or more overlays. The syntax structure is shown in Table 18:
Table 18
Figure PCTCN2019108514-appb-000022
For the definition and values of condition_type, refer to the description at Table 16 above; details are not repeated here.
For example, Table 19 shows the syntax structure of an MPD file that carries condition_type:
Figure PCTCN2019108514-appb-000023
Figure PCTCN2019108514-appb-000024
Figure PCTCN2019108514-appb-000025
Table 19 describes two groups that support common interaction, group 1 and group 2, each of which includes two overlays. Group 1 includes overlay1 and overlay2 and has condition_type = 0, which means that when any kind of interaction is performed on overlay1 or overlay2, all overlays in group 1 respond to the interaction at the same time. Group 2 includes overlay3 and overlay4 and has condition_type = 1, which means that all overlays in group 2 respond to a common interaction only when the user has selected the group operation for group 2; otherwise only the triggered overlay responds to the interaction.
Because one overlay can correspond to one or more groups, in another embodiment of this application the group identification information of an overlay includes one or more group identifiers of the overlay. That is, the overlay structure, the description file, or the SEI of the overlay may indicate that the overlay corresponds to multiple groups. In this case, the first information also indicates the number of groups to which the overlay corresponds.
Example A: when the information indicating that an overlay corresponds to multiple groups is present in the overlay structure, the overlay structure of each overlay is as shown in Table 20:
Table 20
Figure PCTCN2019108514-appb-000026
Here, overlay_relation_group_number indicates the number of groups to which the overlay belongs.
overlay_relation_group_id[i] indicates the ID of the i-th group to which the overlay belongs.
In the case shown in Table 20, when the terminal decodes and plays the media data, the client configuration or user-interface prompt may provide an operation option for operating jointly on the overlays of the same group, with different groups having different operation options. When an overlay in any group is triggered, all overlays in that group jointly respond to the operation function triggered on that overlay. If an overlay is in multiple groups, the overlay responds in turn to the user's operations on each of the groups to which it belongs.
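The behaviour of an overlay that belongs to several groups can be sketched as follows. This is only an illustration under stated assumptions: the operation is taken to be resizing by a scale factor, and the function name and data layout (a mapping from overlay to its overlay_relation_group_id values, plus an ordered list of user operations) are hypothetical.

```python
def apply_group_operations(memberships, operations):
    """Apply user operations group by group, in the order they were issued.

    memberships: overlay name -> list of group ids, i.e. its
        overlay_relation_group_id[i] values.
    operations: list of (group_id, scale_factor) pairs in user order.
    Returns the accumulated scale factor of every overlay.
    """
    scale = {overlay: 1.0 for overlay in memberships}
    for group_id, factor in operations:
        for overlay, group_ids in memberships.items():
            if group_id in group_ids:
                # An overlay in several groups responds to each group's
                # operation in turn.
                scale[overlay] *= factor
    return scale
```

For instance, with overlay1 in groups 1 and 2, enlarging group 1 by a factor of 2 and then shrinking group 2 by a factor of 0.5 leaves overlay1 at its original size, while the other members of each group keep only their own group's change.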
When the group identification information of an overlay includes one or more group identifiers, the overlay structure of each overlay may also contain the condition_type syntax element, as shown in Table 21:
Table 21
Figure PCTCN2019108514-appb-000027
In the case of Table 21, if an overlay is in multiple groups, the overlay responds in turn to the user's operations on each of the groups to which it belongs, subject to the condition specified by the trigger type information condition_type.
Example B: when the information indicating that an overlay corresponds to multiple groups is present in the SEI of the overlay bitstream, the SEI syntax structure of each overlay bitstream is as shown in Table 22:
Table 22
Figure PCTCN2019108514-appb-000028
It should be understood that, in the case of Table 22, the way the terminal handles an overlay that corresponds to multiple groups can follow the description at Table 20; details are not repeated here.
Optionally, the SEI syntax shown in Table 22 may also contain the condition_type syntax element, as shown in Table 23:
Table 23
Figure PCTCN2019108514-appb-000029
It should be understood that, in the case shown in Table 23, the way the terminal handles an overlay that corresponds to multiple groups and carries the condition_type syntax element can follow the description at Table 21; details are not repeated here.
Example C: take the case in which the information indicating that an overlay corresponds to multiple groups is carried in the description file. For example, if the information about the multiple groups is located in the OVLY descriptor, the OVLY descriptor of each overlay is as shown in Table 24:
Table 24
Figure PCTCN2019108514-appb-000030
Figure PCTCN2019108514-appb-000031
Adaptation sets that have the same overlay_relation_group_id value belong to the same group, and one overlay can belong to several different groups.
The number of groups to which an overlay belongs is specified by overlay_relation_group_number, and overlay_relation_group_id indicates the ID of a group to which the overlay belongs.
Table 24 also shows that, when there are multiple groups, the OVLY descriptor contains condition_type. Here, condition_type is the condition type under which the group identified by overlay_relation_group_id performs common interaction.
It should be understood that, in the case of Table 24, the way the terminal handles an overlay that corresponds to multiple groups and carries the condition_type syntax element can follow the descriptions at Table 20 and Table 21; details are not repeated here.
For example, Table 25 shows a concrete MPD that contains multiple group identifiers and the condition_type syntax element:
Table 25
Figure PCTCN2019108514-appb-000032
Figure PCTCN2019108514-appb-000033
Table 25 describes three groups: group 1, group 2 and group 3. Group 1 includes overlay1 and overlay3, group 2 includes overlay1, and group 3 includes overlay2 and overlay4.
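The group membership described for Table 25 can be derived from per-overlay descriptors as in the sketch below. The dictionary layout and the function name are hypothetical; the field names overlay_relation_group_number and overlay_relation_group_id come from the text, and the per-overlay values are inferred from the membership stated above rather than read from the table image.

```python
def groups_from_descriptors(descriptors):
    """Invert per-overlay group lists into group id -> member overlays."""
    groups = {}
    for overlay, desc in descriptors.items():
        ids = desc["overlay_relation_group_id"]
        if desc["overlay_relation_group_number"] != len(ids):
            raise ValueError(f"{overlay}: group count does not match id list")
        for gid in ids:
            groups.setdefault(gid, []).append(overlay)
    return groups


# Membership inferred from the description of Table 25.
TABLE_25 = {
    "overlay1": {"overlay_relation_group_number": 2, "overlay_relation_group_id": [1, 2]},
    "overlay2": {"overlay_relation_group_number": 1, "overlay_relation_group_id": [3]},
    "overlay3": {"overlay_relation_group_number": 1, "overlay_relation_group_id": [1]},
    "overlay4": {"overlay_relation_group_number": 1, "overlay_relation_group_id": [3]},
}

groups = groups_from_descriptors(TABLE_25)
```

Here groups maps 1 to overlay1 and overlay3, 2 to overlay1 alone, and 3 to overlay2 and overlay4, matching the three groups listed above.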
It should be noted that when an overlay is in multiple groups, the groups correspond to different operation options. Take an overlay that has the fourth information and is in a second group and a third group as an example. When the overlay is triggered in the second group, the terminal performs the operation function corresponding to the second group on all overlays in the second group, and performs the operation function corresponding to the third group on all overlays in the third group. That is, all overlays in the second group respond to the second group's operation function, and all overlays in the third group respond to the third group's operation function.
If the overlay has the fifth information and the group operation is triggered, then when the overlay is triggered in the second group, the terminal performs the operation function corresponding to the second group on all overlays in the second group (that is, all overlays in the second group respond to the second group's operation function), and performs the operation function corresponding to the third group on all overlays in the third group (that is, all overlays in the third group respond to the third group's operation function). If the individual operation is triggered, then when the overlay is triggered in the second group, the terminal performs the operation function corresponding to the second group on that overlay, and performs the operation function corresponding to the third group on that overlay in the third group.
In yet another possible example, when each of the one or more overlays in the embodiments of this application has first information, the first information further includes first indication information, and the operation function of the group to which the overlay belongs is determined by the operation type that the first indication information indicates.
Specifically, the first indication information indicates the type of interaction operation (overlay_interaction_type), that is, the specific operation function of the overlay.
Taking the case in which the first information is carried in the SEI of the overlay bitstream, Table 26 shows an example of SEI syntax that carries the first indication information.
Table 26
overlay_relation_group_info(payloadSize) {    Descriptor
    overlay_relation_group_id
    overlay_interaction_type
}
Here, the value of overlay_interaction_type indicates the types of common interaction that the current overlay's group can perform. One representation uses individual bits, as shown in Table 27:
Table 27
Figure PCTCN2019108514-appb-000034
Figure PCTCN2019108514-appb-000035
When a bit of overlay_interaction_type is set, the overlay can perform the interaction corresponding to that bit as a group operation.
It should be noted that overlays in the same group must have the same overlay_interaction_type value.
It should be understood that Table 27 lists only some interaction types as examples; the types of common interaction of overlays are not limited to those shown in Table 27.
It should be noted that, after parsing overlay_interaction_type, the terminal determines from its value which types of interaction can be performed as a group operation. For example, when the terminal determines that the overlay_interaction_type value of all overlays in the same group corresponds to bit index 6 in Table 27, the terminal determines that the operation function of the group is joint rotation. If the terminal determines that one or more overlays belong to multiple groups, the terminal provides, for each group, an operation option corresponding to the interaction type indicated by the first indication information. When the operation option of any group is triggered, all overlays in that group are operated according to the operation function indicated by the first indication information.
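Decoding a bit-indexed overlay_interaction_type can be sketched as below. Several points are assumptions: bit index 0 is taken to be the least significant bit, and the entries other than index 6 are placeholders, because the bit assignments of Table 27 are not reproduced in this text. Only the mapping of bit index 6 to joint rotation is stated in the description.

```python
# Bit index -> interaction name. Only index 6 is taken from the text; the
# other entries are hypothetical placeholders for illustration.
INTERACTION_BITS = {
    0: "hypothetical_op_0",
    1: "hypothetical_op_1",
    6: "rotate",
}


def decode_interaction_type(value):
    """Return the interaction names whose bits are set in `value`.

    Assumes bit index 0 is the least significant bit.
    """
    return [name for bit, name in sorted(INTERACTION_BITS.items())
            if value & (1 << bit)]


def check_group_consistency(values):
    """Overlays in one group must share the same overlay_interaction_type."""
    return len(set(values)) <= 1
```

A terminal following this sketch would call check_group_consistency on the values parsed for one group and then offer one operation option per name returned by decode_interaction_type.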
In the case shown in Table 27, the terminal performs NALU decoding on one or more overlay bitstreams to obtain the SEI contained in each overlay bitstream. When the SEI payload type is the value represented by OLG, the SEI is an overlay common interaction group message. The terminal continues decoding to obtain, from the SEI, overlay_relation_group_id and the identification information of the other overlays that belong to the same group as the overlay, together with the overlay_interaction_type value; that is, it obtains the ID number of the overlay's common interaction group and the types of group interaction that can be performed. After decoding this part, the terminal obtains the group identification information of all overlays and determines which overlays belong to the same group. In addition, the terminal can determine from the overlay_interaction_type value the types of interaction operations that can be performed. The client configuration or a user interface prompt may provide user interaction options that treat overlays with the same ID number as one group, and specify, according to the overlay_interaction_type value, the types of group operations that can be performed. Different interaction options are provided for overlay groups with different ID numbers. When the user clicks or activates the option for an ID number, the overlays in the group with that ID number are operated on simultaneously according to the specified group operation type.
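The grouping step performed after SEI decoding can be illustrated with a short sketch. The OverlaySei container, the overlay_id field, and the build_group_options helper are hypothetical; the sketch assumes the SEI fields have already been decoded and does not model NALU or SEI parsing itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class OverlaySei(NamedTuple):
    """Fields decoded from one overlay's group SEI (hypothetical container)."""
    overlay_id: int                  # hypothetical identifier of the overlay
    overlay_relation_group_id: int   # ID number of the common interaction group
    overlay_interaction_type: int    # bit flags of the permitted group operations


def build_group_options(messages: Iterable[OverlaySei]) -> Dict[int, dict]:
    """Collect overlays by group ID and record each group's interaction type."""
    members: Dict[int, List[int]] = defaultdict(list)
    types: Dict[int, int] = {}
    for m in messages:
        gid = m.overlay_relation_group_id
        members[gid].append(m.overlay_id)
        # Overlays in one group are required to carry the same value.
        if types.setdefault(gid, m.overlay_interaction_type) != m.overlay_interaction_type:
            raise ValueError("inconsistent overlay_interaction_type within a group")
    return {gid: {"overlays": ids, "interaction_type": types[gid]}
            for gid, ids in members.items()}


options = build_group_options([
    OverlaySei(10, 1, 0b01000000),
    OverlaySei(11, 1, 0b01000000),
    OverlaySei(12, 2, 0b00000001),
])
print(options[1]["overlays"])  # [10, 11]
```

Each entry of the returned mapping corresponds to one interaction option that a user interface could present for a group ID.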
It should be understood that overlay_interaction_type may also be carried in the MPD. For the specific process, refer to the description of Table 27; details are not repeated here.
In yet another embodiment of this application, the server may add common interaction group description information to the file format corresponding to each overlay. The difference from Example 1-1 is that, in this embodiment, the group identification information of the overlay is located in the overlay control structure. That is, the third information is located in the overlay control structure included in the overlay structure.
Specifically, the server encodes media data that includes one or more overlays to obtain a media data stream that includes one or more overlays, and then encapsulates that media data stream. In the encapsulated media data stream, each overlay has an overlay control structure, which may further include the overlay control symbol Overlay group, used to indicate the group identification information of the overlay. The overlay control symbols are shown in Table 28:
Table 28. Semantics of the overlay control symbols
Figure PCTCN2019108514-appb-000036
For example, the group structure of the overlay is shown in Table 29:
Table 29
Figure PCTCN2019108514-appb-000037
Here, overlay_relation_group_id indicates the ID number of the group to which the overlay belongs.
In the overlay groups defined in Table 29, overlays that belong to the same group have the same group identification information, and overlays that belong to different groups have different group identification information.
Specifically, after decapsulating one or more overlays, the terminal obtains the overlay control structure syntax element overlay_control_flag, and thereby the Overlay group indicated by the tenth bit in Table 29. It then obtains the OverlayGroup structure information and, from it, the group identification information of the overlay. After parsing is complete, the terminal has the group information of all overlays.
In summary, the embodiments of this application add information identifying the group to which an overlay belongs to the file format, to the SEI of the overlay bitstream, or to the MPD, so that the terminal can display, group by group, one or more overlays that have the same group identification information. For overlays in the same group, the user can then apply the group's operation function to all of them at once, which reduces the number of steps needed to perform the same operation on one or more overlays and improves the user's subjective experience of watching VR video.
In a possible implementation, for scenario 1, the second information and the third information in the embodiments of this application may be carried in the file format of the overlay obtained by encapsulating the media data stream that includes the overlay. The file format includes an entity group box (EntityToGroupBox) and an overlay structure, where the overlay structure has an overlay associated area control structure. The second information is located in the overlay associated area control structure, and the third information is located in the EntityToGroupBox.
The syntax and semantics of EntityToGroupBox are shown in Table 30:
Table 30
Figure PCTCN2019108514-appb-000038
In Table 30, group_id indicates the unique ID number of the group, which differs from the ID number of any other EntityToGroupBox structure. num_entities_in_group indicates the number of entities in the current group, and entity_id corresponds to the ID number of an entity in the file format.
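As an illustration of how a terminal might read these fields, the following sketch parses an EntityToGroupBox from raw bytes. It assumes the usual ISO base media file format layout (32-bit box size, four-character box type carrying the grouping type, FullBox version and flags, then 32-bit group_id, num_entities_in_group and entity_id values); Table 30 is available here only as an image, so this layout is an assumption rather than a transcription, and the EntityToGroup type and parse_entity_to_group_box function are hypothetical names. Boxes that use the 64-bit largesize form are not handled.

```python
import struct
from typing import List, NamedTuple


class EntityToGroup(NamedTuple):
    grouping_type: str      # four-character box type, for example "ocsg" or "ovrg"
    group_id: int
    entity_ids: List[int]
    remaining: bytes        # bytes after the entity list (grouping-type-specific data)


def parse_entity_to_group_box(box: bytes) -> EntityToGroup:
    # Box header: 32-bit size followed by the four-character type.
    size, box_type = struct.unpack_from(">I4s", box, 0)
    # FullBox header (1 byte version, 3 bytes flags) occupies bytes 8..11 and is skipped.
    group_id, num_entities = struct.unpack_from(">II", box, 12)
    entity_ids = list(struct.unpack_from(f">{num_entities}I", box, 20))
    end = 20 + 4 * num_entities
    return EntityToGroup(box_type.decode("ascii"), group_id, entity_ids, box[end:size])


# Build a sample box: group_id 7 with two entities, 101 and 102.
payload = struct.pack(">II", 7, 2) + struct.pack(">2I", 101, 102)
sample = struct.pack(">I4sI", 12 + len(payload), b"ocsg", 0) + payload
print(parse_entity_to_group_box(sample))
```

The remaining field is kept so that data specific to a grouping type, such as a group label, can be parsed separately.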
Optionally, taking the common display group as an example, when the second information and the third information are located in the file format and the third information is located in the EntityToGroupBox, the file format in the embodiments of this application may further have the syntax shown in Table 31:
Table 31
Figure PCTCN2019108514-appb-000039
Here, ocsg is a grouping_type (group type) value indicating that the group is a common display group.
In this way, the terminal can parse an OverlayConditionalShownGroupBox from the EntityToGroupBox, which indicates that a group of overlays is displayed together when the user triggers display of any overlay in the group or of the group itself. When the user closes the group or any overlay in the group, the overlays are closed together. The ID numbers of the overlays contained in the current overlay group are given by the entity_id values in the current EntityToGroupBox.
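The common display behaviour can be sketched as a small state holder. The CoDisplayGroups class and its method names are hypothetical, and the sketch assumes each overlay belongs to at most one common display group.

```python
from typing import Dict, List, Set


class CoDisplayGroups:
    """Tracks which overlays are visible when groups are shown or closed together."""

    def __init__(self, groups: Dict[int, List[int]]) -> None:
        self.groups = groups            # group_id -> entity_id values of its overlays
        self.visible: Set[int] = set()

    def _members(self, overlay_id: int) -> List[int]:
        for members in self.groups.values():
            if overlay_id in members:
                return members
        return [overlay_id]             # an ungrouped overlay acts alone

    def show(self, overlay_id: int) -> None:
        self.visible.update(self._members(overlay_id))

    def close(self, overlay_id: int) -> None:
        self.visible.difference_update(self._members(overlay_id))


state = CoDisplayGroups({1: [10, 11], 2: [12]})
state.show(10)
print(sorted(state.visible))  # [10, 11]
state.close(11)
print(sorted(state.visible))  # []
```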
The OverlayConditionalShownGroupBox may also contain name information for the overlay group, with the syntax structure shown in Table 32:
Table 32
Figure PCTCN2019108514-appb-000040
overlay_group_label is a UTF-8 encoded string of unrestricted length that describes the overlay group. It may be an empty string.
overlay_group_label gives descriptive information about the group and can be shown on the user's display interface as group information.
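Reading the label can be sketched as follows. The sketch assumes the string is null-terminated, as is common for strings in ISO base media file format boxes; Table 32 is available here only as an image, so the termination rule and the parse_overlay_group_label name are assumptions.

```python
def parse_overlay_group_label(data: bytes) -> str:
    """Decode a UTF-8 label that ends at the first null byte (or at the end of the data)."""
    end = data.find(b"\x00")
    raw = data if end < 0 else data[:end]
    return raw.decode("utf-8")


print(parse_overlay_group_label("Subtitles".encode("utf-8") + b"\x00"))  # Subtitles
print(repr(parse_overlay_group_label(b"\x00")))                          # ''
```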
In another possible implementation, for scenario 2, the first information in the embodiments of this application may be carried in the file format of the overlay obtained by encapsulating the media data stream that includes the overlay. The file format includes an EntityToGroupBox and an overlay structure, where the overlay structure includes control symbols. The first information is located in the EntityToGroupBox.
Optionally, taking the interaction group as an example, when the first information is located in the file format and in the EntityToGroupBox, the file format in the embodiments of this application may further have the syntax shown in Table 33:
Table 33
Figure PCTCN2019108514-appb-000041
Here, ovrg is also a grouping_type (group type) value, indicating that the group is a common interaction group.
When the syntax structure of Table 33 is used, the overlay structure may be as described above, which is not limited here.
In this way, the terminal can parse an OverlayRelationGroupBox from the EntityToGroupBox. This indicates a group of overlays on which the user can perform an interaction operation, either on any one overlay or on all overlays in the group, with the operation function of the interaction operation determined from the overlay structure of each overlay. The ID numbers of the overlays contained in the current overlay group are given by the entity_id values in the current EntityToGroupBox.
The OverlayRelationGroupBox may also contain name information for the overlay group, with the syntax structure shown in Table 34:
Table 34
Figure PCTCN2019108514-appb-000042
overlay_group_label gives the name of the group and can be shown on the user's display interface as group information.
It can be understood that, in both Table 32 and Table 34, the name information of the overlay group may be carried in the overlay group box. After obtaining the overlay, the terminal can therefore parse the name information carried in the overlay group box and display the group name it indicates. In one example, when the overlay group box is defined as an OverlayRelationGroupBox, the name information of the overlay group is located in the OverlayRelationGroupBox.
In another example, when the overlay group box is defined as an OverlayConditionalShownGroupBox, the name information of the overlay group is located in the OverlayConditionalShownGroupBox.
The foregoing describes the solutions of the embodiments of this application mainly from the perspective of interaction between network elements. It can be understood that, to implement the foregoing functions, each network element, such as the apparatus for processing media data, includes the hardware structures and/or software modules needed to perform each function. A person skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
In the embodiments of this application, the apparatus for processing media data may be divided into functional units according to the foregoing method examples. For example, each functional unit may correspond to one function, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division into units in the embodiments of this application is illustrative and is merely a logical function division; other divisions are possible in actual implementation.
The following description takes as an example the case in which each functional module corresponds to one function:
FIG. 7 is a schematic block diagram of an apparatus for processing media data according to an embodiment of this application. The apparatus may be a terminal or a chip applied in a terminal. The apparatus 500 for processing media data shown in FIG. 7 includes an obtaining module 510 and a processing module 520.
The obtaining module 510 and the processing module 520 in the apparatus 500 for processing media data may perform the steps performed by the terminal in the methods shown in FIG. 3 and FIG. 4.
When the apparatus 500 for processing media data performs the method shown in FIG. 3, the obtaining module 510 and the processing module 520 function as follows:
The obtaining module 510 is configured to receive at least two overlays corresponding to the media data. Each overlay corresponds to first information, and the first information includes group identification information of the overlay; or, the overlay corresponds to second information and third information, where the second information is used to indicate an operation function corresponding to the overlay, and the third information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
The processing module 520 is configured to, when the overlay corresponds to the first information, process the at least two overlays according to the first information of the at least two overlays. Alternatively, the processing module 520 is configured to, when the overlay corresponds to the second information and the third information, process the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
When the apparatus 500 for processing media data performs the method shown in FIG. 4, the apparatus 500 further includes a display module 530. The display module 530 and the processing module 520 function as follows:
The display module 530 is configured to display at least one group, information indicating the operation function corresponding to each of the at least one group, and the overlays belonging to each group.
In a possible implementation, the at least one group is determined by the first information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay structure included in the overlays in that group.
In another possible implementation, the at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay associated area control structure included in the overlays in that group.
The processing module 520 is configured to, when any one of the at least one group is triggered, cause all overlays belonging to that group to respond with the operation function corresponding to the group; or, when any overlay is triggered, cause that overlay and the other overlays in the same group to respond with the operation function triggered on that overlay.
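The two trigger paths handled by the processing module can be sketched as a lookup of the overlays that should respond. The responding_overlays function is a hypothetical helper; where an overlay belongs to several groups, this sketch simply uses the first matching group, which is a simplification rather than behaviour stated in the text.

```python
from typing import Dict, List, Optional


def responding_overlays(groups: Dict[int, List[int]],
                        group_id: Optional[int] = None,
                        overlay_id: Optional[int] = None) -> List[int]:
    """Return the overlays that respond when a group or a single overlay is triggered."""
    if group_id is not None:
        # A group was triggered: every overlay in it responds.
        return list(groups[group_id])
    for members in groups.values():
        if overlay_id in members:
            # An overlay was triggered: it and the rest of its group respond.
            return list(members)
    # An ungrouped overlay responds alone; no trigger yields no responders.
    return [] if overlay_id is None else [overlay_id]


groups = {1: [10, 11], 2: [12]}
print(responding_overlays(groups, group_id=1))     # [10, 11]
print(responding_overlays(groups, overlay_id=11))  # [10, 11]
```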
FIG. 8 is a schematic block diagram of an apparatus 600 for processing media data according to an embodiment of this application. The apparatus 600 may be a server or a chip applied in a server. The apparatus 600 for processing media data shown in FIG. 8 includes an obtaining module 610, a processing module 620, and a sending module 630.
The obtaining module 610, the processing module 620, and the sending module 630 in the apparatus 600 for processing media data may perform the steps performed by the server in the methods shown in FIG. 3 and FIG. 4.
When the apparatus 600 for processing media data performs the method shown in FIG. 3 or FIG. 4, the obtaining module 610, the processing module 620, and the sending module 630 function as follows:
The obtaining module 610 is configured to obtain media data.
The processing module 620 is configured to process the media data to obtain at least two overlays corresponding to the media data.
The sending module 630 is configured to send one or more overlays to the terminal.
It should be understood that, when the apparatus 600 for processing media data adopts the structure shown in FIG. 2, the input/output interface 130 is configured to obtain media data. The processor 110 is configured to process the media data to obtain at least two overlays corresponding to the media data. The input/output interface 130 is further configured to send one or more overlays to the terminal.
It should be understood that, when the apparatus 500 for processing media data adopts the structure shown in FIG. 2, the input/output interface 130 is configured to receive at least two overlays corresponding to the media data. Each overlay corresponds to first information, and the first information includes group identification information of the overlay; or, the overlay corresponds to second information and third information, where the second information is used to indicate an operation function corresponding to the overlay, and the third information includes group identification information of the overlay or identification information of other overlays that belong to the same group as the overlay.
The processor 110 is configured to, when the overlay corresponds to the first information, process the at least two overlays according to the first information of the at least two overlays. Alternatively, the processor 110 is configured to, when the overlay corresponds to the second information and the third information, process the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
Optionally, the display 160 is configured to display at least one group, information indicating the operation function corresponding to each of the at least one group, and the overlays belonging to each group.
In a possible implementation, the at least one group is determined by the first information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay structure included in the overlays in that group.
In another possible implementation, the at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay associated area control structure included in the overlays in that group.
The display 160 is configured to, when any one of the at least one group is triggered, cause all overlays belonging to that group to respond with the operation function corresponding to the group; or, when any overlay is triggered, cause that overlay and the other overlays in the same group to respond with the operation function triggered on that overlay.
A person of ordinary skill in the art may appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of this application.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a logical function division, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (78)

  1. A method for processing media data, comprising:
    obtaining, by a terminal, at least two overlays corresponding to media data;
    wherein the overlay corresponds to first information, and the first information includes group identification information of the overlay;
    or the overlay corresponds to second information and third information, wherein the second information is used to indicate an operation function corresponding to the overlay, or identification information of other overlays that belong to the same group as the overlay;
    when the overlay corresponds to the first information, processing, by the terminal, the at least two overlays according to the first information of the at least two overlays,
    or,
    when the overlay corresponds to the second information and the third information, processing, by the terminal, the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
  2. The method according to claim 1, wherein the processing, by the terminal, of the at least two overlays according to the first information of the at least two overlays comprises:
    displaying, by the terminal, at least one group, information indicating the operation function corresponding to each of the at least one group, and the overlays belonging to each group; wherein the at least one group is determined by the first information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay structure included in the overlays in that group;
    or,
    the processing, by the terminal, of the at least two overlays according to the second information and the third information corresponding to the at least two overlays comprises:
    displaying, by the terminal, at least one group, information indicating the operation function corresponding to each of the at least one group, and the overlays belonging to each group; wherein the at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay associated area control structure included in the overlays in that group.
  3. The method according to claim 2, wherein, when any one of the at least one group is triggered, all overlays belonging to that group respond with the operation function corresponding to that group.
  4. The method according to claim 2, wherein, when any overlay is triggered, that overlay and the other overlays belonging to the same group as that overlay respond with the operation function triggered on the overlay.
  5. The method according to any one of claims 1 to 4, wherein the overlay further corresponds to fourth information, the fourth information indicating that, when a first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function, and the method further comprises: when any one of the at least one group is triggered, all overlays belonging to that group respond to the operation function triggered on that group; or,
    when any overlay is triggered, that overlay and the other overlays belonging to the same group as that overlay respond to the operation function triggered on that overlay.
  6. The method according to any one of claims 1 to 4, wherein the overlay further corresponds to fifth information, the fifth information indicating that, when a first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function, or the overlay responds to the first operation function:
    when any overlay is triggered, that overlay and the other overlays belonging to the same group as that overlay respond to the operation function triggered on the overlay, or that overlay responds to the triggered operation function.
  7. The method according to any one of claims 1 to 6, wherein when the overlay corresponds to the first information, the overlay further corresponds to sixth information, the sixth information indicating the operation function corresponding to the overlay;
    when the overlay is triggered, the overlay responds to the operation function indicated by the sixth information.
  8. The method according to any one of claims 1 to 7, wherein the second information and the third information are carried in the file format of the overlay.
  9. The method according to claim 8, wherein the file format comprises: an overlay structure, an overlay associated area control structure located in the overlay structure, and an overlay group box, wherein the third information is located in the overlay group box and the second information is located in the overlay associated area control structure.
  10. The method according to any one of claims 1 to 7, wherein the first information is carried in the file format of the overlay.
  11. The method according to claim 10, wherein the file format comprises an overlay group box,
    wherein the first information is located in the overlay group box.
  12. The method according to any one of claims 1 to 7, wherein the third information is carried in supplemental enhancement information (SEI) of the overlay bitstream corresponding to the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  13. The method according to any one of claims 1 to 7, wherein the first information is carried in the SEI of the overlay bitstream corresponding to the overlay.
  14. The method according to claim 12 or 13, wherein the payload type of the SEI indicates that the SEI carries the group identification information of the overlay.
  15. The method according to any one of claims 1 to 7, wherein the third information is carried in a media presentation description (MPD) of the media data stream containing the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  16. The method according to claim 15, wherein the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  17. The method according to any one of claims 1 to 7, wherein the first information is carried in the MPD of the media data stream containing the overlay.
  18. The method according to claim 16 or 17, wherein the first information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  19. The method according to any one of claims 1 to 17, wherein when the first information includes the group identification information of the overlay, the first information is carried in an entity group box EntityToGroupBox.
  20. The method according to any one of claims 1 to 17, wherein when the third information includes the group identification information of the overlay, the third information is carried in an EntityToGroupBox, and the second information is carried in the overlay associated area control structure of the overlay.
  21. The method according to any one of claims 1 to 18, wherein the file format of the overlay further comprises an overlay group box, the overlay group box carrying name information of the group of the overlay, and the method further comprises:
    displaying, by the terminal, the group name indicated by the name information of the group of the overlay.
  22. The method according to any one of claims 1 to 21, wherein the overlay corresponds to a plurality of groups, and when the overlay is triggered, the overlay responds to the operation functions respectively corresponding to the plurality of groups;
    or all overlays in the plurality of groups respond to the operation functions corresponding to their respective groups.
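
The group-triggered behaviour recited in claims 3 to 6 and 22 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Overlay` class, the `propagate` flag (standing in for the fifth information), and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Overlay:
    overlay_id: int
    group_ids: list          # group identifiers (first or third information)
    propagate: bool = True   # assumed stand-in for the fifth information
    applied: list = field(default_factory=list)  # operations responded to


def build_groups(overlays):
    """Map each group identifier to the overlays that carry it."""
    groups = defaultdict(list)
    for ov in overlays:
        for gid in ov.group_ids:
            groups[gid].append(ov)
    return groups


def trigger_overlay(overlay, operation, groups):
    """Apply `operation` to the triggered overlay and, if its flag allows,
    to every overlay that shares at least one group with it."""
    targets = {overlay.overlay_id: overlay}
    if overlay.propagate:
        for gid in overlay.group_ids:
            for peer in groups[gid]:
                targets[peer.overlay_id] = peer
    for target in targets.values():
        target.applied.append(operation)
    return sorted(targets)  # identifiers of the overlays that responded
```

With the flag set, triggering one overlay makes every overlay in each of its groups respond; with the flag cleared, the triggered overlay responds alone.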
  23. A method for processing media data, comprising:
    obtaining, by a server, media data;
    processing, by the server, the media data to obtain at least two overlays corresponding to the media data, wherein the overlay corresponds to first information, the first information including group identification information of the overlay;
    or the overlay corresponds to second information and third information, wherein the second information indicates the operation function corresponding to the overlay, and the third information indicates group identification information of the overlay or identification information of other overlays belonging to the same group as the overlay; and
    sending, by the server, the at least two overlays.
  24. The method according to claim 23, wherein the second information and the third information are carried in the file format of the overlay.
  25. The method according to claim 24, wherein the file format comprises an overlay structure, an overlay associated area control structure located in the overlay structure, and an overlay group box;
    the third information is located in the overlay group box, and the second information is located in the overlay associated area control structure.
  26. The method according to claim 23, wherein the first information is carried in the file format of the overlay.
  27. The method according to claim 26, wherein the file format comprises an overlay group box, and the first information is located in the overlay group box.
  28. The method according to claim 23, wherein the third information is carried in supplemental enhancement information (SEI) of the overlay bitstream corresponding to the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  29. The method according to claim 23, wherein the first information is carried in the SEI of the overlay bitstream corresponding to the overlay.
  30. The method according to claim 28 or 29, wherein the payload type of the SEI indicates that the SEI carries the group identification information of the overlay.
  31. The method according to claim 23, wherein the third information is carried in a media presentation description (MPD) of the media data stream containing the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  32. The method according to claim 23, wherein the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  33. The method according to claim 23, wherein the first information is carried in the MPD of the media data stream containing the overlay.
  34. The method according to claim 33, wherein the first information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  35. The method according to claim 23, wherein when the first information includes the group identification information of the overlay, the first information is carried in an entity group box EntityToGroupBox.
  36. The method according to claim 23, wherein when the third information includes the group identification information of the overlay, the third information is carried in an EntityToGroupBox, and the second information is carried in the overlay associated area control structure of the overlay.
  37. The method according to any one of claims 23 to 36, wherein the overlay further corresponds to fourth information, the fourth information indicating that, when the overlay responds to a triggered first operation function, all overlays in the group to which the overlay belongs respond to the first operation function.
  38. The method according to any one of claims 23 to 36, wherein the overlay further corresponds to fifth information, the fifth information indicating that, when a first operation function is triggered on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function, or the overlay responds to the first operation function.
  39. The method according to any one of claims 23 to 38, wherein when the overlay corresponds to the first information, the overlay further corresponds to sixth information, the sixth information indicating the operation function corresponding to the overlay.
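
Claim 23 allows the third information to list the identifiers of other overlays in the same group instead of a group identifier. One possible way for a receiver to recover the groups from such peer lists is sketched below in Python using a union-find structure. The function name `resolve_groups` and the assumption that group membership is symmetric and transitive are illustrative and not stated in the claims.

```python
def resolve_groups(peer_ids):
    """Derive groups from peer lists.

    peer_ids maps an overlay identifier to the identifiers of other
    overlays signalled as belonging to the same group (assumed input
    shape). Membership is treated as symmetric and transitive.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # smaller id becomes the root

    for overlay_id, peers in peer_ids.items():
        find(overlay_id)  # register overlays that list no peers
        for peer in peers:
            union(overlay_id, peer)

    groups = {}
    for overlay_id in parent:
        groups.setdefault(find(overlay_id), []).append(overlay_id)
    return sorted(sorted(members) for members in groups.values())
```

An overlay that lists no peers and is listed by no other overlay ends up in a group of its own.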
  40. A terminal, comprising an obtaining module and a processing module, wherein:
    the obtaining module is configured to obtain at least two overlays corresponding to media data;
    the overlay corresponds to first information, the first information including group identification information of the overlay;
    or the overlay corresponds to second information and third information, wherein the second information indicates the operation function corresponding to the overlay, and the third information includes group identification information of the overlay or identification information of other overlays belonging to the same group as the overlay;
    when the overlay corresponds to the first information, the processing module is configured to process the at least two overlays according to the first information of the at least two overlays,
    or,
    when the overlay corresponds to the second information and the third information, the processing module is configured to process the at least two overlays according to the second information and the third information corresponding to the at least two overlays.
  41. The terminal according to claim 40, further comprising a display module, wherein the display module is configured to display at least one group, information indicating the operation function corresponding to each of the at least one group, and the overlays belonging to each group, wherein the at least one group is determined by the first information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay structure included in the overlays of that group;
    or,
    the display module is configured to display at least one group, information indicating the operation function corresponding to each of the at least one group, and the overlays belonging to each group, wherein the at least one group is determined by the third information corresponding to each of the at least two overlays, and the operation function corresponding to a group is determined by the overlay associated area control structure included in the overlays of that group.
  42. The terminal according to claim 40, wherein when any one of the at least one group is triggered, all overlays belonging to that group respond to the operation function corresponding to that group.
  43. The terminal according to claim 40, wherein when any overlay is triggered, that overlay and the other overlays belonging to the same group as that overlay respond to the operation function triggered on that overlay.
  44. The terminal according to any one of claims 40 to 43, wherein the overlay further corresponds to fourth information, the fourth information indicating that, when a first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function,
    when any one of the at least one group is triggered, all overlays belonging to that group respond to the operation function triggered on that group; or,
    when any overlay is triggered, that overlay and the other overlays belonging to the same group as that overlay respond to the operation function triggered on that overlay.
  45. The terminal according to any one of claims 40 to 43, wherein the overlay further corresponds to fifth information, the fifth information indicating that, when a first operation function is performed on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function, or the overlay responds to the first operation function:
    when any overlay is triggered, that overlay and the other overlays belonging to the same group as that overlay respond to the operation function triggered on the overlay, or that overlay responds to the triggered operation function.
  46. The terminal according to any one of claims 40 to 45, wherein when the overlay corresponds to the first information, the overlay further corresponds to sixth information, the sixth information indicating the operation function corresponding to the overlay;
    when the overlay is triggered, the overlay responds to the operation function indicated by the sixth information.
  47. The terminal according to any one of claims 40 to 46, wherein the second information and the third information are carried in the file format of the overlay.
  48. The terminal according to claim 47, wherein the file format comprises: an overlay structure, an overlay associated area control structure located in the overlay structure, and an overlay group box, wherein the third information is located in the overlay group box and the second information is located in the overlay associated area control structure.
  49. The terminal according to any one of claims 40 to 46, wherein the first information is carried in the file format of the overlay.
  50. The terminal according to claim 49, wherein the file format comprises an overlay group box,
    wherein the first information is located in the overlay group box.
  51. The terminal according to any one of claims 40 to 46, wherein the third information is carried in supplemental enhancement information (SEI) of the overlay bitstream corresponding to the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  52. The terminal according to any one of claims 40 to 46, wherein the first information is carried in the SEI of the overlay bitstream corresponding to the overlay.
  53. The terminal according to claim 51 or 52, wherein the payload type of the SEI indicates that the SEI carries the group identification information of the overlay.
  54. The terminal according to any one of claims 40 to 46, wherein the third information is carried in a media presentation description (MPD) of the media data stream containing the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  55. The terminal according to claim 54, wherein the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  56. The terminal according to any one of claims 40 to 46, wherein the first information is carried in the MPD of the media data stream containing the overlay.
  57. The terminal according to claim 55 or 56, wherein the first information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  58. The terminal according to any one of claims 40 to 46, wherein when the first information includes the group identification information of the overlay, the first information is carried in an entity group box EntityToGroupBox.
  59. The terminal according to any one of claims 40 to 46, wherein when the third information includes the group identification information of the overlay, the third information is carried in an EntityToGroupBox, and the second information is carried in the overlay associated area control structure of the overlay.
  60. The terminal according to any one of claims 40 to 59, wherein the overlay corresponds to a plurality of groups, and when the overlay is triggered, the overlay responds to the operation functions respectively corresponding to the plurality of groups;
    or all overlays in the plurality of groups respond to the operation functions corresponding to their respective groups.
  61. The terminal according to any one of claims 40 to 60, wherein the file format of the overlay further comprises an overlay group box, the overlay group box carrying name information of the group of the overlay, and a display unit is further configured to display the group name indicated by the name information of the group of the overlay.
  62. A server, comprising:
    an obtaining module, configured to obtain media data;
    a processing module, configured to process the media data to obtain at least two overlays corresponding to the media data, wherein the overlay corresponds to first information, the first information including group identification information of the overlay;
    or the overlay corresponds to second information and third information, wherein the second information indicates the operation function corresponding to the overlay, and the third information indicates group identification information of the overlay or identification information of other overlays belonging to the same group as the overlay; and
    a sending module, configured to send the at least two overlays to a terminal.
  63. The server according to claim 62, wherein the second information and the third information are carried in the file format of the overlay.
  64. The server according to claim 63, wherein the file format comprises an overlay structure, an overlay associated area control structure located in the overlay structure, and an overlay group box;
    the third information is located in the overlay group box, and the second information is located in the overlay associated area control structure.
  65. The server according to claim 62, wherein the first information is carried in the file format of the overlay.
  66. The server according to claim 65, wherein the file format comprises an overlay group box, and the first information is located in the overlay group box.
  67. The server according to claim 62, wherein the third information is carried in supplemental enhancement information (SEI) of the overlay bitstream corresponding to the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  68. The server according to claim 62, wherein the first information is carried in the SEI of the overlay bitstream corresponding to the overlay.
  69. The server according to claim 67 or 68, wherein the payload type of the SEI indicates that the SEI carries the group identification information of the overlay.
  70. The server according to claim 62, wherein the third information is carried in a media presentation description (MPD) of the media data stream containing the overlay, and the second information is carried in the overlay associated area control structure of the overlay.
  71. The server according to claim 70, wherein the third information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  72. The server according to claim 62, wherein the first information is carried in the MPD of the media data stream containing the overlay.
  73. The server according to claim 72, wherein the first information is located in an overlay descriptor at the adaptation set level or the representation level of the MPD.
  74. The server according to claim 62, wherein when the first information includes the group identification information of the overlay, the first information is carried in an entity group box EntityToGroupBox.
  75. The server according to claim 74, wherein when the third information includes the group identification information of the overlay, the third information is carried in an EntityToGroupBox, and the second information is carried in the overlay associated area control structure of the overlay.
  76. The server according to any one of claims 62 to 75, wherein the overlay further corresponds to fourth information, the fourth information indicating that, when the overlay responds to a triggered first operation function, all overlays in the group to which the overlay belongs respond to the first operation function.
  77. The server according to any one of claims 62 to 76, wherein the overlay further corresponds to fifth information, the fifth information indicating that, when a first operation function is triggered on the overlay, all overlays in the group to which the overlay belongs respond to the first operation function, or the overlay responds to the first operation function.
  78. The server according to any one of claims 62 to 77, wherein when the overlay corresponds to the first information, the overlay further corresponds to sixth information, the sixth information indicating the operation function corresponding to the overlay.
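
Several claims place the group identification information in an overlay descriptor at the adaptation set level or the representation level of the MPD. The Python sketch below shows how a client might read such a descriptor. The DASH namespace and the `SupplementalProperty` element are standard, but the scheme URI `urn:example:overlay:group` and the use of the `value` attribute for the group identifier are assumptions made for illustration; the claims do not specify them.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
# Hypothetical scheme URI, for illustration only.
OVERLAY_GROUP_SCHEME = "urn:example:overlay:group"

SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet id="1">
      <SupplementalProperty schemeIdUri="urn:example:overlay:group" value="7"/>
      <Representation id="ov-a"/>
    </AdaptationSet>
    <AdaptationSet id="2">
      <Representation id="ov-b">
        <SupplementalProperty schemeIdUri="urn:example:overlay:group" value="7"/>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>"""


def overlay_groups_from_mpd(mpd_xml):
    """Return {(level, element id): group id} for every AdaptationSet or
    Representation that directly carries the assumed overlay descriptor."""
    root = ET.fromstring(mpd_xml)
    result = {}
    for tag in ("AdaptationSet", "Representation"):
        for element in root.iter(f"{{{DASH_NS}}}{tag}"):
            # findall() only inspects direct children, so a descriptor on a
            # Representation is not attributed to its parent AdaptationSet.
            for prop in element.findall(f"{{{DASH_NS}}}SupplementalProperty"):
                if prop.get("schemeIdUri") == OVERLAY_GROUP_SCHEME:
                    result[(tag, element.get("id"))] = prop.get("value")
    return result
```

Calling `overlay_groups_from_mpd(SAMPLE_MPD)` collects one entry per signalling level, so a client can treat both placements uniformly when building groups.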
PCT/CN2019/108514 2018-09-27 2019-09-27 Method for processing media data and terminal and server WO2020063850A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862737900P 2018-09-27 2018-09-27
US62/737,900 2018-09-27

Publications (1)

Publication Number Publication Date
WO2020063850A1 WO2020063850A1 (en) 2020-04-02

Family

ID=69950275

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/108514 WO2020063850A1 (en) 2018-09-27 2019-09-27 Method for processing media data and terminal and server

Country Status (1)

Country Link
WO (1) WO2020063850A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105828160A (en) * 2016-04-01 2016-08-03 腾讯科技(深圳)有限公司 Video play method and apparatus
WO2017202699A1 (en) * 2016-05-23 2017-11-30 Canon Kabushiki Kaisha Method, device, and computer program for adaptive streaming of virtual reality media content
CN107770601A (en) * 2016-08-16 2018-03-06 上海交通大学 A kind of method and system towards the personalized presentation of content of multimedia component
CN107888939A (en) * 2016-09-30 2018-04-06 华为技术有限公司 A kind of processing method and processing device of video data
CN108271044A (en) * 2016-12-30 2018-07-10 华为技术有限公司 A kind of processing method and processing device of information

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19866354; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 19866354; Country of ref document: EP; Kind code of ref document: A1