WO2018072488A1 - Data processing method, related device, and system - Google Patents

Data processing method, related device, and system

Info

Publication number
WO2018072488A1
Authority
WO
WIPO (PCT)
Prior art keywords
code stream
complementary
content
identifier
media presentation
Prior art date
Application number
PCT/CN2017/092772
Other languages
English (en)
French (fr)
Inventor
邸佩云
方华猛
谢清鹏
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2018072488A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N21/2353 Processing of additional data, e.g. scrambling of additional data or processing content descriptors specifically adapted to content descriptors, e.g. coding, compressing or processing of metadata
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235 Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream

Definitions

  • the present invention relates to the field of computer technologies, and in particular, to a data processing method, related device, and system.
  • virtual reality (VR)
  • the video is divided in the time domain into multiple playback periods, and each playback period corresponds to a plurality of segments of different resolutions; based on information such as network conditions, the user can select, from video segments of various qualities (for example, high-definition video and standard-definition video), the segment that suits them best.
  • the content of the spatial object presented within the user's viewing angle is video of relatively high quality, and the content of the spatial object presented outside the user's viewing angle is video of relatively low quality, which ensures that the content of the spatial object within the viewing angle is as clear as possible.
  • the server providing the VR video encodes all video content of any one playback period of the video at low quality as a base layer, so the entire base layer is low-quality encoded content (normally, after FOV switching, the playback period also changes accordingly, and so does the corresponding base layer); at the same time, the video of the same playback period is divided into multiple parts, and each part is encoded at high quality as an enhancement layer.
  • Each part is the high-quality encoded content of one spatial object, and each spatial object corresponds to a set of spatial information; the server determines the spatial object according to the spatial information determined by the FOV (the FOV may correspond to one or more spatial objects), determines the high-quality encoded content of that spatial object, and then transmits to the client all low-quality encoded content of the base layer together with the high-quality encoded content of the spatial object determined based on the FOV.
  • the client receives high quality encoded content of the spatial object determined based on the FOV and all low quality encoded content of the base layer.
  • the high-quality encoded content of the spatial object is presented in the current FOV range; when the user switches the FOV, if the spatial object corresponding to the FOV before the switch does not completely cover the spatial object corresponding to the new FOV, the low-quality encoded content is decoded and rendered in the uncovered part, and the high-quality encoded content of the spatial object corresponding to the new FOV is obtained from the server in time.
  • it can be understood that, while the client is requesting the spatial object corresponding to the new FOV, part or all of the low-quality encoded content is first presented in the new FOV, which avoids the discomfort the user would otherwise feel while waiting for the high-quality encoded content of the spatial object corresponding to the new FOV.
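The prior-art fallback on an FOV switch can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the "base"/"enhancement" labels, and the use of letters as spatial-object names are all assumptions.

```python
from typing import Dict, List, Set


def choose_layers(new_fov_objects: Set[str],
                  cached_high_quality: Set[str]) -> Dict[str, str]:
    """Decide, per spatial object in the new FOV, which layer to render now.

    Objects whose high-quality content is already on the client use the
    enhancement layer; everything else falls back to the base layer.
    """
    plan = {}
    for obj in sorted(new_fov_objects):
        plan[obj] = "enhancement" if obj in cached_high_quality else "base"
    return plan


def objects_to_request(new_fov_objects: Set[str],
                       cached_high_quality: Set[str]) -> List[str]:
    """High-quality content that still has to be fetched from the server."""
    return sorted(new_fov_objects - cached_high_quality)


# The old FOV covered A and B; the new FOV covers B and C.
plan = choose_layers({"B", "C"}, {"A", "B"})
missing = objects_to_request({"B", "C"}, {"A", "B"})
```

Here `C` is rendered from the base layer until its high-quality content arrives, which is the behaviour the passage describes.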
  • a disadvantage of the prior art is that when the FOV of the user remains unchanged, the server sends the client not only the high-quality encoded content of the spatial object corresponding to the FOV but also the low-quality encoded content of that same spatial object, which not only wastes bandwidth but also causes redundant content on the client.
  • the embodiments of the invention disclose a video data processing method, related device, and system, which can save the transmission bandwidth between the server and the client and the storage space on the client.
  • FIG. 1 is a schematic diagram of an example of a framework for DASH standard transmission used in system layer video streaming media transmission.
  • the data transmission process of the system-layer video streaming media transmission scheme includes two processes: the process in which the server side (such as an HTTP server, hereinafter referred to as the server) generates media data for video content, and the process in which the client (such as an HTTP streaming media client) requests and obtains the media data from the server.
  • the media data includes a media presentation description (MPD).
  • the MPD on the server includes a plurality of representations (also called presentation or description layers, English: representation), each representation describing a plurality of fragments.
  • the HTTP streaming request control module of the client obtains the MPD sent by the server, analyzes the MPD, determines the information of each segment of the video code stream described in the MPD, determines the segment to be requested, and requests the corresponding segment from the server through an HTTP request; the received segment is then decoded and played by the media player.
  • the server prepares multiple versions of the code stream for the same video content. For example, for the video content of the same episode, the server generates a low-resolution, low-bit-rate, low-frame-rate code stream (such as 360p resolution, 300 kbps bit rate, 15 fps frame rate), a medium-resolution, medium-bit-rate, high-frame-rate code stream (such as 720p resolution, 1200 kbps bit rate, 25 fps frame rate), a high-resolution, high-bit-rate, high-frame-rate code stream (such as 1080p resolution, 3000 kbps bit rate, 25 fps frame rate), and so on.
  • each version of the code stream is called a representation in the DASH standard.
  • Representation is a collection and encapsulation of one or more codestreams in a transport format, one representation containing one or more segments.
  • the coding parameters, such as bit rate and resolution, of different versions of the code stream may differ; each code stream is divided into a plurality of small files, and each small file is called a segment.
  • the server prepares three representations for a movie: rep1 (representation 1), rep2 (representation 2), and rep3 (representation 3).
  • rep1 is a high-definition video with a bit rate of 4 Mbps (megabits per second)
  • rep2 is a standard-definition video with a bit rate of 2 Mbps
  • rep3 is a standard-definition video with a bit rate of 1 Mbps.
  • the segments shaded in FIG. 2 are the segment data requested by the client.
  • the first three segments requested by the client are segments of the representation rep3; at the fourth segment the client switches to rep2 and requests the fourth segment; it then switches to rep1 and requests the fifth and sixth segments, and so on.
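The switching pattern above can be reproduced with a simple rate-based rule. The sketch below is only one possible selection policy and is not taken from the source: the rule (highest bit rate not exceeding measured throughput) and the throughput samples are assumptions, while the three bit rates come from the rep1/rep2/rep3 example.

```python
# Bit rates of the three representations in the example, in bits per second.
REPRESENTATIONS = {"rep1": 4_000_000, "rep2": 2_000_000, "rep3": 1_000_000}


def pick_representation(throughput_bps: float) -> str:
    """Pick the highest-bit-rate representation the throughput can sustain.

    Falls back to the lowest-bit-rate representation when none fits.
    """
    affordable = {name: rate for name, rate in REPRESENTATIONS.items()
                  if rate <= throughput_bps}
    if not affordable:
        return min(REPRESENTATIONS, key=REPRESENTATIONS.get)
    return max(affordable, key=affordable.get)


# Hypothetical throughput measured before each of six segment requests.
measured = [1.2e6, 1.1e6, 1.5e6, 2.5e6, 5e6, 6e6]
schedule = [pick_representation(t) for t in measured]
```

With these samples the client requests three segments from rep3, one from rep2, and two from rep1, matching the sequence described for FIG. 2.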
  • The segments of each representation can be stored end to end in a single file, or each segment can be stored independently as a small file.
  • the segment may be packaged in accordance with the standard ISO/IEC 14496-12 (ISO BMFF (Base Media File Format)) or may be encapsulated in accordance with ISO/IEC 13818-1 (MPEG-2 TS).
  • the media presentation description is called MPD
  • the MPD can be an XML file.
  • the information in the file is described in a hierarchical manner. As shown in FIG. 3, the information of the upper level is completely inherited by the next level.
  • Some media metadata is described in this file, which allows the client to understand the media content information in the server and can use this information to construct the http-URL of the request segment.
  • media presentation: a collection of data that presents media content; media presentation description: a document that describes the media in a standardized way, used to provide streaming services; period: a set of consecutive periods constitutes the entire media presentation, and periods are continuous and non-overlapping; representation: a structured data set encapsulating one or more media content components (encoded individual media types, such as audio or video) with descriptive metadata, that is, a collection and encapsulation of one or more code streams in a transport format, where one representation contains one or more segments; adaptation set (AdaptationSet): a set of multiple interchangeable encoded versions of the same media content component, where one adaptation set contains one or more representations; subset: a combination of a set of adaptation sets, where the corresponding media content is obtained when the player plays all of the adaptation sets in it; Information, which is a media unit reference
  • FIG. 4 is a schematic diagram of a segment storage manner in the code stream data; and the other is that all segments on the same rep are stored.
  • FIG. 5 is another schematic diagram of a segment storage manner in the code stream data.
  • each segment of repA (representation A) is stored separately as a file, and each segment of repB (representation B) is also stored separately as a file.
  • the server may describe information such as the URL of each segment in the form of a template or a list in the MPD of the code stream.
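As an illustration of the template form, the sketch below expands a segment URL template into concrete request URLs. `$RepresentationID$` and `$Number$` are DASH template identifiers; the base URL, the template string, and the function names are hypothetical.

```python
from typing import List


def expand_template(template: str, representation_id: str, number: int) -> str:
    """Substitute the two DASH template identifiers used in this sketch."""
    return (template
            .replace("$RepresentationID$", representation_id)
            .replace("$Number$", str(number)))


def segment_urls(base_url: str, template: str, representation_id: str,
                 first: int, count: int) -> List[str]:
    """Build the HTTP URLs of `count` consecutive segments."""
    return [base_url + expand_template(template, representation_id, n)
            for n in range(first, first + count)]


# Hypothetical server layout for repA.
urls = segment_urls("http://example.com/video/",
                    "$RepresentationID$/seg-$Number$.m4s", "repA", 1, 2)
```

A list-based MPD would simply enumerate these URLs instead of deriving them.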
  • the server may use an index segment (that is, the SIDX in FIG. 5) in the MPD of the code stream to describe related information of each segment.
  • the index segment describes the byte offset of each segment in the file where it is stored, the size of each segment, and the duration of each segment.
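When all segments of a representation share one file, the sizes recorded in the index segment are enough to compute the byte range of each segment. The following is a minimal sketch under that assumption; `IndexEntry` and both helper functions are hypothetical names, and the real SIDX box layout is not parsed here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IndexEntry:
    size: int        # segment size in bytes
    duration: float  # segment duration in seconds


def byte_ranges(first_offset: int,
                entries: List[IndexEntry]) -> List[Tuple[int, int]]:
    """Return the inclusive (start, end) byte range of every segment."""
    ranges = []
    offset = first_offset
    for entry in entries:
        ranges.append((offset, offset + entry.size - 1))
        offset += entry.size
    return ranges


def range_header(byte_range: Tuple[int, int]) -> str:
    """Format a range as the value of an HTTP Range request header."""
    return f"bytes={byte_range[0]}-{byte_range[1]}"


ranges = byte_ranges(100, [IndexEntry(50, 2.0), IndexEntry(30, 2.0)])
```

The client can then fetch a single segment with an HTTP range request instead of downloading the whole file.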
  • An adaptive set (e.g., a first adaptive set, a second adaptive set, etc.) in an embodiment of the present invention is used to describe attributes of media data segments of a plurality of interchangeable encoded versions of the same media content component.
  • The representation in this embodiment is a collection and encapsulation of one or more code streams in a transport format.
  • the descriptor in the embodiment of the present invention is used to describe spatial information of a spatial object associated with it.
  • for the technical concepts related to MPEG-DASH in the present invention, refer to the relevant provisions of ISO/IEC 23009-1:2014, "Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats", or to the relevant provisions of earlier versions of the standard, such as ISO/IEC 23009-1:2013 or ISO/IEC 23009-1:2012.
  • the computer processes the data corresponding to the participant's actions, responds to the user's input in real time, and feeds the results back to the user's senses.
  • a sensing device is a three-dimensional interactive device. When VR video (or 360-degree video, or omnidirectional video) is presented on a headset or handheld device, only the video image and associated audio corresponding to the orientation of the user's head are presented.
  • the difference between VR video and normal video is that with normal video the entire video content is presented to the user, whereas with VR video only a subset of the entire video is presented to the user (in VR typically only a subset of the entire video region represented by the video pictures).
  • "a Spatial Object is defined as a spatial part of a content component (e.g. a region of interest, or a tile) and represented by either an Adaptation Set or a Sub-Representation."
  • The spatial relationship between spatial objects (Spatial Objects) is described in the MPD.
  • a spatial object is defined as a part of a content component, such as an existing region of interest (ROI) and tiles; spatial relationships can be described in Adaptation Set and Sub-Representation.
  • the existing DASH standard defines some descriptor elements in the MPD. Each descriptor element has two attributes, schemeIdUri and value. The schemeIdUri describes what the current descriptor is, and value is the parameter value of the descriptor.
  • SupplementalProperty and EssentialProperty (supplemental property descriptors and essential property descriptors).
  • schemeIdUri="urn:mpeg:dash:srd:2014"
  • the spatial information associated to the containing Spatial Object; the corresponding value lists a series of parameter values of the SRD. The syntax of the specific value is shown in the following table 0:
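As an illustration of how a client might read such a descriptor, the sketch below looks for SupplementalProperty and EssentialProperty elements carrying the SRD scheme and splits their value into named fields. The field order used here is an assumption based on the SRD scheme of ISO/IEC 23009-1 and does not reproduce table 0; the sample MPD fragment and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

MPD_NS = "urn:mpeg:dash:schema:mpd:2011"
SRD_SCHEME = "urn:mpeg:dash:srd:2014"
# Assumed field order of the comma-separated SRD value.
FIELDS = ["source_id", "object_x", "object_y", "object_width",
          "object_height", "total_width", "total_height", "spatial_set_id"]


def parse_srd_value(value: str) -> Dict[str, int]:
    """Split an SRD value string into named integer fields."""
    numbers = [int(part) for part in value.split(",") if part.strip()]
    return dict(zip(FIELDS, numbers))


def find_srd(mpd_xml: str) -> List[Dict[str, int]]:
    """Collect the SRD parameters of every SRD descriptor in an MPD."""
    root = ET.fromstring(mpd_xml)
    results = []
    for tag in ("SupplementalProperty", "EssentialProperty"):
        for element in root.iter(f"{{{MPD_NS}}}{tag}"):
            if element.get("schemeIdUri") == SRD_SCHEME:
                results.append(parse_srd_value(element.get("value", "")))
    return results


SAMPLE = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"><Period><AdaptationSet>
<SupplementalProperty schemeIdUri="urn:mpeg:dash:srd:2014"
                      value="0,0,0,1920,1080,3840,2160"/>
</AdaptationSet></Period></MPD>"""
srd = find_srd(SAMPLE)
```

In this sample the spatial object is the top-left 1920x1080 region of a 3840x2160 reference space.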
  • the server may divide the space within the 360-degree viewing range to obtain a plurality of spatial objects, each spatial object corresponding to one sub-view of the user, and the sub-views are spliced together to form a complete human-eye viewing angle.
  • the dynamically changing viewing angle of the human eye is typically 120 degrees × 120 degrees.
  • the spatial object 1 and the spatial object 2 shown in FIG. 6 are spatial objects corresponding to two different viewing angles of the user.
  • the server may prepare a set of video code streams for each spatial object.
  • the server may obtain encoding configuration parameters of each code stream in the video, and generate a code stream corresponding to each spatial object of the video according to the encoding configuration parameters of the code stream.
  • when outputting video, the client may request the video code stream segment corresponding to a certain viewing angle for a certain time period and output it to the spatial object corresponding to that viewing angle.
  • if the client outputs, within the same time period, the video code stream segments corresponding to all viewing angles within the 360-degree range, the complete video image for that time period is output across the entire 360-degree space.
  • the server may first map the spherical surface into a plane, and divide the space on the plane. Specifically, the server may map the spherical surface into a latitude and longitude plan by using a latitude and longitude mapping manner.
  • FIG. 7 is a schematic diagram of a spatial object according to an embodiment of the present invention. The server can map the spherical surface into a latitude and longitude plan, and divide the latitude and longitude plan into a plurality of spatial objects such as A to I.
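The latitude and longitude mapping can be illustrated by locating the spatial object that contains a given viewing direction. The sketch assumes that objects A to I form a 3 x 3 grid laid out row by row on the latitude and longitude plan; the grid shape, the angle conventions, and the function name are assumptions, since FIG. 7 is not reproduced here.

```python
import string


def tile_for_direction(yaw_deg: float, pitch_deg: float,
                       cols: int = 3, rows: int = 3) -> str:
    """Return the letter of the spatial object containing a view direction.

    yaw_deg is the longitude in [-180, 180) with 0 at the picture centre;
    pitch_deg is the latitude in [-90, 90] with +90 at the top edge.
    """
    u = (yaw_deg + 180.0) % 360.0 / 360.0  # 0..1, left to right
    v = (90.0 - pitch_deg) / 180.0         # 0..1, top to bottom
    col = min(int(u * cols), cols - 1)
    row = min(int(v * rows), rows - 1)
    return string.ascii_uppercase[row * cols + col]


centre = tile_for_direction(0.0, 0.0)
top_left = tile_for_direction(-180.0, 90.0)
```

Under these assumptions a user looking straight ahead is inside object E, the centre of the grid.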
  • the server may also map the spherical surface into a cube, expand the plurality of faces of the cube to obtain a plan view, or map the spherical surface to other polyhedrons, and expand the plurality of faces of the polyhedron to obtain a plan view or the like.
  • the server can also map the sphere to a plane by using more mapping methods. It can be determined according to the actual application scenario requirements, and there is no restriction here. The following will be described in conjunction with FIG. 7 in a latitude and longitude mapping manner.
  • a set of DASH code streams can be prepared for each spatial object by the server.
  • Each spatial object corresponds to one sub-view, and a set of DASH code streams corresponding to each spatial object is a view code stream of each sub-view.
  • the spatial information of the spatial objects associated with each image in a view code stream is the same, whereby the view code stream can be set as a static view code stream.
  • the view code stream of each sub-view is part of the entire video stream, and the view code streams of all sub-views constitute a complete video stream.
  • the DASH code stream corresponding to the corresponding spatial object may be selected for playing according to the viewing angle currently viewed by the user.
  • the client may determine the DASH code stream corresponding to the switched target space object according to the new perspective selected by the user.
  • an embodiment of the present invention provides a data processing method, including: receiving a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and acquiring the view code stream and the complementary code stream according to the complementary identifier.
  • that the complementary identifier indicates that the view code stream and the complementary code stream are described in the media presentation description may be understood as: the complementary identifier identifies complementary code streams, and the complementary code streams include the view code stream and the complementary code stream.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
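The relationship between the two code streams can be stated as a set property: the spatial objects of the view code stream and those of the complementary code stream together cover the target picture and do not overlap. The sketch below checks that property; the object names A to I and both function names are illustrative assumptions.

```python
from typing import Set


def is_complementary(target: Set[str], first: Set[str],
                     second: Set[str]) -> bool:
    """True if the two object sets cover the target picture without overlap."""
    return (first | second) == target and not (first & second)


def complementary_objects(target: Set[str], view_objects: Set[str]) -> Set[str]:
    """Spatial objects the complementary code stream would have to encode."""
    return target - view_objects


target_picture = set("ABCDEFGHI")
view = {"E", "F"}
rest = complementary_objects(target_picture, view)
ok = is_complementary(target_picture, view, rest)
```

By contrast, a base layer covering all of A to I would overlap the view objects, which is the redundancy the embodiment is meant to avoid.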
  • an embodiment of the present invention provides a data processing method, including: receiving a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, where the first descriptor includes a first complementary identifier, The second descriptor includes a second complementary identifier, where the value of the first complementary identifier is equal to a preset first value, and is used to identify that the code stream described by the first descriptor is a complementary code stream, and the second complementary identifier The value is equal to the preset second value, and is used to identify the code stream described by the second descriptor as a view code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture.
  • the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and acquiring the complementary code stream according to the first complementary identifier and acquiring the view code stream according to the second complementary identifier.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • the server indicates the view code stream and the complementary code stream by using the complementary identifier in the MPD; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, there is almost no overlapping content between the view code stream and the complementary code stream, which saves the transmission bandwidth between the server and the client and the storage space on the client.
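A client-side reading of the two-descriptor variant might look like the sketch below. The text quoted here does not give the descriptor syntax, so the scheme URI `urn:example:complementary`, the preset values "1" (complementary) and "0" (view), the use of SupplementalProperty, and the sample MPD are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

MPD_NS = "urn:mpeg:dash:schema:mpd:2011"
COMPLEMENTARY_SCHEME = "urn:example:complementary"  # hypothetical scheme URI
KIND_BY_VALUE = {"1": "complementary", "0": "view"}  # hypothetical preset values


def classify_streams(mpd_xml: str) -> Dict[str, List[str]]:
    """Group representation ids by the complementary identifier of their set."""
    root = ET.fromstring(mpd_xml)
    found: Dict[str, List[str]] = {"view": [], "complementary": []}
    for aset in root.iter(f"{{{MPD_NS}}}AdaptationSet"):
        for prop in aset.findall(f"{{{MPD_NS}}}SupplementalProperty"):
            if prop.get("schemeIdUri") != COMPLEMENTARY_SCHEME:
                continue
            kind = KIND_BY_VALUE.get(prop.get("value"))
            if kind:
                found[kind].extend(
                    rep.get("id")
                    for rep in aset.iter(f"{{{MPD_NS}}}Representation"))
    return found


SAMPLE = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"><Period>
<AdaptationSet>
  <SupplementalProperty schemeIdUri="urn:example:complementary" value="0"/>
  <Representation id="view-hq"/>
</AdaptationSet>
<AdaptationSet>
  <SupplementalProperty schemeIdUri="urn:example:complementary" value="1"/>
  <Representation id="rest-lq"/>
</AdaptationSet>
</Period></MPD>"""
streams = classify_streams(SAMPLE)
```

The client would then request the segments of both groups and compose them into the target picture.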
  • an embodiment of the present invention provides a data processing method, where the method includes: generating a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and sending the media presentation description to the client, so that the client obtains the view code stream and the complementary code stream according to the complementary identifier.
  • that the complementary identifier indicates that the view code stream and the complementary code stream are described in the media presentation description may be understood as: the complementary identifier identifies complementary code streams, and the complementary code streams include the view code stream and the complementary code stream.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • an embodiment of the present invention provides a data processing method, including: generating a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, where the first descriptor includes a first complementary identifier, The second descriptor includes a second complementary identifier; the value of the first complementary identifier is equal to a preset first value, and is used to identify that the code stream described by the first descriptor is a complementary code stream, and the second complementary identifier The value is equal to the preset second value, and is used to identify the code stream described by the second descriptor as a view code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture.
  • the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and sending the media presentation description to the client, so that the client acquires the complementary code stream according to the first complementary identifier and acquires the view code stream according to the second complementary identifier.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • the server indicates the view code stream and the complementary code stream by using the complementary identifier in the MPD; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, there is almost no overlapping content between the view code stream and the complementary code stream, which saves the transmission bandwidth between the server and the client and the storage space on the client.
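On the server side, generating such a media presentation description could be sketched as below. As the text quoted here does not specify the descriptor syntax, the scheme URI `urn:example:complementary`, the preset values "0" (view) and "1" (complementary), the element layout, and the function name are all hypothetical.

```python
import xml.etree.ElementTree as ET

MPD_NS = "urn:mpeg:dash:schema:mpd:2011"
COMPLEMENTARY_SCHEME = "urn:example:complementary"  # hypothetical scheme URI


def build_mpd(view_rep_id: str, complementary_rep_id: str) -> str:
    """Build a minimal MPD with one descriptor per code stream."""
    ET.register_namespace("", MPD_NS)
    mpd = ET.Element(f"{{{MPD_NS}}}MPD")
    period = ET.SubElement(mpd, f"{{{MPD_NS}}}Period")
    # Hypothetical preset values: "0" marks the view code stream,
    # "1" marks the complementary code stream.
    for rep_id, value in ((view_rep_id, "0"), (complementary_rep_id, "1")):
        aset = ET.SubElement(period, f"{{{MPD_NS}}}AdaptationSet")
        ET.SubElement(aset, f"{{{MPD_NS}}}SupplementalProperty",
                      schemeIdUri=COMPLEMENTARY_SCHEME, value=value)
        ET.SubElement(aset, f"{{{MPD_NS}}}Representation", id=rep_id)
    return ET.tostring(mpd, encoding="unicode")


mpd_text = build_mpd("view-hq", "rest-lq")
```

A real MPD would also carry segment information and, where applicable, the spatial information of each object; those parts are omitted to keep the sketch short.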
  • an embodiment of the present invention provides a client, where the client includes a receiving unit and an acquiring unit; the receiving unit is configured to receive a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; the acquiring unit is configured to acquire the view code stream and the complementary code stream according to the complementary identifier.
  • that the complementary identifier indicates that the view code stream and the complementary code stream are described in the media presentation description may be understood as: the complementary identifier identifies complementary code streams, and the complementary code streams include the view code stream and the complementary code stream.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • an embodiment of the present invention provides a client, where the client includes a receiving unit and an acquiring unit; the receiving unit is configured to receive a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, the first descriptor includes a first complementary identifier, and the second descriptor includes a second complementary identifier; the value of the first complementary identifier is equal to a preset first value and identifies the code stream described by the first descriptor as a complementary code stream, and the value of the second complementary identifier is equal to a preset second value and identifies the code stream described by the second descriptor as a view code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; the acquiring unit is configured to acquire the complementary code stream according to the first complementary identifier and acquire the view code stream according to the second complementary identifier.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • the server indicates the view code stream and the complementary code stream by using the complementary identifier in the MPD; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, there is almost no overlapping content between the view code stream and the complementary code stream, which saves the transmission bandwidth between the server and the client and the storage space on the client.
  • an embodiment of the present invention provides a server, where the server includes a generating unit and a sending unit. The generating unit is configured to generate a media presentation description, where the media presentation description includes a complementary identifier to indicate that a view code stream and a complementary code stream are described in the media presentation description; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object. The sending unit is configured to send the media presentation description to the client, so that the client obtains the view code stream and the complementary code stream according to the complementary identifier.
  • that the complementary identifier indicates that the view code stream and the complementary code stream are described in the media presentation description may specifically mean: the complementary identifier is used to identify mutually complementary code streams, and these code streams include the view code stream and the complementary code stream.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as follows: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • an embodiment of the present invention provides a server, where the server includes a generating unit and a sending unit. The generating unit is configured to generate a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, the first descriptor includes a first complementary identifier, and the second descriptor includes a second complementary identifier. The value of the first complementary identifier is equal to a preset first value and is used to identify the code stream described by the first descriptor as a complementary code stream; the value of the second complementary identifier is equal to a preset second value and is used to identify the code stream described by the second descriptor as a view code stream. The view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object.
  • the server indicates the view code stream and the complementary code stream by using the complementary identifier in the MPD; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, there is almost no overlapping content between the view code stream and the complementary code stream, which saves transmission bandwidth between the server and the client as well as storage space on the client.
  • an embodiment of the present invention provides a client, where the client includes a processor, a memory, and an input component, where the memory is used to store programs and data, and the processor calls a program in the memory to perform the following operations: receiving a media presentation description through the input component, where the media presentation description includes a complementary identifier to indicate that a view code stream and a complementary code stream are described in the media presentation description, the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and obtaining the view code stream and the complementary code stream according to the complementary identifier.
  • that the complementary identifier indicates that the view code stream and the complementary code stream are described in the media presentation description may specifically mean: the complementary identifier is used to identify mutually complementary code streams, and these code streams include the view code stream and the complementary code stream.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as follows: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • an embodiment of the present invention provides a client, where the client includes a processor, a memory, and an input component, where the memory is used to store programs and data, and the processor calls a program in the memory to perform the following operations.
  • the code stream described by the second descriptor is the view code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, and the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as follows: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • the server indicates the view code stream and the complementary code stream by using the complementary identifier in the MPD; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, there is almost no overlapping content between the view code stream and the complementary code stream, which saves transmission bandwidth between the server and the client as well as storage space on the client.
  • an embodiment of the present invention provides a server, where the server includes a processor, a memory, and an output component, where the memory is used to store programs and data, and the processor calls a program in the memory to perform the following operations: generating a media presentation description, where the media presentation description includes a complementary identifier to indicate that a view code stream and a complementary code stream are described in the media presentation description, the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and sending the media presentation description to the client through the output component, so that the client obtains the view code stream and the complementary code stream according to the complementary identifier.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as follows: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • an embodiment of the present invention provides a server, where the server includes a processor, a memory, and an output component, where the memory is used to store programs and data, and the processor calls a program in the memory to perform the following operations: generating a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, the first descriptor includes a first complementary identifier, and the second descriptor includes a second complementary identifier; the value of the first complementary identifier is equal to a preset first value and is used to identify the code stream described by the first descriptor as a complementary code stream, and the value of the second complementary identifier is equal to a preset second value and is used to identify the code stream described by the second descriptor as a view code stream; the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and sending the media presentation description to the client through the output component, so that the client obtains the complementary code stream according to the first complementary identifier and obtains the view code stream according to the second complementary identifier.
  • that the target picture includes the content of the first spatial object and the content of the second spatial object may be understood as follows: the target picture is composed of the content of the first spatial object and the content of the second spatial object.
  • the server indicates the view code stream and the complementary code stream by using the complementary identifier in the MPD; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, there is almost no overlapping content between the view code stream and the complementary code stream, which saves transmission bandwidth between the server and the client as well as storage space on the client.
  • the media presentation description includes a first adaptation set, and the information of one representation in the first adaptation set includes the complementary identifier, which identifies the code stream described by the information of that representation as the complementary code stream.
  • the value of the complementary identifier is the value of the representation identifier (Representation ID) in the information of another representation in the media presentation description, and identifies the code stream described by the information of that other representation as the view code stream.
  • the media presentation description includes a second adaptation set, and the second adaptation set includes the complementary identifier to indicate that the second adaptation set contains the information of a representation that describes the complementary code stream.
  • the value of the complementary identifier is the value of the adaptation set identifier (AdaptationSet ID) of a third adaptation set in the media presentation description, and identifies the code stream described by the information of a representation in the third adaptation set as the view code stream.
  • an embodiment of the present invention provides a data processing system, where the system includes a client and a server, where:
  • the client is the client described in any of the possible implementations of the fifth aspect, or the client described in any of the possible implementations of the sixth aspect, or the client described in any of the possible implementations of the ninth aspect, or the client described in any of the possible implementations of the tenth aspect;
  • the server is the server described in any of the possible implementations of the seventh aspect, or any of the eighth aspects
  • the server indicates the view code stream and the complementary code stream by using the complementary identifier in the MPD; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, there is almost no overlapping content between the view code stream and the complementary code stream, which saves transmission bandwidth between the server and the client as well as storage space on the client.
  • FIG. 1 is a schematic diagram of an example of a framework for DASH standard transmission used in system layer video streaming media transmission
  • FIG. 2 is a schematic diagram of a video file that is encoded into a code stream of multiple code rates according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a scenario of segmentation description of an MPD file according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a scenario of segmentation storage of code stream data according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of another scenario of segmentation storage of code stream data according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a scene of a spatial object according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of another scenario of a spatial object according to an embodiment of the present invention.
  • FIG. 8 is a schematic flowchart of a data processing method according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a client according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of still another client according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of still another server according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of a data processing system according to an embodiment of the present invention.
  • the user can switch the viewing angle by moving the eyes or the head, or by switching the screen of the video viewing device. Suppose the viewing angle accordingly switches from spatial object 1 to spatial object 2. The server then needs to send the high-quality encoded content of spatial object 2 to the client, and while that content is being transmitted the client first presents the low-quality encoded content of spatial object 2. Once the client receives the high-quality encoded content of spatial object 2, it displays that content and no longer displays the low-quality encoded content of spatial object 2; in other words, the low-quality encoded content serves only as a transition, sparing the user the discomfort of not seeing the content of spatial object 2 in time.
  • if the user's field of view (FOV) then stays on spatial object 2 for a relatively long time, the low-quality encoded content of spatial object 2 that the server sends to the client during that time is not used. This low-quality encoded content wastes transmission bandwidth and leaves more redundant data on the client.
  • an embodiment of the present invention provides the following method.
  • FIG. 8 is a schematic flowchart diagram of a video data processing method according to an embodiment of the present invention, where the method includes but is not limited to the following steps.
  • Step S801 The server generates a media presentation description MPD.
  • the process of generating the MPD includes establishing a correspondence between the view code stream and the complementary code stream, and configuring a complementary identifier in the MPD to reflect that correspondence (or configuring a first complementary identifier and a second complementary identifier to reflect it, as in the third example).
  • the view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, and the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture. The target picture is composed of the content of the first spatial object and the content of the second spatial object (the first spatial object and the second spatial object may be said to be "complementary" in the target picture).
  • the target picture is a picture (or frame) in a video source (for example, a TV show or a movie) provided by the server.
  • the first spatial object and the second spatial object are both spatial objects, where a spatial object is defined as a part of a content component and is used to describe spatial relationships; for example, an existing region of interest (ROI) or tile is a part of a piece of content (such as a picture). The information of a spatial object may be described in an adaptation set (AdaptationSet), in the information of a representation (Representation), in a sub-representation (Sub-Representation), in a descriptor, and so on.
  • the content of a picture needs to be encoded with reference to preset coding parameters, which typically define information such as resolution, compression rate, and code rate. Different coding parameters produce different coding effects; for example, the higher the resolution and the code rate, the clearer the picture.
  • the content of the first spatial object and the content of the second spatial object may be encoded with reference to different coding parameters, so that the view code stream and the complementary code stream present different display effects.
  • the embodiment of the present invention pre-establishes the correspondence between the view code stream and the complementary code stream to indicate that the two are complementary, so that one can be found from the other according to the correspondence.
  • the rule for determining the view code stream is not limited herein. Once the first spatial object is determined, the view code stream is determined; for example, the spatial object on which the user's field of view (FOV) falls may be taken as the first spatial object, which in turn determines the view code stream.
  • when the view code stream and the complementary code stream are obtained by encoding according to coding parameters, the view code stream may be encoded for higher clarity. It should be noted that if the user's FOV changes, the view code stream determined from the changed FOV also changes, and the newly determined view code stream corresponds to a new complementary code stream.
  • as shown in the figure, the complete space of the target picture is the space formed by spatial objects A to I. When the first spatial object corresponding to the view code stream is spatial object A, the second spatial object corresponding to the complementary code stream is the space formed by splicing spatial objects B to I; when the first spatial object corresponding to the view code stream is spatial object E, the second spatial object corresponding to the complementary code stream is the space formed by splicing spatial objects A, B, C, D, F, G, H, and I; and so on. The content of a spatial object such as spatial object A and the content of its complementary spatial object may not overlap at all, or may partially overlap.
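The relationship between a first spatial object and its complementary second spatial object can be illustrated with a short sketch. It assumes the strictly non-overlapping case and treats the target picture as exactly the nine spatial objects A to I; the function name `complementary_objects` is hypothetical and is not taken from the filing.

```python
# Spatial objects A..I tile the target picture (names taken from the text).
ALL_OBJECTS = list("ABCDEFGHI")


def complementary_objects(view_objects):
    """Return the spatial objects covered by the complementary code stream.

    Assumption: the view and its complement do not overlap at all, so the
    complement is simply every spatial object that is not part of the view.
    """
    view = set(view_objects)
    unknown = view - set(ALL_OBJECTS)
    if unknown:
        raise ValueError(f"unknown spatial objects: {sorted(unknown)}")
    return [name for name in ALL_OBJECTS if name not in view]


# View on A: the complement is spliced from B to I.
print(complementary_objects(["A"]))
# View on E: the complement is spliced from every object except E.
print(complementary_objects(["E"]))
```

When the user's FOV moves, calling the function again with the new first spatial object gives the new complement, which mirrors the statement that a re-determined view code stream corresponds to a new complementary code stream.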
  • the target image is used to display a certain scene in 360 degrees, and the outline of the target image may be a regular shape or an irregular shape.
  • the following example shows how to represent the correspondence between the view code stream and the complementary code stream through the complementary identifier.
  • a complementary identifier, ComplementaryId, is added to the MPD to mark the representation ID of the view code stream.
  • ComplementaryId is described in Table 1 below, and its use is then illustrated with relevant code.
  • the media presentation description includes an adaptation set (AdaptationSet).
  • the adaptation set in the first example may be referred to as a first adaptation set. The information of one representation in the first adaptation set includes the complementary identifier, which identifies the code stream described by the information of that representation as the complementary code stream.
  • the value of the complementary identifier is the value of the representation identifier (Representation ID) in the information of another representation in the media presentation description, and identifies the code stream described by the information of that other representation as the view code stream.
  • for example, if the information of the representation that describes code stream A contains ComplementaryId, code stream A is considered to be a complementary code stream that has a corresponding view code stream; if the value of the Representation ID in the information of the representation that describes code stream B is equal to the value of that ComplementaryId, code stream B is considered to be that view code stream.
  • An example of an MPD is provided below.
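  • the presence of ComplementaryId in the information of the representation indicates that the code stream video-3.mp4 is a complementary code stream and that video-3.mp4 has a corresponding view code stream; since the value of the Representation ID in the information of the representation that describes the code stream video-2.mp4 is equal to the value of ComplementaryId, video-2.mp4 is that view code stream.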
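As an illustration of the representation-level scheme, the following is a minimal sketch of how a client might pair the two code streams. It is not the listing from the filing: the MPD is a simplified, namespace-free stand-in, and the attribute spelling `complementaryId`, the representation ids "2" and "3", and the function name `find_stream_pairs` are all assumptions made for the example. Only the file names video-2.mp4 and video-3.mp4 come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified MPD (no DASH namespace). The attribute name
# "complementaryId" and the representation ids are assumptions.
MPD_XML = """
<MPD>
  <Period>
    <AdaptationSet>
      <Representation id="2">
        <BaseURL>video-2.mp4</BaseURL>
      </Representation>
      <Representation id="3" complementaryId="2">
        <BaseURL>video-3.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""


def find_stream_pairs(mpd_text):
    """Return (view_url, complementary_url) pairs found via complementaryId."""
    root = ET.fromstring(mpd_text)
    by_id = {rep.get("id"): rep for rep in root.iter("Representation")}
    pairs = []
    for rep in by_id.values():
        target = rep.get("complementaryId")
        if target is None:
            continue  # this representation is not a complementary code stream
        view = by_id.get(target)
        if view is None:
            continue  # no representation carries the referenced id
        pairs.append((view.findtext("BaseURL"), rep.findtext("BaseURL")))
    return pairs


print(find_stream_pairs(MPD_XML))
```

The lookup table keyed by Representation ID reflects the rule in the text: the representation that carries the identifier describes the complementary code stream, and the representation whose ID equals the identifier's value describes the view code stream.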
  • a complementary identifier ComplementaryId describing the adaptation set of the complementary stream is added to the MPD.
  • ComplementaryId is described in Table 2 below, and its use is then illustrated with relevant code.
  • the media presentation description includes an adaptation set (AdaptationSet).
  • the adaptation set in the second example may be referred to as a second adaptation set. The second adaptation set includes the complementary identifier ComplementaryId to indicate that the second adaptation set contains the information of a representation that describes the complementary code stream.
  • the value of ComplementaryId is the value of the adaptation set identifier (AdaptationSet ID) of a third adaptation set in the media presentation description, and identifies the code stream described by the information of a representation in the third adaptation set as the view code stream.
  • the third adaptation set is an adaptation set different from the second adaptation set.
  • for example, if adaptation set A contains ComplementaryId, the code stream described by the information of a representation in adaptation set A is a complementary code stream, and that complementary code stream has a corresponding view code stream; if the value of the AdaptationSet ID of adaptation set B is equal to the value of ComplementaryId in adaptation set A, the code stream described by the information of a representation in adaptation set B is the view code stream.
  • An example of an MPD is provided below.
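The adaptation-set-level scheme can be sketched in the same way. The snippet below is only an illustrative sketch, not the listing from the filing: the MPD is a simplified, namespace-free stand-in, and the attribute spelling `complementaryId`, the ids, the file names, and the function name `classify_adaptation_sets` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified MPD (no DASH namespace). Attribute name, ids and
# file names are assumptions made for this sketch.
MPD_XML_SETS = """
<MPD>
  <Period>
    <AdaptationSet id="1">
      <Representation id="view"><BaseURL>view.mp4</BaseURL></Representation>
    </AdaptationSet>
    <AdaptationSet id="2" complementaryId="1">
      <Representation id="rest"><BaseURL>rest.mp4</BaseURL></Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""


def classify_adaptation_sets(mpd_text):
    """Map AdaptationSet id -> "view" or "complementary" using complementaryId."""
    root = ET.fromstring(mpd_text)
    sets = {aset.get("id"): aset for aset in root.iter("AdaptationSet")}
    roles = {}
    for set_id, aset in sets.items():
        target = aset.get("complementaryId")
        # The referenced set must exist and must differ from the referencing one.
        if target is not None and target in sets and target != set_id:
            roles[set_id] = "complementary"
            roles[target] = "view"
    return roles


print(classify_adaptation_sets(MPD_XML_SETS))
```

The check `target != set_id` encodes the statement that the third adaptation set is different from the second adaptation set.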
  • the sub-specified object is a complementary code stream. ComplementaryId1 and ComplementaryId2 are described in Tables 3 and 4 below, and their use is then illustrated with relevant code.
  • the media presentation description includes two descriptors, one of which may be referred to as a first descriptor, and the other descriptor is a second descriptor, and the first descriptor includes a first complementary identifier.
  • the second descriptor includes a second complementary identifier, where the value of the first complementary identifier is equal to a preset first value, to indicate that the code stream described by the first descriptor is the complementary code stream, and the second complementary identifier The value is equal to the preset second value to indicate that the code stream described by the second descriptor is the view code stream.
  • the first descriptor and the second descriptor are respectively descriptors in two different adaptive sets. The first value and the second value are two pre-configured values that are distinguishable from each other.
  • the descriptor in the MPD can be used to define the spatial object in the video stream.
  • the coordinate of the spatial object described by value is (0,0); the fourth and fifth values of value represent the length and width of the spatial object, here (1920, 1080); the sixth and seventh values of value represent the reference space of the spatial object, here (3840, 2160); the eighth value of value is the spatial object group identifier, here 2.
  • an attribute is added to the value, and the position of the added attribute in the value is not limited herein.
  • the newly added attribute in the first descriptor may be referred to as a first complementary identifier, and the newly added attribute in the second descriptor may be referred to as a second complementary identifier.
  • the value of the first complementary identifier is equal to the first value (e.g., 0) to indicate that the content of the spatial object described by the first descriptor is the view code stream, and that the spatial object described by value is the area represented by the spatial coordinates in the reference space.
  • the value of the second complementary identifier is equal to the second value (e.g., 1) to indicate that the content of the spatial object described by the second descriptor is the complementary code stream, and that the spatial object described by value is the portion of the reference space other than the area represented by the spatial coordinates.
  • a program code for a specific implementation is provided below.
  • the spatial region (1920, 1080) in the reference space (3840, 2160) and the spatial region (960, 540) in the reference space (1920, 1080) are the same spatial region; therefore, the spatial object described by the first descriptor is the first spatial object, and the spatial object described by the second descriptor is the second spatial object.
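The claim that the two descriptors refer to the same region can be checked by normalizing each region against its own reference space. The following is a minimal sketch under several assumptions: the descriptor value is taken to be nine comma-separated integers, with the added complementary identifier placed last (the text leaves its position open); the first field is named `source_id` and set to 1 purely for illustration; and the names `SpatialDescriptor`, `parse_value`, and `normalized_region` are hypothetical.

```python
from fractions import Fraction
from typing import NamedTuple


class SpatialDescriptor(NamedTuple):
    source_id: int           # assumed meaning of the first field
    x: int                   # coordinate of the spatial object
    y: int
    width: int               # length and width of the spatial object
    height: int
    ref_width: int           # reference space
    ref_height: int
    group_id: int            # spatial object group identifier
    complementary_flag: int  # added attribute; last position is an assumption


def parse_value(value):
    """Parse a comma-separated descriptor value into named fields."""
    fields = [int(part) for part in value.split(",")]
    if len(fields) != 9:
        raise ValueError("expected 9 comma-separated integers")
    return SpatialDescriptor(*fields)


def normalized_region(d):
    """Express the region as exact fractions of its reference space."""
    return (
        Fraction(d.x, d.ref_width),
        Fraction(d.y, d.ref_height),
        Fraction(d.width, d.ref_width),
        Fraction(d.height, d.ref_height),
    )


first = parse_value("1,0,0,1920,1080,3840,2160,2,0")
second = parse_value("1,0,0,960,540,1920,1080,2,1")

# Both regions cover the same quarter of their reference spaces.
print(normalized_region(first) == normalized_region(second))
```

With a flag of 0 the region itself would be the described spatial object, and with a flag of 1 the described spatial object would be everything in the reference space outside that region, which is how the two descriptors end up describing complementary parts of one picture.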
  • Step S802 The server sends the media presentation description MPD generated above to the client.
  • Step S803 The client receives the MPD.
  • Step S804 The client acquires the complementary identifier in the MPD, so as to determine the view code stream and the complementary code stream according to the complementary identifier (or parse the first complementary identifier and the second complementary identifier, and according to the first complementary identifier and the first The two complementary identifiers determine the view code stream and the complementary code stream).
  • the way the client parses the MPD depends on the way the server generated it. The following describes how the client parses the MPD for the first example, the second example, and the third example in turn.
  • for the first example, after receiving the MPD the client obtains the first adaptation set in the MPD and analyzes the information of the representations in the first adaptation set. When the information of a certain representation includes the complementary identifier ComplementaryId, the information of that representation describes the complementary code stream, and the complementary code stream has a corresponding view code stream. If there is another representation (Representation) whose Representation ID is equal to the value of ComplementaryId, the code stream described by that other representation is the view code stream.
  • for the second example, after receiving the MPD the client obtains the second adaptation set in the MPD. If the second adaptation set includes the complementary identifier ComplementaryId, the code stream described by the information of a representation in the second adaptation set is a complementary code stream, and the complementary code stream has a corresponding view code stream. If the value of the adaptation set identifier (AdaptationSet ID) of some adaptation set is equal to the value of the complementary identifier, the code stream described by the information of a representation in that adaptation set is the view code stream.
  • for the third example, after receiving the MPD the client obtains the descriptor values (value) in the MPD. If two values in the MPD satisfy a preset relationship, the content of the spatial object described by one value is determined to be the view code stream, and the content of the spatial object described by the other value is determined to be the complementary code stream.
  • the preset relationship is: one of the two values carries a first complementary identifier ComplementaryId1 and the other value carries a second complementary identifier ComplementaryId2, where the value of the first complementary identifier is the first value and the value of the second complementary identifier is the second value; the spatial object described by the one value is the first spatial object, and the spatial object described by the other value is the second spatial object.
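The preset relationship amounts to a small decision rule, sketched below. The concrete values 0 and 1 follow the "e.g." values mentioned for the first and second values; the constant names and the function `classify_descriptors` are hypothetical, and the sketch assumes the identifiers have already been extracted from the two descriptor values.

```python
FIRST_VALUE = 0   # assumed preset first value (marks the view code stream here)
SECOND_VALUE = 1  # assumed preset second value (marks the complementary code stream)


def classify_descriptors(flag_a, flag_b):
    """Return the roles of two descriptors, or None if the relationship fails.

    The preset relationship holds only when one descriptor carries the first
    value and the other carries the second value.
    """
    if flag_a == FIRST_VALUE and flag_b == SECOND_VALUE:
        return ("view", "complementary")
    if flag_a == SECOND_VALUE and flag_b == FIRST_VALUE:
        return ("complementary", "view")
    return None


print(classify_descriptors(0, 1))
print(classify_descriptors(1, 1))
```

Returning None when both descriptors carry the same value keeps the client from pairing two streams that were never declared complementary.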
  • Step S805 The client requests the view code stream and the complementary code stream from the server.
  • the MPD may carry a network storage address of the view code stream and a network storage address of the complementary code stream, where a network storage address may be expressed by means of a Uniform Resource Locator (URL), an offset, or the like.
  • Step S806 The server receives the request and sends the view code stream and the complementary code stream to the client according to the request.
  • Step S807 The client receives the view code stream and the complementary code stream, decodes the view code stream and the complementary code stream, and presents the same through a display screen.
  • the server indicates the view code stream and the complementary code stream in the MPD by using the complementary identifier; correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, and then requests the view code stream and the complementary code stream from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server and the client as well as storage space on the client.
  • FIG. 9 is a schematic structural diagram of a client 90 according to an embodiment of the present invention.
  • The client 90 may include a receiving unit 901 and an obtaining unit 902.
  • Each unit is described in detail as follows.
  • The receiving unit 901 is configured to receive a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream; the view code stream is a code stream obtained by encoding the content of a first spatial object of a target picture, the complementary code stream is a code stream obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object.
  • The obtaining unit 902 is configured to obtain the view code stream and the complementary code stream according to the complementary identifier.
  • The server indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client 90 determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server and the client 90 and storage space on the client 90.
  • The media presentation description includes a first adaptation set, and the information of one representation in the first adaptation set includes the complementary identifier, which identifies the code stream described by the information of that representation as the complementary code stream.
  • The value of the complementary identifier is the value of the representation identifier (representation ID) of the information of another representation in the media presentation description, which identifies the code stream described by the information of that other representation as the view code stream.
  • The media presentation description includes a second adaptation set, and the second adaptation set includes the complementary identifier to indicate that the second adaptation set includes the information of a representation that describes the complementary code stream.
  • The value of the complementary identifier is the value of the adaptation set identifier (adaptationSet ID) of a third adaptation set in the media presentation description, which identifies the code stream described by the information of a representation in the third adaptation set as the view code stream.
  • The receiving unit 901 and the obtaining unit 902 included in the client 90 may alternatively be described as follows:
  • The receiving unit 901 is configured to receive a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, the first descriptor includes a first complementary identifier, and the second descriptor includes a second complementary identifier. The value of the first complementary identifier is equal to a preset first value, which identifies the code stream described by the first descriptor as the complementary code stream; the value of the second complementary identifier is equal to a preset second value, which identifies the code stream described by the second descriptor as the view code stream. The view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object.
  • The obtaining unit 902 is configured to obtain the complementary code stream according to the first complementary identifier and obtain the view code stream according to the second complementary identifier.
  • The implementation of each unit may also correspond to the description of the method embodiment shown in FIG. 8.
  • The server indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client 90 determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server and the client 90 and storage space on the client 90.
  • FIG. 10 is a schematic structural diagram of a server 100 according to an embodiment of the present invention.
  • The server 100 may include a generating unit 1001 and a sending unit 1002.
  • Each unit is described in detail as follows.
  • The generating unit 1001 is configured to generate a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream; the view code stream is a code stream obtained by encoding the content of a first spatial object of a target picture, the complementary code stream is a code stream obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object.
  • The sending unit 1002 is configured to send the media presentation description to the client, so that the client obtains the view code stream and the complementary code stream according to the complementary identifier.
  • The server 100 indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server 100 and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server 100 and the client and storage space on the client.
  • The media presentation description includes a first adaptation set, and the information of one representation in the first adaptation set includes the complementary identifier, which identifies the code stream described by the information of that representation as the complementary code stream.
  • The value of the complementary identifier is the value of the representation identifier (representation ID) of the information of another representation in the media presentation description, which identifies the code stream described by the information of that other representation as the view code stream.
  • The media presentation description includes a second adaptation set, and the second adaptation set includes the complementary identifier to indicate that the second adaptation set includes the information of a representation that describes the complementary code stream.
  • The value of the complementary identifier is the value of the adaptation set identifier (adaptationSet ID) of a third adaptation set in the media presentation description, which identifies the code stream described by the information of a representation in the third adaptation set as the view code stream.
  • The generating unit 1001 and the sending unit 1002 included in the server 100 may alternatively be described as follows:
  • The generating unit 1001 is configured to generate a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, the first descriptor includes a first complementary identifier, and the second descriptor includes a second complementary identifier. The value of the first complementary identifier is equal to a preset first value, which identifies the code stream described by the first descriptor as the complementary code stream; the value of the second complementary identifier is equal to a preset second value, which identifies the code stream described by the second descriptor as the view code stream. The view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object.
  • The sending unit 1002 is configured to send the media presentation description to the client, so that the client obtains the complementary code stream according to the first complementary identifier and obtains the view code stream according to the second complementary identifier.
  • The implementation of each unit may also correspond to the description of the method embodiment shown in FIG. 8.
  • The server 100 indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server 100 and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server 100 and the client and storage space on the client.
  • FIG. 11 is a schematic structural diagram of still another client 110 according to an embodiment of the present invention.
  • The client 110 may include a processor 1101, a memory 1102, and an input component 1103, and the processor 1101, the memory 1102, and the input component 1103 are connected to each other through a bus.
  • The memory 1102 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), or a portable read-only memory (CD-ROM). The memory 1102 is used to store related instructions and data.
  • The processor 1101 may be one or more central processing units (CPUs). When the processor 1101 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.
  • The input component 1103 may be a radio frequency module for transmitting and receiving signals, a communication interface for network communication, or the like.
  • The processor 1101 in the client 110 is configured to read the program code stored in the memory 1102 and perform the following operations:
  • receiving a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream; the view code stream is a code stream obtained by encoding the content of a first spatial object of a target picture, the complementary code stream is a code stream obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object.
  • The server indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client 110 determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server and the client 110 and storage space on the client 110.
  • The media presentation description includes a first adaptation set, and the information of one representation in the first adaptation set includes the complementary identifier, which identifies the code stream described by the information of that representation as the complementary code stream.
  • The value of the complementary identifier is the value of the representation identifier (representation ID) of the information of another representation in the media presentation description, which identifies the code stream described by the information of that other representation as the view code stream.
  • The media presentation description includes a second adaptation set, and the second adaptation set includes the complementary identifier to indicate that the second adaptation set includes the information of a representation that describes the complementary code stream.
  • The value of the complementary identifier is the value of the adaptation set identifier (adaptationSet ID) of a third adaptation set in the media presentation description, which identifies the code stream described by the information of a representation in the third adaptation set as the view code stream.
  • The processor 1101 in the client 110 may alternatively be configured to read the program code stored in the memory 1102 and perform the following operations:
  • receiving a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, the first descriptor includes a first complementary identifier, and the second descriptor includes a second complementary identifier. The value of the first complementary identifier is equal to a preset first value, which identifies the code stream described by the first descriptor as the complementary code stream; the value of the second complementary identifier is equal to a preset second value, which identifies the code stream described by the second descriptor as the view code stream. The view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object.
  • The server indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client 110 determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server and the client 110 and storage space on the client 110.
  • FIG. 12 is a schematic structural diagram of still another server 120 according to an embodiment of the present invention.
  • The server 120 may include a processor 1201, a memory 1202, and an output component 1203, and the processor 1201, the memory 1202, and the output component 1203 are connected to each other through a bus.
  • The memory 1202 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), or a portable read-only memory (CD-ROM). The memory 1202 is used to store related instructions and data.
  • The processor 1201 may be one or more central processing units (CPUs). When the processor 1201 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.
  • The output component 1203 may be a radio frequency module for transmitting and receiving signals, a communication interface for network communication, or the like.
  • The processor 1201 in the server 120 is configured to read the program code stored in the memory 1202 and perform the following operations:
  • generating a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream; the view code stream is a code stream obtained by encoding the content of a first spatial object of a target picture, the complementary code stream is a code stream obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and
  • sending the media presentation description to the client through the output component 1203, so that the client obtains the view code stream and the complementary code stream according to the complementary identifier.
  • The server 120 indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server 120 and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server 120 and the client and storage space on the client.
  • The media presentation description includes a first adaptation set, and the information of one representation in the first adaptation set includes the complementary identifier, which identifies the code stream described by the information of that representation as the complementary code stream.
  • The value of the complementary identifier is the value of the representation identifier (representation ID) of the information of another representation in the media presentation description, which identifies the code stream described by the information of that other representation as the view code stream.
  • The media presentation description includes a second adaptation set, and the second adaptation set includes the complementary identifier to indicate that the second adaptation set includes the information of a representation that describes the complementary code stream.
  • The value of the complementary identifier is the value of the adaptation set identifier (adaptationSet ID) of a third adaptation set in the media presentation description, which identifies the code stream described by the information of a representation in the third adaptation set as the view code stream.
  • The processor 1201 in the server 120 may alternatively be configured to read the program code stored in the memory 1202 and perform the following operations:
  • generating a media presentation description, where the media presentation description includes a first descriptor and a second descriptor, the first descriptor includes a first complementary identifier, and the second descriptor includes a second complementary identifier. The value of the first complementary identifier is equal to a preset first value, which identifies the code stream described by the first descriptor as the complementary code stream; the value of the second complementary identifier is equal to a preset second value, which identifies the code stream described by the second descriptor as the view code stream. The view code stream is a code stream obtained by encoding the content of the first spatial object of the target picture, the complementary code stream is a code stream obtained by encoding the content of the second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and
  • sending the media presentation description to the client through the output component 1203, so that the client obtains the complementary code stream according to the first complementary identifier and obtains the view code stream according to the second complementary identifier.
  • The server 120 indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server 120 and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server 120 and the client and storage space on the client.
  • FIG. 13 is a schematic structural diagram of a data processing system 130 according to an embodiment of the present invention.
  • the system 130 includes a client 1301 and a server 1302, where:
  • the client 1301 may be the client 90 described in FIG. 9 or the client 110 described in FIG. 11;
  • the server 1302 may be the server 100 described in FIG. 10 or the server 120 described in FIG. 12.
  • The server 1302 indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client 1301 determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server 1302 and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server 1302 and the client 1301 and storage space on the client 1301.
  • The server indicates the view code stream and the complementary code stream in the MPD by means of the complementary identifier. Correspondingly, after receiving the MPD, the client determines the view code stream and the complementary code stream according to the complementary identifier, then requests them from the server and presents them. Because the content of the first spatial object corresponding to the view code stream and the content of the second spatial object corresponding to the complementary code stream together form the complete target picture, the view code stream and the complementary code stream have almost no overlapping content, which saves transmission bandwidth between the server and the client and storage space on the client.
  • The foregoing storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Library & Information Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

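A data processing method, a related device, and a system. The method includes: receiving a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream, the view code stream is a code stream obtained by encoding the content of a first spatial object of a target picture, the complementary code stream is a code stream obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and obtaining the view code stream and the complementary code stream according to the complementary identifier. This can save transmission bandwidth between a server and a client and save storage space on the client.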

Description

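Data processing method, related device, and system
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method, a related device, and a system.
Background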
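As virtual reality (VR) technology continues to develop and mature, VR video viewing applications, such as 360-degree video that exceeds the normal visual range of the human eye, are increasingly presented to users. During VR video viewing, the content of the spatial object that the user's field of view (FOV) is fixed on needs to be as clear as possible, while the content of spatial objects outside the field of view can be relatively blurred.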
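A video is divided in the time domain into multiple playback periods, and each playback period corresponds to multiple segments of different resolutions, so that a user can select, according to information such as network conditions, the most suitable video segments from video segments of several different qualities (for example, high-definition video and standard-definition video). Currently, when a user watches a video, the content of the spatial object presented within the user's field of view is video of relatively high quality, and the content of spatial objects presented outside the field of view is video of relatively low quality, which keeps the content of the spatial object within the field of view as clear as possible. The specific implementation is as follows. The server that provides the VR video encodes all video content of any playback period of the video at low quality as a base layer, so the entire base layer is low-quality encoded content (generally, after an FOV switch the playback period changes accordingly, and the corresponding base layer changes as well). The server also divides the video of the same playback period into multiple parts and encodes each part at high quality as an enhancement layer; each part is the high-quality encoded content of one spatial object, and each spatial object corresponds to its own set of spatial information. The server then determines the spatial object according to the spatial information determined by the FOV (an FOV may correspond to one or more spatial objects), further determines the high-quality encoded content of that spatial object, and sends to the client both all the low-quality encoded content of the base layer and the high-quality encoded content of the spatial object determined based on the FOV.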
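Correspondingly, the client receives the high-quality encoded content of the spatial object determined based on the FOV and all the low-quality encoded content of the base layer. When the user's FOV does not change, the high-quality encoded content of the spatial object is presented within the current FOV. When the user switches the FOV, if the spatial object corresponding to the FOV before the switch can no longer fully cover the spatial object corresponding to the new FOV, the low-quality encoded content is decoded and presented in the uncovered part, and the high-quality encoded content of the spatial object corresponding to the new FOV is obtained from the server in time. It can be understood that, while requesting the high-quality encoded content of the spatial object corresponding to the new FOV, the client first presents part or all of the low-quality encoded content within the new FOV, which avoids user discomfort caused by waiting for the high-quality encoded content.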
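The drawback of the prior art is that, when the user's FOV remains unchanged, the server must send to the client not only the high-quality encoded content of the spatial object corresponding to the FOV but also the low-quality encoded content of that same spatial object, which wastes bandwidth and leaves redundant content on the client.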
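The redundancy described above can be estimated with a small calculation. The sketch below assumes, purely for illustration, that the low-quality bitrate is spread evenly over the picture area; the function name and the numbers are hypothetical and do not come from the source.

```python
def redundant_low_quality_bps(base_layer_bps, fov_area_fraction):
    """Estimate the base-layer bitrate spent on the region that the FOV
    already receives in high quality.

    Assumption (not from the source): the base-layer bitrate is uniform
    over the picture, so the redundant share equals the FOV area share.
    """
    if not 0.0 <= fov_area_fraction <= 1.0:
        raise ValueError("fov_area_fraction must be between 0 and 1")
    return base_layer_bps * fov_area_fraction


# Hypothetical numbers: a 1 Mbps base layer and an FOV covering a quarter
# of the picture.
wasted = redundant_low_quality_bps(1_000_000, 0.25)
```

Under these assumptions, a quarter of the base-layer bitrate duplicates content that the client already holds in high quality.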
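Summary
Embodiments of the present invention disclose a video data processing method, a related device, and a system, which can save transmission bandwidth between a server and a client and storage space on the client.
To aid understanding, the related technologies and technical terms involved in the embodiments of the present invention are first introduced briefly below.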
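I. Introduction to MPEG-DASH technology
In November 2011, the MPEG organization approved the DASH standard, which is a technical specification for transmitting media streams over HTTP (hereinafter, the DASH technical specification). The DASH technical specification consists of two main parts: the media presentation description (MPD) and the media file format. FIG. 1 is a schematic diagram of an example framework of DASH-standard transmission used for system-layer video streaming. The data transmission process of the system-layer video streaming solution includes two processes: a process in which the server side (for example, an HTTP server, hereinafter the server) generates media data for video content, and a process in which the client (for example, an HTTP streaming client) requests and obtains the media data from the server. The media data includes a media presentation description (MPD). The MPD on the server includes multiple representations (also called presentations or description layers), and each representation describes multiple segments. The HTTP streaming request control module of the client obtains the MPD sent by the server and analyzes it to determine the information of each segment of the video code stream described in the MPD, and thereby determines the segment to request; the client requests the corresponding segment from the server through the HTTP request receiving end, and decodes and plays it through the media player.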
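1. Media file format:
In DASH, the server prepares multiple versions of the code stream for the same video content. For example, for the video content of the same episode of a TV series, the server generates a low-resolution, low-bitrate, low-frame-rate code stream (for example, 360p, 300 kbps, 15 fps), a medium-resolution, medium-bitrate, high-frame-rate code stream (for example, 720p, 1200 kbps, 25 fps), a high-resolution, high-bitrate, high-frame-rate code stream (for example, 1080p, 3000 kbps, 25 fps), and so on. Each version of the code stream is called a representation in the DASH standard. A representation is a collection and encapsulation of one or more code streams in a transport format, and a representation contains one or more segments. Coding parameters such as bitrate and resolution can differ between versions of the code stream. Each code stream is divided into multiple small files, and each small file is called a segment. While requesting media segment data, the client can switch between different media representations. As shown in FIG. 2, the server prepares three representations for a movie: rep1 (representation 1), rep2 (representation 2), and rep3 (representation 3). rep1 is high-definition video with a bitrate of 4 Mbps (megabits per second), rep2 is standard-definition video with a bitrate of 2 Mbps, and rep3 is standard-definition video with a bitrate of 1 Mbps. The shaded segments in FIG. 2 are the segment data that the client requests for playback: the first three segments requested by the client are segments of representation rep3; the client then switches to rep2 and requests the fourth segment; it then switches to rep1 and requests the fifth and sixth segments, and so on. The segments of each representation can be stored end to end in one file, or stored independently as individual small files. A segment can be encapsulated in the format defined in ISO/IEC 14496-12 (ISO BMFF, Base Media File Format) or in the format defined in ISO/IEC 13818-1 (MPEG-2 TS).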
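The switching between representations described above can be sketched as a simple selection rule. This is a minimal illustration, not the method of the source: the throughput values, the 0.8 safety factor, and the names `Representation`, `pick_representation`, and `plan_segments` are all assumptions introduced here; only the three bitrates (4, 2, and 1 Mbps) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    rep_id: str
    bandwidth_bps: int  # declared bitrate of the encoded stream


# The three representations from the FIG. 2 example.
REPS = [
    Representation("rep1", 4_000_000),
    Representation("rep2", 2_000_000),
    Representation("rep3", 1_000_000),
]


def pick_representation(reps, measured_bps, safety=0.8):
    """Return the highest-bitrate representation that fits within a safety
    fraction of the measured throughput; fall back to the lowest bitrate."""
    budget = measured_bps * safety
    fitting = [r for r in reps if r.bandwidth_bps <= budget]
    if fitting:
        return max(fitting, key=lambda r: r.bandwidth_bps)
    return min(reps, key=lambda r: r.bandwidth_bps)


def plan_segments(throughputs_bps):
    """Choose one representation id per segment from per-segment throughput."""
    return [pick_representation(REPS, t).rep_id for t in throughputs_bps]


# Hypothetical throughput that improves over time, one sample per segment.
plan = plan_segments([1.5e6, 1.5e6, 1.5e6, 3e6, 6e6, 6e6])
```

With these hypothetical throughput samples, the plan requests three segments from rep3, one from rep2, and two from rep1, which is the same order as the shaded segments in the FIG. 2 description.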
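2. Media presentation description
In the DASH standard, the media presentation description is called the MPD. The MPD can be an XML file in which information is described hierarchically, as shown in FIG. 3: information at a higher level is fully inherited by the next level. The file describes media metadata that lets the client learn about the media content on the server and use this information to construct the HTTP URLs for requesting segments.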
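The DASH standard defines the following terms. A media presentation is a collection of structured data that presents media content. A media presentation description is a file that describes a media presentation in a standardized way and is used to provide a streaming media service. A period is one of a group of consecutive periods that make up the entire media presentation; periods are continuous and non-overlapping. A representation is a structured data collection that encapsulates one or more media content components (individual encoded media types, such as audio or video) with descriptive metadata; that is, a representation is a collection and encapsulation of one or more code streams in a transport format, and a representation contains one or more segments. An adaptation set (AdaptationSet) is a set of multiple interchangeable encoded versions of the same media content component, and an adaptation set contains one or more representations. A subset is a combination of a group of adaptation sets; when the player plays all the adaptation sets in it, the corresponding media content is obtained. Segment information is the media unit referenced by an HTTP uniform resource locator in the media presentation description, and it describes the segments of the media data.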
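A segment can be stored in one of two ways. In the first, each segment is stored separately and independently, as in FIG. 4, which is a schematic diagram of one segment storage mode in code stream data. In the second, all segments of the same rep are stored in one file, as in FIG. 5, which is a schematic diagram of another segment storage mode in code stream data. In FIG. 4, each segment of repA (representation A) is stored as a separate file, and each segment of repB (representation B) is also stored as a separate file. Correspondingly, for the storage mode shown in FIG. 4, the server can describe information such as the URL of each segment in the MPD of the code stream in the form of a template or a list. In FIG. 5, all segments of rep1 are stored as one file, and all segments of rep2 are stored as one file. Correspondingly, for the storage mode shown in FIG. 5, the server can use an index segment (SIDX in FIG. 5) in the MPD of the code stream to describe the related information of each segment. The index segment describes information such as the byte offset of each segment in the file in which it is stored, the size of each segment, and the duration of each segment (also called the playback duration of each segment).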
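For the single-file storage mode, the byte offsets and sizes carried by the index segment are enough to address each segment with an HTTP Range request. The following sketch shows that arithmetic only; it does not parse a real SIDX box, and the function names and the sample sizes are assumptions for illustration.

```python
from itertools import accumulate


def segment_byte_ranges(first_offset, sizes):
    """Map consecutive segment sizes to inclusive (start, end) byte ranges.

    first_offset is the byte position of the first segment in the file;
    sizes lists the size in bytes of each segment, in storage order.
    """
    starts = accumulate(sizes[:-1], initial=first_offset)
    return [(start, start + size - 1) for start, size in zip(starts, sizes)]


def range_header(byte_range):
    """Format an inclusive byte range as an HTTP Range header value."""
    start, end = byte_range
    return f"bytes={start}-{end}"


# Hypothetical index: the first segment starts at byte 1000.
ranges = segment_byte_ranges(1000, [500, 300, 200])
second = range_header(ranges[1])
```

The ranges are inclusive at both ends because that is how HTTP byte ranges are expressed.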
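An adaptation set in the embodiments of the present invention (for example, a first adaptation set or a second adaptation set) is a data set that describes the attributes of the media data segments of multiple interchangeable encoded versions of the same media content component. A representation in the embodiments of the present invention is a collection and encapsulation of one or more code streams in a transport format. A descriptor in the embodiments of the present invention describes the spatial information of the associated spatial object.
For the technical concepts related to MPEG-DASH in the present invention, refer to the relevant provisions of ISO/IEC 23009-1:2014, Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats, or to the relevant provisions in earlier versions of the standard, such as ISO/IEC 23009-1:2013 or ISO/IEC 23009-1:2012.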
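II. Introduction to virtual reality (VR) technology
Virtual reality technology is a computer simulation system that can create a virtual world and let users experience it. It uses a computer to generate a simulated environment, a system simulation of interactive three-dimensional dynamic scenes and entity behavior based on multi-source information fusion, in which the user can be immersed. VR mainly involves the simulated environment, perception, natural skills, and sensing devices. The simulated environment is a computer-generated, real-time, dynamic, realistic three-dimensional image. Perception means that an ideal VR system should have all the perceptions that a person has: in addition to the visual perception generated by computer graphics technology, there are perceptions such as hearing, touch, force, and motion, and even smell and taste; this is also called multi-perception. Natural skills refer to a person's head rotation, eye movement, gestures, or other body movements; the computer processes the data corresponding to the participant's movements, responds to the user's input in real time, and feeds the results back to the user's senses. Sensing devices are three-dimensional interaction devices. When VR video (or 360-degree video, or omnidirectional video) is presented on a head-mounted device or a handheld device, only the video image and associated audio corresponding to the orientation of the user's head are presented.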
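The difference between VR video and normal video is that the entire content of a normal video is presented to the user, whereas for VR video only a subset of the entire video is presented to the user (in VR typically only a subset of the entire video region represented by the video pictures).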
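III. Spatial description in the existing DASH standard:
In the existing standard, the original description of spatial information is: "The SRD scheme allows Media Presentation authors to express spatial relationships between Spatial Objects. A Spatial Object is defined as a spatial part of a content component (e.g. a region of interest, or a tile) and represented by either an Adaptation Set or a Sub-Representation."
That is, the MPD describes the spatial relationships between spatial objects. A spatial object is defined as a spatial part of a content component, such as an existing region of interest (ROI) or a tile; spatial relationships can be described in an Adaptation Set and a Sub-Representation. The existing DASH standard defines some descriptor elements in the MPD, and each descriptor element has two attributes, schemeIdURI and value. schemeIdURI identifies what the current descriptor is, and value is the parameter value of the descriptor. The existing standard has two descriptors, SupplementalProperty and EssentialProperty (the supplemental property descriptor and the essential property descriptor). In the existing standard, if the schemeIdURI of either descriptor is "urn:mpeg:dash:srd:2014" (or schemeIdURI=urn:mpeg:dash:VR:2017), the descriptor describes the spatial information associated with the containing spatial object, and the corresponding value lists a series of SRD parameter values. The syntax of value is given in Table 0: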
Figure PCTCN2017092772-appb-000001
Table 0
An MPD sample is as follows:
Figure PCTCN2017092772-appb-000002
Figure PCTCN2017092772-appb-000003
Figure PCTCN2017092772-appb-000004
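The upper-left coordinates of the spatial object, the width and height of the spatial object, and the space that the spatial object references can also be relative values. For example, value="1,0,0,1920,1080,3840,2160,2" can be written as value="1,0,0,1,1,2,2,2".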
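The two value strings above can be compared with a short parsing sketch. The field names below follow the parameter order commonly used for the SRD scheme (source_id, object_x, object_y, object_width, object_height, total_width, total_height, spatial_set_id); because Table 0 is not reproduced in this text, treat the names as an assumption, and note that the sketch handles only the eight-value form used in the example.

```python
from typing import NamedTuple


class Srd(NamedTuple):
    source_id: int
    object_x: int        # upper-left x of the spatial object
    object_y: int        # upper-left y of the spatial object
    object_width: int
    object_height: int
    total_width: int     # width of the reference space
    total_height: int    # height of the reference space
    spatial_set_id: int


def parse_srd(value):
    """Parse an eight-field, comma-separated SRD value string."""
    parts = [int(p) for p in value.split(",")]
    if len(parts) != 8:
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    return Srd(*parts)


def covered_fraction(srd):
    """Fraction of the reference space covered by the spatial object."""
    return (srd.object_width * srd.object_height) / (
        srd.total_width * srd.total_height
    )


absolute = parse_srd("1,0,0,1920,1080,3840,2160,2")
relative = parse_srd("1,0,0,1,1,2,2,2")
```

Both forms describe the same region: the object starts at the upper-left corner and covers one quarter of the reference space, so `covered_fraction` returns the same result for each.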
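In some feasible implementations, to output video images with a large 360-degree viewing angle, the server can divide the space within the 360-degree viewing range into multiple spatial objects. Each spatial object corresponds to one sub-view of the user, and the sub-views are stitched together to form a complete human-eye viewing angle. The human-eye viewing angle changes dynamically and is typically 120 degrees by 120 degrees. For example, spatial object 1 and spatial object 2 shown in FIG. 6 are the spatial objects viewed from two different viewing angles of the user. The server can prepare a set of video code streams for each spatial object. Specifically, the server can obtain the encoding configuration parameters of each code stream in the video and generate the code stream corresponding to each spatial object of the video according to those parameters. During video output, the client can request from the server the video code stream segment corresponding to a given viewing angle in a given time period and output it to the spatial object corresponding to that viewing angle. If the client outputs, within the same time period, the video code stream segments corresponding to all viewing angles within the 360-degree range, the complete video image for that time period can be displayed in the entire 360-degree space.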
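In a specific implementation, when dividing the 360-degree space, the server can first map the sphere to a plane and divide the space on the plane. Specifically, the server can map the sphere to a longitude-latitude plan using longitude-latitude mapping. FIG. 7 is a schematic diagram of spatial objects according to an embodiment of the present invention. The server can map the sphere to a longitude-latitude plan and divide the plan into multiple spatial objects such as A to I. Further, the server can also map the sphere to a cube and unfold the faces of the cube to obtain a plan, or map the sphere to another polyhedron and unfold its faces to obtain a plan. The server can also use other mapping methods to map the sphere to a plane, as determined by the requirements of the actual application scenario; no limitation is imposed here. The following description uses longitude-latitude mapping with reference to FIG. 7.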
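As shown in FIG. 7, after the server divides the spherical space into multiple spatial objects such as A to I, it can prepare a set of DASH code streams for each spatial object. Each spatial object corresponds to one sub-view, and the set of DASH code streams corresponding to each spatial object is the view code stream of that sub-view. The spatial information of the spatial object associated with every image in a view code stream is the same, so the view code stream can be treated as a static view code stream. The view code stream of each sub-view is part of the entire video code stream, and the view code streams of all sub-views together constitute a complete video code stream. During video playback, the DASH code stream corresponding to the relevant spatial object can be selected for playback according to the viewing angle the user is currently watching. When the user switches the viewing angle, the client can determine the DASH code stream corresponding to the target spatial object of the switch according to the new viewing angle selected by the user.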
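Selecting the spatial object for the current viewing direction can be sketched as a lookup on the longitude-latitude plan. The text names the spatial objects A to I but does not state their layout; the sketch below assumes a 3 by 3 grid labelled row by row from the upper-left corner, and the function name and angle conventions (yaw in degrees with 0 at the center of the plan, pitch from 90 at the top to -90 at the bottom) are likewise assumptions.

```python
def tile_for_direction(yaw_deg, pitch_deg, cols=3, rows=3):
    """Return the label of the grid cell that contains a viewing direction.

    The longitude-latitude plan spans yaw -180..180 (left to right) and
    pitch 90..-90 (top to bottom). Cells are labelled A, B, C, ... row by
    row, which is an assumed layout for the A to I objects.
    """
    pitch = max(-90.0, min(90.0, pitch_deg))
    x = ((yaw_deg + 180.0) % 360.0) / 360.0   # 0.0 at left edge, toward 1.0
    y = (90.0 - pitch) / 180.0                # 0.0 at top, 1.0 at bottom
    col = min(int(x * cols), cols - 1)
    row = min(int(y * rows), rows - 1)
    return chr(ord("A") + row * cols + col)


center = tile_for_direction(0, 0)
top_left = tile_for_direction(-180, 90)
bottom_right = tile_for_direction(179, -90)
```

A real client would usually need every cell that the FOV overlaps, not only the cell at its center; this sketch covers the single-direction case to keep the mapping itself visible.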
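The method, related device, and system provided by the embodiments of the present invention are described in detail below.
According to a first aspect, an embodiment of the present invention provides a data processing method. The method includes: receiving a media presentation description, where the media presentation description includes a complementary identifier to indicate that the media presentation description describes a view code stream and a complementary code stream, the view code stream is a code stream obtained by encoding the content of a first spatial object of a target picture, the complementary code stream is a code stream obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and obtaining the view code stream and the complementary code stream according to the complementary identifier. In an optional solution, "to indicate that the media presentation description describes a view code stream and a complementary code stream" can be understood as: the complementary identifier identifies complementary code streams, and the complementary code streams include the view code stream and the complementary code stream. In an optional solution, "the target picture includes the content of the first spatial object and the content of the second spatial object" can be understood as: the target picture consists of the content of the first spatial object and the content of the second spatial object.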
第二方面,本发明实施例提供一种数据处理方法,该方法包括:接收媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识,该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
通过执行上述步骤,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端之间的传输带宽和该客户端上的存储空间。
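According to a third aspect, an embodiment of the present invention provides a data processing method, including: generating a media presentation description that contains a complementary identifier indicating that the media presentation description describes a viewport bitstream and a complementary bitstream, where the viewport bitstream is obtained by encoding the content of a first spatial object of a target picture, the complementary bitstream is obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and sending the media presentation description to a client, so that the client obtains the viewport bitstream and the complementary bitstream according to the complementary identifier. In an optional solution, "indicating that the media presentation description describes a viewport bitstream and a complementary bitstream" may be understood as: the complementary identifier identifies complementary bitstreams, which comprise the viewport bitstream and the complementary bitstream. In an optional solution, "the target picture includes the content of the first spatial object and the content of the second spatial object" may be understood as: the target picture consists of the content of the first spatial object and the content of the second spatial object.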
第四方面,本发明实施例提供一种数据处理方法,该方法包括:生成媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识;该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;向客户端发送该媒体呈现描述,以使该客户端根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
通过执行上述步骤,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端之间的传输带宽和该客户端上的存储空间。
第五方面,本发明实施例提供一种客户端,该客户端包括接收单元和获取单元,其中,接收单元用于接收媒体呈现描述,该媒体呈现描述包含互补标识,以表明该媒体呈现描述中描述了视角码流和互补码流,该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;获取单元用于根据该互补标识获取该视角码流和该互补码流。在一种可选的方案中,该以表明该媒体呈现描述中描述了视角码流和互补码流可以理解为:该互补标识用于标识互补的码流,该互补的码流包含视角码流和互补码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
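According to a sixth aspect, an embodiment of the present invention provides a client, where the client includes a receiving unit and an obtaining unit. The receiving unit is configured to receive a media presentation description that contains a first descriptor and a second descriptor, where the first descriptor contains a first complementary identifier and the second descriptor contains a second complementary identifier; the value of the first complementary identifier equals a preset first value, which identifies the bitstream described by the first descriptor as a complementary bitstream, and the value of the second complementary identifier equals a preset second value, which identifies the bitstream described by the second descriptor as a viewport bitstream; the viewport bitstream is obtained by encoding the content of a first spatial object of a target picture, the complementary bitstream is obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object. The obtaining unit is configured to obtain the complementary bitstream according to the first complementary identifier and to obtain the viewport bitstream according to the second complementary identifier. In an optional solution, "the target picture includes the content of the first spatial object and the content of the second spatial object" may be understood as: the target picture consists of the content of the first spatial object and the content of the second spatial object.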
通过运行上述单元,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端之间的传输带宽和该客户端上的存储空间。
第七方面,本发明实施例提供一种服务器,该服务器包括生成单元和发送单元,其中,生成单元用于生成媒体呈现描述,该媒体呈现描述包含互补标识,以表明该媒体呈现描述中描述了视角码流和互补码流,该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;发送单元用于向客户端发送该媒体呈现描述,以使该客户端根据该互补标识获取该视角码流和该互补码流。在一种可选的方案中,该以表明该媒体呈现描述中描述了视角码流和互补码流可以理解为:该互补标识用于标识互补的码流,该互补的码流包含视角码流和互补码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
第八方面,本发明实施例提供一种服务器,该服务器包括生成单元和发送单元,其中,生成单元用于生成媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识,该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;发送单元用于向客户端发送该媒体呈现描述,以使该客户端根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
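By running the foregoing units, the server marks the viewport bitstream and the complementary bitstream in the MPD by means of the complementary identifier. Accordingly, after receiving the MPD, the client determines the viewport bitstream and the complementary bitstream according to the complementary identifier, and then requests them from the server and presents them. Because the content of the first spatial object corresponding to the viewport bitstream and the content of the second spatial object corresponding to the complementary bitstream together make up the complete target picture, the two bitstreams have almost no overlapping content, which saves transmission bandwidth between the server and the client as well as storage space on the client.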
第九方面,本发明实施例提供了一种客户端,该客户端包括处理器、存储器和输入组件,该存储器用于存储程序和数据,该处理器调用该存储器中的程序,用于执行如下操作:通过该输入组件接收媒体呈现描述,该媒体呈现描述包含互补标识,以表明该媒体呈现描述中描述了视角码流和互补码流,该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;根据该互补标识获取该视角码流和该互补码流。在一种可选的方案中,该以表明该媒体呈现描述中描述了视角码流和互补码流可以理解为:该互补标识用于标识互补的码流,该互补的码流包含视角码流和互补码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
第十方面,本发明实施例提供一种客户端,该客户端包括处理器、存储器和输入组件,该存储器用于存储程序和数据,该处理器调用该存储器中的程序,用于执行如下操作:通过该输入组件接收媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识,该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
通过执行上述操作,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端之间的传输带宽和该客户端上的存储空间。
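According to an eleventh aspect, an embodiment of the present invention provides a server, where the server includes a processor, a memory, and an output component. The memory is configured to store programs and data, and the processor invokes the program in the memory to perform the following operations: generating a media presentation description that contains a complementary identifier indicating that the media presentation description describes a viewport bitstream and a complementary bitstream, where the viewport bitstream is obtained by encoding the content of a first spatial object of a target picture, the complementary bitstream is obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object; and sending the media presentation description to a client through the output component, so that the client obtains the viewport bitstream and the complementary bitstream according to the complementary identifier. In an optional solution, "indicating that the media presentation description describes a viewport bitstream and a complementary bitstream" may be understood as: the complementary identifier identifies complementary bitstreams, which comprise the viewport bitstream and the complementary bitstream. In an optional solution, "the target picture includes the content of the first spatial object and the content of the second spatial object" may be understood as: the target picture consists of the content of the first spatial object and the content of the second spatial object.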
第十二方面,本发明实施例提供一种服务器,该服务器包括处理器、存储器和输出组件,该存储器用于存储程序和数据,该处理器调用该存储器中的程序,用于执行如下操作:生成媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识;该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;通过该输出组件向客户端发送该媒体呈现描述,以使该客户端根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。在一种可选的方案中,该目标图片包括该第一空间对象的内容和该第二空间对象的内容可以理解为:该目标图片由该第一空间对象的内容和该第二空间对象的内容组成。
通过执行上述操作,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端之间的传输带宽和该客户端上的存储空间。
结合第一方面、或者第三方面、或者第五方面、或者第七方面、或者第九方面,或者第十一方面,在第一种可能的实现方式中,该媒体呈现描述包含第一自适应集,该第一自适应集中的一个表示的信息包含该互补标识,以用于标识该一个表示的信息所描述的码流为该互补码流。
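With reference to the first possible implementation, in a second possible implementation, the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, which identifies the bitstream described by that other representation as the viewport bitstream.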
结合第一方面、或者第二方面、或者第三方面、或者第四方面、或者第五方面,或者第六方面,在第三种可能的实现方式中,该媒体呈现描述包含第二自适应集,该第二自适应集包含该互补标识,以表明该第二自适应集包含用于描述该互补码流的表示的信息。
结合第三种可能的实现方式,在第四种可能的实现方式中,该互补标识的值为该媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识该第三自适应集中的表示的信息所描述的码流为该视角码流。
第十三方面,本发明实施例提供一种数据处理系统,系统包括客户端和服务器,其中:
该客户端为第五方面的任一可能的实现方式所描述的客户端,或者第六方面的任一可能的实现方式所描述的客户端,或者第九方面的任一可能的实现方式所描述的客户端,或者第十方面的任一可能的实现方式所描述的客户端;
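The server is the server described in any possible implementation of the seventh aspect, the server described in any possible implementation of the eighth aspect, the server described in any possible implementation of the eleventh aspect, or the server described in any possible implementation of the twelfth aspect.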
通过实施本发明实施例,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端之间的传输带宽和该客户端上的存储空间。
附图说明
下面将对背景技术或者实施例所需要使用的附图作简单地介绍。
图1是系统层视频流媒体传输采用的DASH标准传输的框架实例示意图;
图2是本发明实施例提供的视频文件被编码为多种码率的码流的示意图;
图3是本发明实施例提供的MPD文件分段描述的场景示意图;
图4是本发明实施例提供的一种码流数据分段存储的场景示意图;
图5是本发明实施例提供的又一种码流数据分段存储的场景示意图;
图6是本发明实施例提供的一种空间对象的场景示意图;
图7是本发明实施例提供的又一种空间对象的场景示意图;
图8是本发明实施例提供的一种数据处理方法的流程示意图;
图9是本发明实施例提供的一种客户端的结构示意图;
图10是本发明实施例提供的一种服务器的结构示意图;
图11是本发明实施例提供的又一种客户端的结构示意图;
图12是本发明实施例提供的又一种服务器的结构示意图;
图13是本发明实施例提供的一种数据处理系统的结构示意图。
具体实施方式
下面将结合本发明实施例中的附图对本发明实施例中的技术方案进行描述。
请参见图6,用户在观看视频的过程中,可通过眼部或者头部转动,或者视频观看设备的画面切换等操作进行视角的切换,相应地,视角注视的位置由空间对象1切换到空间对象2。视角从注视空间对象1切换到注视空间对象2时服务器需要向客户端发送空间对象2的高质量编码内容,在发送该高质量编码内容的同时该客户端会先呈现空间对象2的低质量编码内容,等该客户端接收到该空间对象2的高质量编码内容时显示该空间对象2的高质量编码内容而不再显示该空间对象2的低质量编码内容,相当于空间对象2的低质量编码内容只是用来过渡以避免用户因无法及时看到空间对象2的内容而出现不适。然而,实际应用中,用户的视角FOV停留在空间对象2上的时间往往比较长,在FOV停留在空间对象2的过程中,服务器向该客户端发送的该空间对象2的低质量编码内容均未被用上,这些低质量编码内容造成了传输带宽的浪费,也造成了客户端中出现较多冗余数据。为了解决这个问题,本发明实施例提供如下方法。
请参见图8,图8是本发明实施例提供的一种视频数据处理方法的流程示意图,该方法包括但不限于如下步骤。
步骤S801:服务器生成媒体呈现描述MPD。
具体地,生成该MPD的过程包含建立视角码流与互补码流的对应关系,可以在该MPD中配置互补标识来体现该对应关系(或者配置第一互补标识和第二互补标识来体现该对应关系,如样例三);该视角码流为对目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片由该第一空间对象的内容和该第二空间对象的内容组成(可以称第一空间对象与第二空间对象的在该目标图片中“互补”),该目标图片为该服务器提供的视频源(例如,电视剧、电影等)中的某张图片(或称某帧)。该第一空间对象和该第二空间对象均被定义为一个内容成分的一部分空间,用来描述空间关系(spatial relationships),例如,现有的感兴趣区域(英文:region of interest,简称:ROI)、片(tile)等均属于一个内容(如一张图片)的部分空间,该空间对象的信息可以在自适应集Adaptation Set、表示(Representation)信息、子表示(Sub-Representation)、描述子等中描述。
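When the content of a picture is encoded, it is encoded with reference to preset encoding parameters, which typically define information such as resolution, compression ratio, and bit rate. Different encoding parameters produce different results; for example, the higher the resolution and bit rate, the sharper the encoded picture looks. In this embodiment, the viewport bitstream and the complementary bitstream may each be encoded with different encoding parameters, so that the two bitstreams are displayed with different quality. This embodiment establishes the correspondence between the viewport bitstream and the complementary bitstream in advance to indicate that the two are complementary, so that once the viewport bitstream is determined, its complementary bitstream can be found through that correspondence. The rule for determining the viewport bitstream is not limited here; one example is given for ease of understanding. Because the viewport bitstream is obtained by encoding the content within the first spatial object, determining the first spatial object determines the viewport bitstream, and the spatial object that the user's FOV is gazing at may be taken as the first spatial object. When the two bitstreams are encoded according to the encoding parameters, the viewport bitstream may be encoded at somewhat higher quality. Note that if the user's FOV changes, the viewport bitstream determined from the new FOV also changes, and the newly determined viewport bitstream corresponds to a new complementary bitstream. Taking FIG. 7 as an example, the complete space of the target picture is the space formed by stitching spatial objects A to I together. When the first spatial object corresponding to the viewport bitstream is spatial object A, the second spatial object corresponding to the complementary bitstream is the space formed by stitching spatial objects B to I; when the first spatial object is spatial object E, the second spatial object is the space formed by stitching spatial objects A, B, C, D, F, G, H, and I; and so on. The space of spatial object A and that of its complementary spatial object may not overlap at all, or may partially overlap.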
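The relationship between the first spatial object and its complement can be sketched as a simple set difference. This is a minimal illustration that assumes the non-overlapping case and the nine labels A to I from FIG. 7; `complementary_objects` is a hypothetical helper name.

```python
ALL_OBJECTS = tuple("ABCDEFGHI")  # spatial objects A to I, as in FIG. 7


def complementary_objects(view_object: str, all_objects=ALL_OBJECTS) -> tuple:
    """Return the spatial objects that, stitched together, complement view_object."""
    if view_object not in all_objects:
        raise ValueError(f"unknown spatial object: {view_object!r}")
    # Non-overlapping case: everything in the target picture except the view.
    return tuple(obj for obj in all_objects if obj != view_object)
```

For a view on A this gives B to I, and for a view on E it gives A, B, C, D, F, G, H, and I, matching the two cases in the text. A partially overlapping complement would need region geometry rather than labels and is not covered by this sketch.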
可选的,该目标图片用于360度地展示某个场景,该目标图片的轮廓可以为规则的形状也可以为不规则的形状。
以下举例讲述如何通过互补标识来体现视角码流与互补码流的对应关系。
样例一:
在MPD中增加互补标识ComplementaryId来标记视角码流的representation ID。以下先通过表1对涉及到的ComplementaryId进行介绍,然后结合相关代码讲述具体如何应用。
Figure PCTCN2017092772-appb-000005
表1
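In sample 1, the media presentation description contains an adaptation set (AdaptationSet). To distinguish it from the adaptation sets in the other samples, it is referred to as the first adaptation set. The information of one representation in the first adaptation set contains the complementary identifier, which indicates that the bitstream described by that representation is the complementary bitstream. The value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, which indicates that the bitstream described by that other representation is the viewport bitstream. For example, if the representation information describing bitstream A contains a ComplementaryId, bitstream A is regarded as a complementary bitstream that has a corresponding viewport bitstream; if the Representation ID in the representation information describing bitstream B equals the value of that ComplementaryId, bitstream B is regarded as that viewport bitstream. An example MPD is given below.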
Figure PCTCN2017092772-appb-000006
Figure PCTCN2017092772-appb-000007
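In the code above, the representation information describing bitstream video-3.mp4 is <Representation id="3" bandwidth="450000" complementaryId="2"><BaseURL>video-3.mp4</BaseURL></Representation>. The presence of complementaryId in this representation information indicates that video-3.mp4 is a complementary bitstream and that it has a corresponding viewport bitstream. In the representation information describing bitstream video-2.mp4, <Representation id="2" bandwidth="450000"><BaseURL>video-2.mp4</BaseURL></Representation>, the value of the Representation id equals the value of complementaryId (both are 2), so video-2.mp4 is that viewport bitstream.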
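A client-side lookup for sample 1 might look like the sketch below. The MPD fragment is a reduced, hypothetical reconstruction built only from the two Representation elements quoted in the text (no XML namespace, no Period attributes), and `find_stream_pairs` is an illustrative name, not part of any DASH library.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment based on the elements quoted above.
SAMPLE_MPD_REPRESENTATIONS = """
<MPD>
  <Period>
    <AdaptationSet>
      <Representation id="2" bandwidth="450000">
        <BaseURL>video-2.mp4</BaseURL>
      </Representation>
      <Representation id="3" bandwidth="450000" complementaryId="2">
        <BaseURL>video-3.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
""".strip()


def find_stream_pairs(mpd_xml: str) -> list:
    """Return (viewport URL, complementary URL) pairs found in the MPD."""
    root = ET.fromstring(mpd_xml)
    by_id = {rep.get("id"): rep for rep in root.iter("Representation")}
    pairs = []
    for rep in by_id.values():
        target_id = rep.get("complementaryId")
        # A representation carrying complementaryId describes the complementary
        # bitstream; the representation it points at is the viewport bitstream.
        if target_id is not None and target_id in by_id:
            pairs.append((by_id[target_id].findtext("BaseURL"),
                          rep.findtext("BaseURL")))
    return pairs
```

Applied to the fragment, the function pairs video-2.mp4 (viewport) with video-3.mp4 (complementary). A real MPD uses the DASH XML namespace, so the tag lookups would need namespace-qualified names.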
样例二:
在MPD中添加描述互补码流的adaptation set的互补标识ComplementaryId。以下先通过表2对涉及到的ComplementaryId进行介绍,然后通过相关代码讲述如何具体应用。
Figure PCTCN2017092772-appb-000008
表2
在样例二中,该媒体呈现描述包含自适应集(AdaptationSet),为了与其他样例中的自适应集区分,可以称样例二中的自适应集为第二自适应集,该第二自适应集包含该互补标识ComplementaryId,以表明该第二自适应集包含用于描述该互补码流的表示的信息。该互补标识ComplementaryId的值为该媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识该第三自适应集中的表示的信息所描述的码流为该视角码流,该第三自适应集为区别于该第二自适应集的自适应集。举例来说,自适应集A中存在ComplementaryId,那么该自适应集A中的表示的信息所描述的码流为互补码流,且该互补码流存在对应的视角码流;若自适应集B中的自适应集标识adaptationSet ID的值等于该自适应集A中的互补标识ComplementaryId的值,那么,该自适应集B中的表示的信息描述的码流为该视角码流。
以下提供一种MPD样例。
Figure PCTCN2017092772-appb-000009
Figure PCTCN2017092772-appb-000010
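In the code above, the second adaptation set <AdaptationSet id="2" complementaryId="1" […]><EssentialProperty schemeIdUri="urn:mpeg:dash:srd:2014" value="1"/><Representation id="2" bandwidth="450000"><BaseURL>video-2.mp4</BaseURL></Representation></AdaptationSet> contains complementaryId, which indicates that the bitstream video-2.mp4 described by the representation information in the second adaptation set is a complementary bitstream and that it has a corresponding viewport bitstream. The bitstream described by the representation information in the adaptation set whose AdaptationSet ID equals the value of complementaryId (that is, 1) is that viewport bitstream.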
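The adaptation-set variant of the lookup can be sketched in the same way. The fragment below is hypothetical: the second adaptation set follows the element quoted in the text, while the first adaptation set and its file name video-1.mp4 are assumptions added so that the example is complete. `classify_adaptation_sets` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; AdaptationSet id="1" and video-1.mp4 are assumed.
SAMPLE_MPD_SETS = """
<MPD>
  <Period>
    <AdaptationSet id="1">
      <Representation id="1" bandwidth="450000">
        <BaseURL>video-1.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
    <AdaptationSet id="2" complementaryId="1">
      <Representation id="2" bandwidth="450000">
        <BaseURL>video-2.mp4</BaseURL>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
""".strip()


def classify_adaptation_sets(mpd_xml: str) -> dict:
    """Map 'viewport' and 'complementary' to the BaseURLs of the matching sets."""
    root = ET.fromstring(mpd_xml)
    sets = {aset.get("id"): aset for aset in root.iter("AdaptationSet")}
    result = {}
    for aset in sets.values():
        target_id = aset.get("complementaryId")
        if target_id is None or target_id not in sets:
            continue
        # The set carrying complementaryId holds the complementary bitstream;
        # the set whose id it names holds the viewport bitstream.
        result["complementary"] = [url.text for url in aset.iter("BaseURL")]
        result["viewport"] = [url.text for url in sets[target_id].iter("BaseURL")]
    return result
```

For the fragment above the result maps "complementary" to video-2.mp4 and "viewport" to video-1.mp4.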
样例三:
在MPD的第一描述子中添加第一互补标识ComplementaryId1来体现该第一描述子描述的对象为视角码流,在该MPD的第二描述子中添加第二互补标识ComplementaryId2来体现该第二描述子指定的对象为互补码流,以下先通过表3和表4对涉及到的ComplementaryId1和ComplementaryId2进行介绍,然后通过相关代码讲述具体如何应用。
Figure PCTCN2017092772-appb-000011
表3
Figure PCTCN2017092772-appb-000012
表4
在样例三中,该媒体呈现描述包含两个描述子,可称其中一个描述子为第一描述子,称另一个描述子为第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识,该第一互补标识的值等于预设的第一数值,以表明该第一描述子所描述的码流为该互补码流,该第二互补标识的值等于预设的第二数值,以表明该第二描述子所描述的码流为该视角码流。可选的,该第一描述子和该第二描述子分别为两个不同自适应集中的描述子。该第一数值和第二数值为预先配置的两个可以相互区分的数值。
可以理解的是,MPD中的描述子可以用来定义视频流中的空间对象,以下先简单介绍一下现有技术中的描述子,描述子的值value="1,0,0,1920,1080,3840,2160,2",其中,value的第一个值为视频源标识,该视频源标识等于1表明该value描述的内容源与上面的视频源相同;value的第二个值和第三个值用于体现空间对象的左上坐标,此处表明value描述的空间对象的坐标为(0,0);value的第四个值和第五个值为空间坐标,该空间坐标用于体现该空间对象的长宽,此处表明该空间对象的长宽为(1920,1080);value的第六个值和第七个值用于表示该空间对象参考的空间,此处表明该空间对象参考的空间为(3840,2160);value的第八个值为空间对象组标识,此处的空间对象组标识是2。
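This embodiment adds one new attribute to the value of the existing descriptor; the position of the new attribute within value is not limited here. The attribute added to the first descriptor may be called the first complementary identifier, and the attribute added to the second descriptor may be called the second complementary identifier. The first complementary identifier equals the first value (for example, 0) to indicate that the content of the spatial object described by the first descriptor is the viewport bitstream; in this case, the spatial object described by value is the region of the reference space indicated by the spatial coordinates. The second complementary identifier equals the second value (for example, 1) to indicate that the content of the spatial object described by the second descriptor is the complementary bitstream; in this case, the spatial object described by value is the part of the reference space outside the region indicated by the spatial coordinates. Program code for a specific implementation is given below.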
Figure PCTCN2017092772-appb-000013
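The code above includes the descriptor values value="1,0,0,0,1920,1080,3840,2160,2" and value="1,1,0,0,960,540,1920,1080,2". The descriptor with value="1,0,0,0,1920,1080,3840,2160,2" is referred to as the first descriptor, and the descriptor with value="1,1,0,0,960,540,1920,1080,2" as the second descriptor. The value of the first descriptor lists nine values, the second of which is the first complementary identifier ComplementaryId1; that is, ComplementaryId1=0 in value="1,0,0,0,1920,1080,3840,2160,2". The content of the spatial object described by the first descriptor is therefore the viewport bitstream, and that spatial object is the region given by the spatial coordinates (1920,1080) within the reference space (3840,2160). The value of the second descriptor also lists nine values, the second of which is the second complementary identifier ComplementaryId2; that is, ComplementaryId2=1 in value="1,1,0,0,960,540,1920,1080,2". The content of the spatial object described by the second descriptor is therefore the complementary bitstream, and that spatial object is the part of the reference space (1920,1080) other than the region given by the spatial coordinates (960,540). Furthermore, because the region given by (1920,1080) within the reference space (3840,2160) is the same region as the one given by (960,540) within the reference space (1920,1080), the spatial object described by the first descriptor is the first spatial object and the spatial object described by the second descriptor is the second spatial object.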
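The check described above (two descriptors with different flags that refer to the same relative region) can be sketched as follows. The field layout is taken from the nine-value strings in the text, with the complementary flag assumed to be the second field as in that sample; the names `Descriptor`, `parse_descriptor`, and `pair_descriptors` are hypothetical.

```python
from fractions import Fraction
from typing import NamedTuple, Optional, Tuple


class Descriptor(NamedTuple):
    source_id: int
    complementary_flag: int  # assumed second field, as in the sample values
    x: int
    y: int
    width: int
    height: int
    total_width: int
    total_height: int
    group_id: int


def parse_descriptor(value: str) -> Descriptor:
    """Parse a nine-field value string into named fields."""
    return Descriptor(*(int(field) for field in value.split(",")))


def relative_region(d: Descriptor) -> Tuple[Fraction, ...]:
    """Express the region as exact fractions of its reference space."""
    return (
        Fraction(d.x, d.total_width),
        Fraction(d.y, d.total_height),
        Fraction(d.width, d.total_width),
        Fraction(d.height, d.total_height),
    )


def pair_descriptors(value_a: str, value_b: str,
                     viewport_flag: int = 0,
                     complementary_flag: int = 1
                     ) -> Optional[Tuple[Descriptor, Descriptor]]:
    """Return (viewport, complementary) if the two values form a pair, else None."""
    a, b = parse_descriptor(value_a), parse_descriptor(value_b)
    by_flag = {a.complementary_flag: a, b.complementary_flag: b}
    if set(by_flag) != {viewport_flag, complementary_flag}:
        return None
    # Both descriptors must point at the same region of the picture.
    if relative_region(a) != relative_region(b):
        return None
    return by_flag[viewport_flag], by_flag[complementary_flag]
```

For the two values in the text, both regions reduce to the top-left quarter of the reference space, so the function returns the first descriptor as the viewport descriptor and the second as the complementary descriptor. Comparing fractions rather than raw pixels is what lets (1920,1080) in (3840,2160) match (960,540) in (1920,1080).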
步骤S802:该服务器向客户端发送以上生成的媒体呈现描述MPD。
步骤S803:该客户端接收该MPD。
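Step S804: The client obtains the complementary identifier from the MPD and determines the viewport bitstream and the complementary bitstream according to it (or parses out the first complementary identifier and the second complementary identifier and determines the viewport bitstream and the complementary bitstream according to them).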
具体地,该服务器生成该MPD的规则不同则该客户端解析该MPD的方式也不同,以下分别以样例一、样例二和样例三为例讲述该客户端如何解析该MPD。
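When the server generates the MPD according to the rule of sample 1, the client, after receiving the MPD, obtains the first adaptation set in the MPD and analyzes the representation information in it. If the information of a representation contains the complementary identifier ComplementaryId, that representation describes a complementary bitstream and the complementary bitstream has a corresponding viewport bitstream; if the Representation ID in the information of another representation equals the value of ComplementaryId, the bitstream described by that other representation is the viewport bitstream.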
当服务器按照上述样例二的规则生成MPD时,该客户端接收到该MPD后获取该MPD的第二自适应集,如果该第二自适应集中包含互补标识ComplementaryId则表明该第二自适应集中的表示的信息描述的码流为互补码流且该互补码流存在对应的视角码流,若某个自适应集的自适应集标识AdaptationSet ID的值等于该互补标识ComplementaryId的值,则表明该某个自适应集的表示的信息所描述的码流为该视角码流。
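When the server generates the MPD according to the rule of sample 3, the client, after receiving the MPD, obtains the descriptor values (value) in the MPD. If two values in the MPD satisfy a preset relationship, the client determines that the content of the spatial object described by one value is the viewport bitstream and that the content of the spatial object described by the other value is the complementary bitstream. The preset relationship is: one of the two values contains the first complementary identifier ComplementaryId1 and the other contains the second complementary identifier ComplementaryId2; the first complementary identifier equals the first value and the second complementary identifier equals the second value; and the spatial object described by the one value is the first spatial object while the spatial object described by the other value is the second spatial object.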
步骤S805:该客户端向该服务器请求该视角码流和该互补码流。
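Specifically, the MPD may carry the network storage address of the viewport bitstream and that of the complementary bitstream. A network storage address may be expressed by a uniform resource locator (URL), an offset, or the like.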
步骤S806:该服务器接收该请求并根据该请求将该视角码流和该互补码流发送给该客户端。
步骤S807:该客户端接收该视角码流和互补码流,解码该视角码流和该互补码流并通过显示屏呈现。
在图8所描述的方法中,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端之间的传输带宽和该客户端上的存储空间。
上述详细阐述了本发明实施例的方法,为了便于更好地实施本发明实施例的上述方案,相应地,下面提供了本发明实施例的装置。
请参见图9,图9是本发明实施例提供的一种客户端90的结构示意图,该客户端90可以包括接收单元901和获取单元902,各个单元的详细描述如下。
接收单元901用于接收媒体呈现描述,该媒体呈现描述包含互补标识,以表明该媒体呈现描述中描述了视角码流和互补码流,该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;
获取单元902用于根据该互补标识获取该视角码流和该互补码流。
通过运行上述单元,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端90接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端90之间的传输带宽和该客户端90上的存储空间。
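In an optional solution, the media presentation description contains a first adaptation set, and the information of one representation in the first adaptation set contains the complementary identifier, which identifies the bitstream described by that representation as the complementary bitstream. Optionally, the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, which identifies the bitstream described by that other representation as the viewport bitstream.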
在又一种可选的方案中,该媒体呈现描述包含第二自适应集,该第二自适应集包含该互补标识,以表明该第二自适应集包含用于描述该互补码流的表示的信息。可选的,该互补标识的值为该媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识该第三自适应集中的表示的信息所描述的码流为该视角码流。
在本发明实施例中,该客户端90包括的接收单元901和获取单元902的相关描述还可以为:
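The receiving unit 901 is configured to receive a media presentation description that contains a first descriptor and a second descriptor, where the first descriptor contains a first complementary identifier and the second descriptor contains a second complementary identifier; the value of the first complementary identifier equals a preset first value, which identifies the bitstream described by the first descriptor as a complementary bitstream, and the value of the second complementary identifier equals a preset second value, which identifies the bitstream described by the second descriptor as a viewport bitstream; the viewport bitstream is obtained by encoding the content of a first spatial object of a target picture, the complementary bitstream is obtained by encoding the content of a second spatial object of the target picture, and the target picture includes the content of the first spatial object and the content of the second spatial object;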
获取单元902用于根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。
需要说明的是,各个单元的具体实现还可以对应参照图8所示的方法实施例的相应描述。
在图9所描述的客户端90中,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端90接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端90之间的传输带宽和该客户端90上的存储空间。
请参见图10,图10是本发明实施例提供的一种服务器100的结构示意图,该服务器100可包括生成单元1001和发送单元1002,各个单元的详细描述如下。
生成单元1001用于生成媒体呈现描述,该媒体呈现描述包含互补标识,以表明该媒体呈现描述中描述了视角码流和互补码流,该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;
发送单元1002用于向客户端发送该媒体呈现描述,以使该客户端根据该互补标识获取该视角码流和该互补码流。
通过运行上述单元,该服务器100在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器100请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器100与该客户端之间的传输带宽和该客户端上的存储空间。
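In an optional solution, the media presentation description contains a first adaptation set, and the information of one representation in the first adaptation set contains the complementary identifier, which identifies the bitstream described by that representation as the complementary bitstream. Optionally, the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, which identifies the bitstream described by that other representation as the viewport bitstream.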
在又一种可选的方案中,该媒体呈现描述包含第二自适应集,该第二自适应集包含该互补标识,以表明该第二自适应集包含用于描述该互补码流的表示的信息。可选的,该互补标识的值为该媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识该第三自适应集中的表示的信息所描述的码流为该视角码流。
在本发明实施例中,该服务器100包括的生成单元1001和发送单元1002的描述还可以如下:
生成单元1001用于生成媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识,该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;
发送单元1002用于向客户端发送该媒体呈现描述,以使该客户端根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。
需要说明的是,各个单元的具体实现还可以对应参照图8所示的方法实施例的相应描述。
在图10所描述的服务器100中,该服务器100在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器100请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器100与该客户端之间的传输带宽和该客户端上的存储空间。
请参见图11,图11是本发明实施例提供的又一种客户端110的结构示意图,该客户端110可以包括处理器1101、存储器1102和输入组件1103,该处理器1101与存储器1102以及与输入组件1103通过总线相互连接。
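The memory 1102 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), or a portable read-only memory (CD-ROM). The memory 1102 is configured to store related instructions and data.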
处理器1101可以是一个或多个中央处理器(英文:Central Processing Unit,简称:CPU),在处理器1101是一个CPU的情况下,该CPU可以是单核CPU,也可以是多核CPU。
输入组件1103可为用来收发信号的射频模块、用于网络通信的通信接口等。
该客户端110中的处理器1101用于读取该存储器1102中存储的程序代码,执行以下操作:
通过该输入组件1103接收媒体呈现描述,该媒体呈现描述包含互补标识,以表明该媒体呈现描述中描述了视角码流和互补码流,该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;
根据该互补标识获取该视角码流和该互补码流。
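By performing the foregoing operations, the server marks the viewport bitstream and the complementary bitstream in the MPD by means of the complementary identifier. Accordingly, after receiving the MPD, the client 110 determines the viewport bitstream and the complementary bitstream according to the complementary identifier, and then requests them from the server and presents them. Because the content of the first spatial object corresponding to the viewport bitstream and the content of the second spatial object corresponding to the complementary bitstream together make up the complete target picture, the two bitstreams have almost no overlapping content, which saves transmission bandwidth between the server and the client 110 as well as storage space on the client 110.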
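In an optional solution, the media presentation description contains a first adaptation set, and the information of one representation in the first adaptation set contains the complementary identifier, which identifies the bitstream described by that representation as the complementary bitstream. Optionally, the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, which identifies the bitstream described by that other representation as the viewport bitstream.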
在又一种可选的方案中,该媒体呈现描述包含第二自适应集,该第二自适应集包含该互补标识,以表明该第二自适应集包含用于描述该互补码流的表示的信息。可选的,该互补标识的值为该媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识该第三自适应集中的表示的信息所描述的码流为该视角码流。
在本发明实施例中,该客户端110中的处理器1101还可以用于读取该存储器1102中存储的程序代码,来执行以下操作:
通过该输入组件1103接收媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识,该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;
根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。
需要说明的是,各个操作的具体实现还可以对应参照图8所示的方法实施例的相应描述。
在图11所描述的客户端110中,该服务器在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端110接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器与该客户端110之间的传输带宽和该客户端110上的存储空间。
请参见图12,图12是本发明实施例提供的又一种服务器120的结构示意图,该服务器120可以包括处理器1201、存储器1202和输出组件1203,该处理器1201与存储器1202以及与输出组件1203通过总线相互连接。
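The memory 1202 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), or a portable read-only memory (CD-ROM). The memory 1202 is configured to store related instructions and data.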
处理器1201可以是一个或多个中央处理器(英文:Central Processing Unit,简称:CPU),在处理器1201是一个CPU的情况下,该CPU可以是单核CPU,也可以是多核CPU。
输出组件1203可为用来收发信号的射频模块、用于网络通信的通信接口等。
该服务器120中的处理器1201用于读取该存储器1202中存储的程序代码,执行以下操作:
生成媒体呈现描述,该媒体呈现描述包含互补标识,以表明该媒体呈现描述中描述了视角码流和互补码流,该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;
通过该输出组件1203向客户端发送该媒体呈现描述,以使该客户端根据该互补标识获取该视角码流和该互补码流。
通过执行上述操作,该服务器120在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器120请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器120与该客户端之间的传输带宽和该客户端上的存储空间。
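In an optional solution, the media presentation description contains a first adaptation set, and the information of one representation in the first adaptation set contains the complementary identifier, which identifies the bitstream described by that representation as the complementary bitstream. Optionally, the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, which identifies the bitstream described by that other representation as the viewport bitstream.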
在又一种可选的方案中,该媒体呈现描述包含第二自适应集,该第二自适应集包含该互补标识,以表明该第二自适应集包含用于描述该互补码流的表示的信息。可选的,该互补标识的值为该媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识该第三自适应集中的表示的信息所描述的码流为该视角码流。
在本发明实施例中,该服务器120中的处理器1201还可以用于读取该存储器1202中存储的程序代码,来执行以下操作:
生成媒体呈现描述,该媒体呈现描述包含第一描述子和第二描述子,该第一描述子包含第一互补标识,该第二描述子包含第二互补标识;该第一互补标识的值等于预设的第一数值,以用于标识该第一描述子所描述的码流为互补码流,该第二互补标识的值等于预设的第二数值,以用于标识该第二描述子所描述的码流为视角码流;该视角码流为对该目标图片的第一空间对象的内容编码得到码流,该互补码流为对该目标图片的第二空间对象的内容编码得到码流,该目标图片包括该第一空间对象的内容和该第二空间对象的内容;
通过该输出组件1203向客户端发送该媒体呈现描述,以使该客户端根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。
需要说明的是,各个操作的具体实现还可以对应参照图8所示的方法实施例的相应描述。
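In the server 120 described in FIG. 12, the server 120 marks the viewport bitstream and the complementary bitstream in the MPD by means of the complementary identifier. Accordingly, after receiving the MPD, the client determines the viewport bitstream and the complementary bitstream according to the complementary identifier, and then requests them from the server 120 and presents them. Because the content of the first spatial object corresponding to the viewport bitstream and the content of the second spatial object corresponding to the complementary bitstream together make up the complete target picture, the two bitstreams have almost no overlapping content, which saves transmission bandwidth between the server 120 and the client as well as storage space on the client.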
上述详细阐述了本发明实施例的方法和装置,为了便于更好地实施本发明实施例的上述方案,相应地,下面提供了本发明实施例的相关系统。
请参见图13,图13是本发明实施例提供的一种数据处理系统130的结构示意图,该系统130包括客户端1301和服务器1302,其中:
客户端1301可以为图9描述的客户端90或者图11描述的客户端110;
服务器1302可以为图10描述的服务器100或者图12描述的服务器120。
在图13所描述的数据处理系统130中,该服务器1302在MPD中通过互补标识标明视角码流和互补码流,相应地,客户端1301接收到该MPD后根据该互补标识确定该视角码流和该互补码流,然后向该服务器1302请求该视角码流和该互补码流并呈现;由于该视角码流对应的第一空间对象的内容与该互补码流对应的第二空间对象的内容组成完整的目标图片,因此该视角码流与该互补码流几乎不存在交叠的内容,节省了该服务器1302与该客户端1301之间的传输带宽和该客户端1301上的存储空间。
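In summary, by implementing the embodiments of the present invention, the server marks the viewport bitstream and the complementary bitstream in the MPD by means of the complementary identifier. Accordingly, after receiving the MPD, the client determines the viewport bitstream and the complementary bitstream according to the complementary identifier, and then requests them from the server and presents them. Because the content of the first spatial object corresponding to the viewport bitstream and the content of the second spatial object corresponding to the complementary bitstream together make up the complete target picture, the two bitstreams have almost no overlapping content, which saves transmission bandwidth between the server and the client as well as storage space on the client.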
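A person of ordinary skill in the art may understand that all or some of the procedures of the foregoing method embodiments may be completed by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the procedures of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.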

Claims (26)

  1. 一种数据的处理方法,其特征在于,包括:
    接收媒体呈现描述,所述媒体呈现描述包含互补标识,以表明所述媒体呈现描述中描述了视角码流和互补码流,所述视角码流为对所述目标图片的第一空间对象的内容编码得到码流,所述互补码流为对所述目标图片的第二空间对象的内容编码得到码流,所述目标图片包括所述第一空间对象的内容和所述第二空间对象的内容;
    根据所述互补标识获取所述视角码流和所述互补码流。
  2. 根据权利要求1所述的方法,其特征在于,所述媒体呈现描述包含第一自适应集,所述第一自适应集中的一个表示的信息包含所述互补标识,以用于标识所述一个表示的信息所描述的码流为所述互补码流。
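  3. The method according to claim 2, wherein the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, so as to identify the bitstream described by the information of the other representation as the viewport bitstream.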
  4. 根据权利要求1所述的方法,其特征在于,所述媒体呈现描述包含第二自适应集,所述第二自适应集包含所述互补标识,以表明所述第二自适应集包含用于描述所述互补码流的表示的信息。
  5. 根据权利要求4所述的方法,其特征在于,所述互补标识的值为所述媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识所述第三自适应集中的表示的信息所描述的码流为所述视角码流。
  6. 一种数据处理方法,其特征在于,包括:
    接收媒体呈现描述,所述媒体呈现描述包含第一描述子和第二描述子,所述第一描述子包含第一互补标识,所述第二描述子包含第二互补标识,所述第一互补标识的值等于预设的第一数值,以用于标识所述第一描述子所描述的码流为互补码流,所述第二互补标识的值等于预设的第二数值,以用于标识所述第二描述子所描述的码流为视角码流;所述视角码流为对所述目标图片的第一空间对象的内容编码得到码流,所述互补码流为对所述目标图片的第二空间对象的内容编码得到码流,所述目标图片包括所述第一空间对象的内容和所述第二空间对象的内容;
    根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。
  7. 一种数据的处理方法,其特征在于,包括:
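    generating a media presentation description, wherein the media presentation description contains a complementary identifier indicating that the media presentation description describes a viewport bitstream and a complementary bitstream, the viewport bitstream is obtained by encoding content of a first spatial object of a target picture, the complementary bitstream is obtained by encoding content of a second spatial object of the target picture, and the target picture comprises the content of the first spatial object and the content of the second spatial object;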
    向客户端发送所述媒体呈现描述,以使所述客户端根据所述互补标识获取所述视角码流和所述互补码流。
  8. 根据权利要求7所述的方法,其特征在于,所述媒体呈现描述包含第一自适应集,所述第一自适应集中的一个表示的信息包含所述互补标识,以用于标识所述一个表示的信息所描述的码流为所述互补码流。
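  9. The method according to claim 8, wherein the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, so as to identify the bitstream described by the information of the other representation as the viewport bitstream.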
  10. 根据权利要求7所述的方法,其特征在于,所述媒体呈现描述包含第二自适应集,所述第二自适应集包含所述互补标识,以表明所述第二自适应集包含用于描述所述互补码流的表示的信息。
  11. 根据权利要求10所述的方法,其特征在于,所述互补标识的值为所述媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识所述第三自适应集中的表示的信息所描述的码流为所述视角码流。
  12. 一种数据处理方法,其特征在于,包括:生成媒体呈现描述,所述媒体呈现描述包含第一描述子和第二描述子,所述第一描述子包含第一互补标识,所述第二描述子包含第二互补标识;所述第一互补标识的值等于预设的第一数值,以用于标识所述第一描述子所描述的码流为互补码流,所述第二互补标识的值等于预设的第二数值,以用于标识所述第二描述子所描述的码流为视角码流;所述视角码流为对所述目标图片的第一空间对象的内容编码得到码流,所述互补码流为对所述目标图片的第二空间对象的内容编码得到码流,所述目标图片包括所述第一空间对象的内容和所述第二空间对象的内容;
    向客户端发送所述媒体呈现描述,以使所述客户端根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。
  13. 一种客户端,其特征在于,包括:
    接收单元,用于接收媒体呈现描述,所述媒体呈现描述包含互补标识,以表明所述媒体呈现描述中描述了视角码流和互补码流,所述视角码流为对所述目标图片的第一空间对象的内容编码得到码流,所述互补码流为对所述目标图片的第二空间对象的内容编码得到码流,所述目标图片包括所述第一空间对象的内容和所述第二空间对象的内容;
    获取单元,用于根据所述互补标识获取所述视角码流和所述互补码流。
  14. 根据权利要求13所述的客户端,其特征在于,所述媒体呈现描述包含第一自适应集,所述第一自适应集中的一个表示的信息包含所述互补标识,以用于标识所述一个表示的信息所描述的码流为所述互补码流。
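  15. The client according to claim 14, wherein the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, so as to identify the bitstream described by the information of the other representation as the viewport bitstream.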
  16. 根据权利要求13所述的客户端,其特征在于,所述媒体呈现描述包含第二自适应集,所述第二自适应集包含所述互补标识,以表明所述第二自适应集包含用于描述所述互补码流的表示的信息。
  17. 根据权利要求16所述的客户端,其特征在于,所述互补标识的值为所述媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识所述第三自适应集中的表示的信息所描述的码流为所述视角码流。
  18. 一种客户端,其特征在于,包括:
    接收单元,用于接收媒体呈现描述,所述媒体呈现描述包含第一描述子和第二描述子,所述第一描述子包含第一互补标识,所述第二描述子包含第二互补标识,所述第一互补标识的值等于预设的第一数值,以用于标识所述第一描述子所描述的码流为互补码流,所述第二互补标识的值等于预设的第二数值,以用于标识所述第二描述子所描述的码流为视角码流;所述视角码流为对所述目标图片的第一空间对象的内容编码得到码流,所述互补码流为对所述目标图片的第二空间对象的内容编码得到码流,所述目标图片包括所述第一空间对象的内容和所述第二空间对象的内容;
    获取单元,用于根据所述第一互补标识获取所述互补码流以及根据所述第二互补标识获取所述视角码流。
  19. 一种服务器,其特征在于,包括:
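    a generating unit, configured to generate a media presentation description, wherein the media presentation description contains a complementary identifier indicating that the media presentation description describes a viewport bitstream and a complementary bitstream, the viewport bitstream is obtained by encoding content of a first spatial object of a target picture, the complementary bitstream is obtained by encoding content of a second spatial object of the target picture, and the target picture comprises the content of the first spatial object and the content of the second spatial object;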
    发送单元,用于向客户端发送所述媒体呈现描述,以使所述客户端根据所述互补标识获取所述视角码流和所述互补码流。
  20. 根据权利要求19所述的服务器,其特征在于,所述媒体呈现描述包含第一自适应集,所述第一自适应集中的一个表示的信息包含所述互补标识,以用于标识所述一个表示的信息所描述的码流为所述互补码流。
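  21. The server according to claim 20, wherein the value of the complementary identifier is the value of the representation ID of another representation in the media presentation description, so as to identify the bitstream described by the information of the other representation as the viewport bitstream.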
  22. 根据权利要求19所述的服务器,其特征在于,所述媒体呈现描述包含第二自适应集,所述第二自适应集包含所述互补标识,以表明所述第二自适应集包含用于描述所述互补码流的表示的信息。
  23. 根据权利要求22所述的服务器,其特征在于,所述互补标识的值为所述媒体呈现描述中的第三自适应集标识adaptationSet ID的值,以用于标识所述第三自适应集中的表示的信息所描述的码流为所述视角码流。
  24. 一种服务器,其特征在于,包括:
    生成单元,用于生成媒体呈现描述,所述媒体呈现描述包含第一描述子和第二描述子,所述第一描述子包含第一互补标识,所述第二描述子包含第二互补标识,所述第一互补标识的值等于预设的第一数值,以用于标识所述第一描述子所描述的码流为互补码流,所述第二互补标识的值等于预设的第二数值,以用于标识所述第二描述子所描述的码流为视角码流;所述视角码流为对所述目标图片的第一空间对象的内容编码得到码流,所述互补码流为对所述目标图片的第二空间对象的内容编码得到码流,所述目标图片包括所述第一空间对象的内容和所述第二空间对象的内容;
    发送单元,用于向客户端发送所述媒体呈现描述,以使所述客户端根据所述第一互补标识获取所述互补码流并根据所述第二互补标识获取所述视角码流。
  25. 一种数据处理系统,其特征在于,所述系统包括客户端和服务器,所述客户端为权利要求13~18任一项所述的客户端;所述服务器为权利要求19~24任一项所述的服务器。
  26. 一种存储介质,其特征在于,所述存储介质用于存储指令,所述指令在处理器上运行时使得权利要求1-12任一项所述的方法得以实现。
PCT/CN2017/092772 2016-10-18 2017-07-13 一种数据处理方法、相关设备及系统 WO2018072488A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610909014.4A CN107959861B (zh) 2016-10-18 2016-10-18 一种数据处理方法、相关设备及系统
CN201610909014.4 2016-10-18

Publications (1)

Publication Number Publication Date
WO2018072488A1 true WO2018072488A1 (zh) 2018-04-26

Family

ID=61954277

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/092772 WO2018072488A1 (zh) 2016-10-18 2017-07-13 一种数据处理方法、相关设备及系统

Country Status (2)

Country Link
CN (1) CN107959861B (zh)
WO (1) WO2018072488A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3787305A4 (en) * 2018-05-22 2021-03-03 Huawei Technologies Co., Ltd. PROCESS PLAYING VIDEO IN VR, TERMINAL, AND SERVER

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108833937B (zh) * 2018-05-30 2021-03-23 华为技术有限公司 视频处理方法和装置
EP3753257A4 (en) * 2019-03-20 2021-09-15 Beijing Xiaomi Mobile Software Co., Ltd. METHOD AND DEVICE FOR TRANSMISSION OF POINT OF VIEW SWITCHING CAPABILITIES IN A VR360 APPLICATION

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102055967A (zh) * 2009-10-28 2011-05-11 中国移动通信集团公司 多视点视频的视角切换以及编码方法和装置
CN102217322A (zh) * 2011-05-27 2011-10-12 华为技术有限公司 媒体发送方法、媒体接收方法和客户端及系统
CN102595203A (zh) * 2011-01-11 2012-07-18 中兴通讯股份有限公司 一种多媒体数据的传输、接收方法及其传输、接收设备
CN104301769A (zh) * 2014-09-24 2015-01-21 华为技术有限公司 呈现图像的方法、终端设备和服务器
US20150026242A1 (en) * 2013-07-19 2015-01-22 Electronics And Telecommunications Research Institute Apparatus and method for providing content
CN104904225A (zh) * 2012-10-12 2015-09-09 佳能株式会社 用于对视频数据进行流传输的方法和相应装置
CN105554513A (zh) * 2015-12-10 2016-05-04 Tcl集团股份有限公司 一种基于h.264的全景视频传输方法及系统

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102137137B (zh) * 2010-09-17 2013-11-06 华为技术有限公司 基于http流的媒体内容动态插播方法、装置及系统
CN102595111A (zh) * 2011-01-11 2012-07-18 中兴通讯股份有限公司 一种多视角编码码流的传输方法、装置和系统
US20140156865A1 (en) * 2012-11-30 2014-06-05 Futurewei Technologies, Inc. Generic Substitution Parameters in DASH
CN105933343B (zh) * 2016-06-29 2019-01-08 深圳市优象计算技术有限公司 一种用于720度全景视频网络播放的码流缓存方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3787305A4 (en) * 2018-05-22 2021-03-03 Huawei Technologies Co., Ltd. PROCESS PLAYING VIDEO IN VR, TERMINAL, AND SERVER
US11765427B2 (en) 2018-05-22 2023-09-19 Huawei Technologies Co., Ltd. Virtual reality video playing method, terminal, and server

Also Published As

Publication number Publication date
CN107959861B (zh) 2020-08-25
CN107959861A (zh) 2018-04-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17862091

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17862091

Country of ref document: EP

Kind code of ref document: A1