WO2018014691A1 - Method and apparatus for acquiring media data
- Publication number
- WO2018014691A1 (PCT/CN2017/089161)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reference frame
- information
- url
- obtaining
- index
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/231—Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
- H04N21/2312—Data placement on disk arrays
- H04N21/2318—Data placement on disk arrays using striping
Definitions
- the present invention relates to the field of media transmission, and in particular, to a method and an apparatus for acquiring media data.
- Streaming media refers to the technology and process of compressing and packaging a series of media data and transmitting it over a network in segments.
- DASH Dynamic Adaptive Streaming over HTTP
- MPD Media Presentation Description
- the server prepares multiple versions of the code stream for the same program content.
- Each version of the code stream is called a media representation in the DASH standard, and the different versions differ in encoding parameters such as code rate and resolution.
- each code stream is divided into a plurality of small files, and each small file is called a segment (also rendered as "slice" or "fragment" in this text).
- the server prepares three media representations rep1, rep2, rep3 for a movie; wherein rep1 is the code rate.
- rep2 is a standard-definition video with a code rate of 2 Mbps
- rep3 is a standard-definition video with a code rate of 1 Mbps
- the fragment marked as shaded in Figure 1 is the fragmented data requested by the client.
- the first three fragments requested by the client belong to the media representation rep3; the client then switches to rep2 to request the fourth fragment, and then switches to rep1 to request the fifth and sixth fragments.
- The segments of a media representation can be stored end to end in one file, or each stored independently as a small file; a segment can be packaged according to ISO/IEC 14496-12 (ISO BMFF, Base Media File Format) or according to ISO/IEC 13818-1 (MPEG-2 TS).
- ISO/IEC 14496-12 ISO BMFF (Base) Media File Format
- ISO/IEC 13818-1 MPEG-2 TS
- the media presentation description is called MPD
- the MPD is an xml file.
- the information in the file is described in a hierarchical manner. As shown in FIG. 2 and FIG. 3, the information of an upper level is completely inherited by the next level.
- Some media metadata is described in this file, which allows the client to understand the media content information in the server and can use this information to construct the http-URL of the request segment.
- media presentation: a collection of structured data for presenting media content;
- media presentation description: a file that describes a media presentation in a standardized way, used for providing streaming services;
- period: a set of consecutive periods constitutes the entire media presentation; periods are continuous and non-overlapping;
- media representation: a structured data set that encapsulates one or more media components (individual encoded media types such as audio or video) together with descriptive metadata;
- Adaptation Set: a set of mutually interchangeable encoded versions of the same media content;
- subset: a combination of a set of adaptation sets; when the player plays all of the adaptation sets, the corresponding media content is obtained;
- fragmentation information: a media unit referenced by an HTTP uniform resource locator in the media presentation description; the fragmentation information describes the fragments of the media data.
- the fragments of the media data may be stored in one file, or may be stored separately. In one possible manner, the fragments of the media data are stored in the MPD.
- the segments in a media representation can be stored in two ways: each segment stored separately, as shown in FIG. 4, or all segments stored in one file, as shown in FIG. 5.
- correspondingly, the MPD describes the URL-related information of the segments in two ways.
- the MPD describes the segment related information in the form of a template or a list.
- each segment has an index in front of it.
- An index segment describes the segment that follows it; when the segments are stored in one file, the MPD describes the segment-related information by describing an index segment (its syntax is shown as the sidx box in Figure 5).
- The index segment describes each segment's offset in the stored file, its size, and its duration.
- the video file is divided into a plurality of video segments having random access functions by a random access point, which is simply referred to as a random access segment, as shown in FIG.
- a random access segment includes one or more pictures; usually at least one non-random access point is set after a random access point in the video encoding.
- the encoding of different random access segments is independent of each other, so that the encoded video stream supports the functions of random access and fast forward and rewind playback.
- because the video is split into independently encoded segments, the mutual information between the random access segments is not fully utilized, which limits the efficiency of video encoding.
- a knowledge base is provided for the video encoder, so that the video encoder has a long-term "memory" function.
- an image similar to the current encoded/decoded image content can be selected from the knowledge base as a reference image, so that the current image is encoded or decoded with inter-frame prediction, as shown in Figure 7.
- the image in the knowledge base may be a reconstructed image of some images in the video.
- interframe encoded frames P frames or B frame
- I frames intra-coded frames
- This knowledge base-based coding method extracts similar content that appears multiple times in the video into the knowledge base, and improves the coding efficiency of the video by referring to the image in the knowledge base.
- a random access point image can be encoded/decoded with reference to an image in the knowledge base, or directly with the conventional intra coding method; it does not depend on other images in the video sequence for encoding/decoding, so the random access segments remain independent of each other.
- The non-knowledge-base code stream must be decoded with reference to the knowledge base code stream, and multiple non-contiguous frames in the non-knowledge-base stream may refer to the same knowledge base frame. As shown in Figure 7, scene 1 and scene 3 both reference knowledge base frame 1 when encoded. In a DASH scenario where the non-knowledge-base code stream is sliced, if scene 1 and scene 3 belong to two different segments, the client needs the data of knowledge base frame 1 when decoding each of them. In other words, multiple segments may correspond to the same knowledge base frame, and there is no one-to-one correspondence in time between knowledge base frames and segments.
- An embodiment of the present invention provides a method for obtaining media data, where the method includes: acquiring a media presentation description file, where the media presentation description file includes index fragmentation information; obtaining an index fragment according to the index fragmentation information; parsing the index fragment to obtain data fragmentation information and reference frame information, where the data fragmentation information describes a data fragment and the reference frame information corresponds to the data fragment; and obtaining the reference frame according to the reference frame information.
- the media presentation description file may use the structure of the media presentation description (MPD) in the Dynamic Adaptive Streaming over HTTP (DASH) standard specified by the Moving Picture Experts Group (MPEG); syntax elements describing the attributes of the knowledge base file may also be added on top of that structure.
- MPEG Moving Picture Experts Group
- index fragments can be obtained in the manner of the existing DASH scheme.
- in one possible manner, the MPD includes the URL of the index fragment, and the client requests the index fragment from that URL; in another possible manner, the index fragment is stored directly in the MPD;
- in yet another possible manner, the MPD stores a URL template and the related attributes of the index fragment (for example, the fragment identifier and the storage range), and the client constructs the URL for requesting the index fragment from the URL template and those attributes.
- multiple reference frames may be stored in one file or in different files.
- the reference frame may be stored in the same file as the data slice or may be stored separately. If the reference frame is stored in the file of the data slice, the media presentation description file may use the MPD in DASH as is, or a syntax element describing the reference frame attribute may be added to the MPD, for instance in the SegmentBase attribute of the representation layer; if the reference frame and the data slice are stored separately, the media presentation description file may use the MPD in DASH, with the dependencyId attribute in the representation layer describing the relationship between the representation of the reference frame and the representation of the data slice.
- for example, the MPD describes the storage location (byteRange) in the code stream file of the knowledge base (reference frame) code stream that the non-knowledge-base code stream references; other context-level information in the MPD is omitted here;
- LibarayFrame represents the attribute element of the knowledge base
- range represents the storage range attribute in the file of the knowledge base.
- the reference frame information corresponding to the data fragment is obtained by parsing the index fragment, so that the client can conveniently acquire the relationship between the data fragment and the reference frame.
- the reference frame information includes a byte offset of the reference frame and a number of bytes of the reference frame.
- obtaining the reference frame according to the reference frame information includes: obtaining the reference frame according to the byte offset of the reference frame and the number of bytes of the reference frame.
- the scheme of this embodiment is more suitable for use in a video on demand scenario, and the code stream of the reference frame (knowledge base frame) can be stored in a file, and the client can request by a byterange when requesting a single reference frame.
- the client can obtain the relationship between the segments and the reference frames involved in the entire on-demand program by parsing the index fragment; after requesting a reference frame from the server, if the reference frame will subsequently be referenced by other segments, the client can keep it stored, so that it does not have to request it from the server again, saving transmission bandwidth.
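The client-side reuse described above can be sketched as a small cache. This is only an illustration under stated assumptions: the class name `ReferenceFrameCache`, the `fetch` callback and the use of a byte-range string as cache key are hypothetical and do not come from the patent text.

```python
from typing import Callable, Dict


class ReferenceFrameCache:
    """Keeps reference frames that later segments may reuse (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                 # performs the actual request to the server
        self._frames: Dict[str, bytes] = {}
        self.requests = 0                   # number of requests actually sent

    def get(self, key: str) -> bytes:
        # Only contact the server the first time a reference frame is needed.
        if key not in self._frames:
            self._frames[key] = self._fetch(key)
            self.requests += 1
        return self._frames[key]


def fake_fetch(key: str) -> bytes:
    # Stand-in for an HTTP request; returns dummy frame data.
    return ("frame:" + key).encode()


cache = ReferenceFrameCache(fake_fetch)
first = cache.get("100-299")    # requested from the "server"
second = cache.get("100-299")   # served from the client-side store
```

With this arrangement two segments that reference the same knowledge base frame cause a single request, which is the bandwidth saving the embodiment aims at.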
- the media presentation description file includes a uniform resource locator (URL) template
- obtaining the reference frame according to the byte offset of the reference frame and the number of bytes of the reference frame includes: obtaining a byte range of the reference frame according to the byte offset of the reference frame and the number of bytes of the reference frame; obtaining a URL of the reference frame according to the byte range of the reference frame and the URL template; and obtaining the reference frame according to the URL of the reference frame.
- URL uniform resource locator
- the media presentation description file includes storage location information of a reference frame.
- obtaining the URL of the reference frame according to the byte range of the reference frame and the URL template includes: obtaining the URL of the reference frame according to the storage location information of the reference frame, the byte range of the reference frame, and the URL template.
- the storage location information of the reference frame includes a storage range of the reference frame; or
- the storage location information of the reference frame includes storage file identification information of the reference frame.
- the reference frame information includes the identifier information of the reference frame.
- obtaining the reference frame according to the reference frame information includes: obtaining the reference frame according to the identifier information of the reference frame.
- This embodiment can be used in a live video broadcast scenario.
- Each reference frame is stored in a separate file, and each file corresponds to identification information of one reference frame.
- the media presentation description file includes a uniform resource locator (URL) template
- obtaining the reference frame according to the identification information of the reference frame includes: obtaining a URL of the reference frame according to the identification information of the reference frame and the URL template; and obtaining the reference frame according to the URL of the reference frame.
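A minimal sketch of building a reference frame URL from its identifier and a URL template follows. The placeholder name `$FrameID$`, the function name and the example URL are assumptions made for illustration; the text does not specify the template syntax.

```python
def url_from_frame_id(template: str, frame_id: int) -> str:
    """Substitute a reference frame identifier into a URL template."""
    placeholder = "$FrameID$"   # hypothetical placeholder, not defined by the source
    if placeholder not in template:
        raise ValueError("template has no reference frame placeholder")
    return template.replace(placeholder, str(frame_id))


# Each reference frame lives in its own file, addressed by its identifier.
url = url_from_frame_id("http://example.com/library/frame-$FrameID$.bin", 7)
```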
- the template information SegmentTemplate in the MPD may be used; it is an existing attribute in the representation layer. The dependency between the code stream of the reference frame and the code stream of the data fragment is described by the dependencyId attribute that already exists in DASH.
- the method further includes: parsing the index fragment to obtain the number of reference frames corresponding to the data fragment.
- when the client requests multiple data fragments: if the number of reference frames corresponding to a data fragment is 0, that data fragment does not need a reference frame; if the number is 1, the corresponding reference frame is obtained according to the foregoing embodiment; if the number is greater than 1, each reference frame is obtained according to the foregoing embodiment, and the steps are repeated until all reference frames corresponding to the data fragment are obtained.
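The three cases above (no reference frame, one, or several) reduce to a loop over the references listed for a data fragment. The sketch below assumes the index fragment has already been parsed into a list of reference frame identifiers; `frames_for_fragment` and `stub_fetch` are hypothetical names.

```python
from typing import Callable, Dict, List


def frames_for_fragment(
    reference_ids: List[int], fetch: Callable[[int], bytes]
) -> Dict[int, bytes]:
    """Obtain every reference frame a data fragment needs, one request per frame."""
    frames: Dict[int, bytes] = {}
    if len(reference_ids) == 0:
        return frames                    # count 0: no reference frame is needed
    for frame_id in reference_ids:       # count >= 1: repeat until all are obtained
        frames[frame_id] = fetch(frame_id)
    return frames


def stub_fetch(frame_id: int) -> bytes:
    # Stand-in for the real request to the server.
    return bytes([frame_id])


none_needed = frames_for_fragment([], stub_fetch)
two_needed = frames_for_fragment([3, 5], stub_fetch)
```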
- the client decodes the data fragment by using the reference frame to perform the playback of the media content.
- the embodiments above describe the correspondence between reference frames and segments, whereas the reference relationship between each frame in a segment and its reference frames has to be parsed from the frame information in the segment. In the client, however, a reference frame is first sent to the decoder, which decodes it and stores it internally.
- the storage space needed for decoding with the knowledge base therefore has to be requested in advance, when the decoder is initialized; this embodiment describes how to carry the number of reference frames required for decoding the frames in a segment;
- the index fragment carries the number of reference frames required for frame decoding in the segment; for example, adding the attribute maxLibframeNumber to the sidx;
- the number of reference frames required for frame decoding in the segment is carried in the MPD; for example, the attribute maxLibframeNumber is added to the MPD;
- maxLibframeNumber The maximum number of reference frames that the segment needs to reference for decoding.
- after the client obtains the maxLibframeNumber information from the index fragment or from the MPD, it sends the information to the decoder; the decoder requests and manages its storage space according to the obtained maxLibframeNumber information.
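How a decoder might size its reference frame storage from maxLibframeNumber can be sketched as follows. The class `LibraryFrameBuffer` and its behaviour on overflow are assumptions for illustration only; the text does not describe the decoder's internal storage management.

```python
from typing import Dict


class LibraryFrameBuffer:
    """Decoder-side store sized once, at initialization, from maxLibframeNumber."""

    def __init__(self, max_lib_frame_number: int) -> None:
        if max_lib_frame_number < 0:
            raise ValueError("maxLibframeNumber must not be negative")
        self.capacity = max_lib_frame_number
        self._slots: Dict[int, bytes] = {}

    def store(self, frame_id: int, decoded: bytes) -> None:
        # Refuse a new frame once the space reserved up front is exhausted.
        if frame_id not in self._slots and len(self._slots) >= self.capacity:
            raise MemoryError("more reference frames than maxLibframeNumber allows")
        self._slots[frame_id] = decoded

    def get(self, frame_id: int) -> bytes:
        return self._slots[frame_id]


buffer = LibraryFrameBuffer(max_lib_frame_number=2)
buffer.store(1, b"decoded-frame-1")
buffer.store(2, b"decoded-frame-2")
try:
    buffer.store(3, b"decoded-frame-3")
    overflowed = False
except MemoryError:
    overflowed = True
```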
- An embodiment of the second aspect of the present invention discloses a device for acquiring media data, the device comprising: an obtaining module, configured to acquire a media presentation description file, where the media presentation description file includes index fragmentation information, and further configured to obtain an index fragment according to the index fragmentation information; and a parsing module, configured to parse the index fragment to obtain reference frame information and data fragmentation information, where the data fragmentation information describes a data fragment and the reference frame information corresponds to the data fragment.
- the obtaining module is further configured to obtain the reference frame according to the reference frame information.
- the reference frame information includes a byte offset of a reference frame and a number of bytes of a reference frame
- the acquiring module is configured to obtain the reference frame according to the byte offset of the reference frame and the number of bytes of the reference frame.
- the media presentation description file includes a uniform resource locator (URL) template
- the obtaining module is configured to: obtain a byte range of the reference frame according to the byte offset of the reference frame and the number of bytes of the reference frame; obtain a URL of the reference frame according to the byte range of the reference frame and the URL template; and obtain the reference frame according to the URL of the reference frame.
- the media presentation description file includes storage location information of a reference frame
- the acquiring module is configured to obtain the URL of the reference frame according to the storage location information of the reference frame, the byte range of the reference frame, and the URL template.
- the storage location information of the reference frame includes a storage range of the reference frame; or the storage location information of the reference frame includes storage file identification information of the reference frame.
- the reference frame information includes identifier information of a reference frame
- the acquiring module is configured to obtain the reference frame according to the identifier information of the reference frame.
- the media presentation description file includes a uniform resource locator (URL) template
- the obtaining module is configured to: obtain a URL of the reference frame according to the identification information of the reference frame and the URL template; and obtain the reference frame according to the URL of the reference frame.
- the parsing module is further configured to parse the index fragment to obtain the number of reference frames corresponding to the data fragment.
- a third aspect of the present invention discloses a file format of media data, where the file format includes correspondence information of a reference frame and a data slice.
- the file format of the media data disclosed in the embodiment of the present invention is applied to the DASH standard protocol framework, and some syntax elements are appropriately added, so that the client can obtain the relationship between the reference frame and the data fragment by parsing the file format.
- the file in the file format of the embodiment of the present invention may be the index fragment in the above implementation.
- the file format also includes data fragmentation information.
- the correspondence information includes a byte offset of a reference frame and a number of bytes of a reference frame.
- the relevant description of the syntax elements in the file format based on the DASH protocol is as follows:
- Flag 0x01: indicates that the knowledge base frame information corresponding to the segment is described in the sidx box;
- Library_frame_count the number of knowledge base frames that need to be referenced by segment
- Library_frame_offset the first byte offset of the knowledge base frame in the stored stream; in an embodiment of the invention, the byte offset may be an absolute offset or a relative offset relative to a certain slice.
- the width of this syntax element may be 32 bits or 64 bits;
- Library_frame_size The byte size of the knowledge base frame.
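To make the three fields concrete, the sketch below reads a count followed by (offset, size) pairs. The binary layout is an assumption: the text only says the offset may be 32 or 64 bits wide, so big-endian 32-bit fields (the usual ISO BMFF byte order) are chosen here purely for illustration, and `parse_library_frames` is a hypothetical name.

```python
import struct
from typing import List, Tuple


def parse_library_frames(payload: bytes) -> List[Tuple[int, int]]:
    """Read library_frame_count, then one (offset, size) pair per frame.

    Assumed layout (not given by the source): all fields are unsigned
    32-bit big-endian integers.
    """
    (count,) = struct.unpack_from(">I", payload, 0)
    frames = []
    pos = 4
    for _ in range(count):
        offset, size = struct.unpack_from(">II", payload, pos)
        frames.append((offset, size))
        pos += 8
    return frames


# Two knowledge base frames: one at offset 100 (200 bytes), one at offset 300 (50 bytes).
sample = struct.pack(">I", 2) + struct.pack(">II", 100, 200) + struct.pack(">II", 300, 50)
frames = parse_library_frames(sample)
```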
- the correspondence relationship information includes identifier information of a reference frame.
- the relevant description of the syntax elements in the file format based on the DASH protocol is as follows:
- Flag 0x01: indicates that the knowledge base frame information corresponding to the segment is described in sidx.
- Library_frame_count the number of knowledge base frames to be referenced by the media segment
- Library_frame_id ID of the knowledge base frame.
- the file format further includes reference frame quantity information corresponding to the data fragment.
- the embodiment of the fourth aspect of the present invention discloses a client, where the client includes the media data acquiring device in the second aspect, and the client is used for acquiring and playing media data.
- the client may be a smart phone, a notebook computer, a desktop computer, a television, and the like.
- An embodiment of the fifth aspect of the present invention discloses a server for making or storing a packaged media file according to the third aspect of the embodiment.
- An embodiment of the sixth aspect of the present invention discloses a method for playing media data.
- the method includes: obtaining a reference frame and a data slice of the media data according to any of the foregoing embodiments, and decoding the data slice according to the reference frame.
- a data fragment includes multiple video image frames, and the index fragment includes correspondence information between the video image frames and the reference frame; decoding the data fragment according to the reference frame includes: decoding the video image frames according to the reference frame and the correspondence information between the video image frames and the reference frame.
- one data fragment includes multiple video image frames
- the media presentation description (MPD) includes corresponding information of the video image frame and the reference frame
- decoding the data fragment according to the reference frame includes: decoding the video image frame according to the reference frame and the corresponding information of the video image frame and the reference frame.
- the corresponding information of the video image frame and the reference frame includes a byte range of the reference frame corresponding to the video image frame.
- the corresponding information of the video image frame and the reference frame includes reference frame identification information corresponding to the video image frame.
- FIG. 1 is a schematic diagram of media data requested by a client for different media representations.
- FIG. 2 is a schematic diagram of the hierarchical data model of a media presentation description (MPD) in the Dynamic Adaptive Streaming over HTTP (DASH) standard.
- FIG. 3 is another schematic diagram of the data hierarchical structure of the MPD in the DASH standard.
- FIG. 4 is a schematic diagram of a media representation corresponding to separate fragment storage.
- Figure 5 is a schematic diagram showing a media representation of a corresponding slice stored in a file.
- FIG. 6 is a schematic diagram of random access points and random access segments in video coding.
- FIG. 7 is a schematic diagram of a data reference relationship in a video encoding based on a knowledge base.
- FIG. 8 is a schematic diagram of a storage manner of a reference frame according to an embodiment of the present invention.
- FIG. 9 is a schematic diagram of another storage manner of a reference frame according to an embodiment of the present invention.
- FIG. 10 is a schematic diagram of another storage manner of a reference frame according to an embodiment of the present invention.
- FIG. 11 is a flowchart of a method for acquiring media data according to an embodiment of the present invention.
- FIG. 12 is a schematic structural diagram of an apparatus for acquiring media data according to an embodiment of the present invention.
- the reference relationship between code streams is described in the Media Presentation Description (MPD).
- the attribute dependencyId indicates the identity (ID) of another representation that must be relied upon when decoding or rendering the data of this representation. Every representation in the MPD has a separate ID.
- when the client requests segment data of a representation that has the dependencyId attribute, it also needs to obtain the corresponding segments of the representation it depends on.
- the segments of the different representations correspond one-to-one in time.
- the client can obtain the time information of a segment from the segment information described in the MPD, and can thereby obtain the corresponding segment of the depended-on representation.
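Reading the dependencyId relationship out of an MPD can be sketched with the Python standard library. The sample MPD below is invented for illustration (the representation ids `lib` and `video` and the bandwidth values are assumptions); only the `dependencyId` attribute and the DASH MPD namespace are taken from the standard.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

MPD_NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical two-representation MPD; other levels of detail are omitted.
SAMPLE_MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet>
      <Representation id="lib" bandwidth="64000"/>
      <Representation id="video" bandwidth="512000" dependencyId="lib"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def representation_dependencies(mpd_xml: str) -> Dict[str, List[str]]:
    """Map each representation id to the ids it depends on."""
    root = ET.fromstring(mpd_xml)
    result: Dict[str, List[str]] = {}
    for rep in root.iterfind(".//dash:Representation", MPD_NS):
        dep = rep.get("dependencyId")
        # dependencyId is a whitespace-separated list of representation ids.
        result[rep.get("id")] = dep.split() if dep else []
    return result


deps = representation_dependencies(SAMPLE_MPD)
```

A client that selects `video` would see from `deps` that it must also fetch the time-aligned segments of `lib`.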
- the URL of the segment is described by describing an index segment.
- the specific syntax of the index segment is, for example, the sidx box in FIG. 5; the URL information of the index segment is described by the indexRange attribute; the syntax format of the index segment is described in ISO/IEC 14496-12 as follows:
- reference_ID: the ID of the code stream
- timescale: the time unit
- earliest_presentation_time: the earliest presentation time of the code stream described in the sidx box, in timescale units;
- first_offset: the starting offset of the first segment after the sidx box
- reference_count: the number of segments described in the sidx box
- reference_type: 1 indicates that the segment is an index segment; 0 indicates that the segment is media content;
- referenced_size: the size of the segment
- subsegment_duration: the duration of the segment, in timescale units
- starts_with_SAP: whether the segment starts with a stream access point (SAP)
- SAP_delta_time: the earliest presentation time of the first stream access point
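The fields listed above can be read from a sidx box with a few `struct` calls. The sketch follows the standard ISO/IEC 14496-12 layout (without the extra library_frame fields this document proposes) and assumes the 8-byte box header (size and type) has already been stripped; the function name and the sample values are illustrative.

```python
import struct
from typing import Any, Dict


def parse_sidx(body: bytes) -> Dict[str, Any]:
    """Parse the body of a sidx box (bytes after the size/type header)."""
    version = body[0]
    pos = 4                                   # skip version (1 byte) + flags (3 bytes)
    reference_id, timescale = struct.unpack_from(">II", body, pos)
    pos += 8
    if version == 0:                          # 32-bit time and offset fields
        earliest, first_offset = struct.unpack_from(">II", body, pos)
        pos += 8
    else:                                     # 64-bit time and offset fields
        earliest, first_offset = struct.unpack_from(">QQ", body, pos)
        pos += 16
    _reserved, reference_count = struct.unpack_from(">HH", body, pos)
    pos += 4
    references = []
    for _ in range(reference_count):
        word1, duration, word2 = struct.unpack_from(">III", body, pos)
        pos += 12
        references.append({
            "reference_type": word1 >> 31,            # top bit
            "referenced_size": word1 & 0x7FFFFFFF,    # low 31 bits
            "subsegment_duration": duration,
            "starts_with_SAP": word2 >> 31,
            "SAP_type": (word2 >> 28) & 0x7,
            "SAP_delta_time": word2 & 0x0FFFFFFF,
        })
    return {
        "reference_ID": reference_id,
        "timescale": timescale,
        "earliest_presentation_time": earliest,
        "first_offset": first_offset,
        "references": references,
    }


sample = (
    struct.pack(">I", 0)                           # version 0, flags 0
    + struct.pack(">II", 1, 1000)                  # reference_ID, timescale
    + struct.pack(">II", 0, 0)                     # earliest_presentation_time, first_offset
    + struct.pack(">HH", 0, 2)                     # reserved, reference_count
    + struct.pack(">III", 500, 2000, 0x90000000)   # media segment, 500 bytes, starts with SAP type 1
    + struct.pack(">III", 300, 2000, 0x90000000)
)
sidx = parse_sidx(sample)
```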
- the client receives the MPD, and obtains the dependency information of the representation and the information of the index segment after parsing;
- after determining the representation to request, the client constructs the URL for requesting the index segment according to the indexRange information in the MPD, such as http://example.com/video-512k.mp4/0-4332, and then requests the index segment from that URL;
- the client obtains the index segment, parses the sidx box information in the index segment, obtains the segment information, constructs the segment URL according to the segment information, and requests the segment according to the constructed segment URL;
- the client requests a segment from the server, and the corresponding URLs are http://example.com/video-512k.mp4/10000-10500 and http://example.com/video-768k.mp4/9000-9400;
- the client receives the segment sent by the server.
- an embodiment of the present invention discloses a method for acquiring media data, where the method includes:
- S101 Acquire a media presentation description file, where the media presentation description file includes index fragmentation information.
- S104 Parse the index fragment to obtain data fragmentation information.
- the index segment includes the reference frame (knowledge base frame) information corresponding to the data segment; the index segment may be used in a scenario where the user plays video, or in other scenarios. In this case, the data segments of a media representation may be stored in one file or in different files.
- Flag 0x01: indicates that the reference frame information corresponding to the segment is described in the sidx box;
- Library_frame_count the number of reference frames required by the segment
- Library_frame_offset the first byte offset of the reference frame in the stored stream; in an embodiment of the invention, the byte offset may be an absolute offset or a relative offset with respect to a certain slice;
- Library_frame_size The number of bytes of the reference frame.
- the client obtains the MPD file, parses the MPD, and obtains the indexRange information.
- the client constructs the URL of the index segment according to the indexRange information, and sends a request for the index segment to the server.
- the client parses the sidx box and, within it, the information of the i-th segment, where i ranges from 1 to reference_count.
- the client obtains the size information of the i-th segment by parsing the information of the i-th segment.
- the segment is stored continuously in the file, so if the size information of the segment is obtained, the byteRange information of the segment can be derived, thereby constructing the segment URL.
- for example, if the total size of all segments before the i-th segment is 20000 and the size of the i-th segment is 500,
- then the byteRange information corresponding to the i-th segment is "20000-20499", and the URL of the segment is http://example.com/example.mp4/20000-20499.
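Because the segments are stored back to back, each segment's byte range follows from a running sum of the sizes. The sketch below reproduces the "20000-20499" example; the function names and the sizes of the two earlier segments (12000 and 8000, which add up to 20000) are assumptions chosen for illustration.

```python
from typing import List, Tuple


def segment_byte_ranges(first_byte: int, sizes: List[int]) -> List[Tuple[int, int]]:
    """Inclusive (start, end) byte range of each contiguously stored segment."""
    ranges = []
    start = first_byte
    for size in sizes:
        ranges.append((start, start + size - 1))   # the end byte is inclusive
        start += size
    return ranges


def segment_url(base_url: str, byte_range: Tuple[int, int]) -> str:
    # URL shape used in this document: the byte range is appended to the file URL.
    return f"{base_url}/{byte_range[0]}-{byte_range[1]}"


ranges = segment_byte_ranges(0, [12000, 8000, 500])
url = segment_url("http://example.com/example.mp4", ranges[2])
```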
- the client obtains the number of reference frames (library_frame_count) required by the i-th segment; if the value of library_frame_count is 0, the segment does not need a reference frame for decoding; if the value of library_frame_count is greater than 0, it indicates the number of reference frames needed to decode the segment.
- the client parses the offset value and the size value of the reference frame, and calculates the byteRange of the reference frame by using the offset value and the size value, thereby constructing a URL required for requesting the reference frame.
- for example, if the offset of the first byte of the reference frame in the storage file is 100 and the size of the frame is 200, the byteRange in the URL is "100-299", and the URL of the reference frame is http://example.com/example2.mp4/100-299;
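The offset-and-size arithmetic for a reference frame can be written out as below. The helper names are hypothetical. The last function shows the standard HTTP `Range` request header as an alternative way to carry the same range; that is an addition for illustration and not the URL form used in this document.

```python
from typing import Dict


def frame_byte_range(offset: int, size: int) -> str:
    """Inclusive byte range of a reference frame from its offset and size."""
    if size <= 0:
        raise ValueError("size must be positive")
    return f"{offset}-{offset + size - 1}"


def frame_url(base_url: str, offset: int, size: int) -> str:
    # URL form used in this document: the byte range is appended to the file URL.
    return f"{base_url}/{frame_byte_range(offset, size)}"


def frame_range_header(offset: int, size: int) -> Dict[str, str]:
    # Standard HTTP alternative: send the range in a request header instead.
    return {"Range": f"bytes={frame_byte_range(offset, size)}"}


byte_range = frame_byte_range(100, 200)
url = frame_url("http://example.com/example2.mp4", 100, 200)
header = frame_range_header(100, 200)
```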
- the solution of this embodiment is more suitable for use in a video-on-demand scenario.
- the code stream of the reference frame may be stored in a file.
- a single reference frame may be requested by its byteRange.
- the code stream of the reference frame may be stored in the same file as the code stream of the non-reference frames, or may be stored separately in its own file; if it is stored in the code stream file of the non-reference frames, the existing MPD may be used as is, or an attribute related to the reference frame may be added to the existing MPD. The attribute describes the byteRange position of the reference frame in the storage file, and the information may be described in the representation layer, in the SegmentBase property;
- the corresponding reference relationship between the reference frame and the segment may be separately described in other boxes than the sidx, the sidx is described in the prior art; the independent box is used to describe the reference relationship, and the reference relationship may not be destroyed.
- the grammatic structure of the existing sidx is as follows:
- library_frame_count: the number of reference frames required by the segment.
- library_frame_offset: the offset of the first byte of the reference frame in the stored stream; in an embodiment of the invention, the byte offset may be an absolute offset or an offset relative to a certain segment.
- library_frame_size: the number of bytes of the reference frame.
- the related attribute of the reference frames is the storage information of the reference-frame code stream; for example, for a 3-minute video in which the non-reference-frame code stream occupies 10000 bytes and the 5 reference frames occupy 500 bytes in total, the reference-frame data follows the first 10000 bytes of storage, and the related attribute of the reference frames is "10000-10499";
- each reference frame can also be found directly through the information in sidx if the MPD is not modified.
- the MPD may adopt an existing MPD scheme, and the dependency relationship between the representations is described by the dependencyId attribute in the representation layer.
- LibarayFrame represents the attribute element of the reference frame
- range represents the storage range attribute of the reference frame
- by parsing the sidx, the client can obtain the relationship between the segments and the reference frames involved in the on-demand program; in an embodiment of the present invention, the client can maintain a storage file that records
- the reference frame information corresponding to each segment; after the client has requested a reference frame from the server, if later segments still need it, the reference frame can stay on the client and need not be requested from the server again when it is used, which saves transmission bandwidth.
- the storage file can hold the IDs of the reference frames already received or the URLs used to request them.
- a second embodiment of the present invention provides a method for acquiring media data.
- in this embodiment the index fragment includes the reference frame information corresponding to the data fragment, expressed by means of identification information:
- Flag = 0x01: indicates that the reference frame information corresponding to the segment is described in the sidx.
- library_frame_count: the number of reference frames required by the segment.
- library_frame_id: the ID of the reference frame.
- the reference relationship between the reference frames and the segments may instead be described in a separate box other than the sidx, with the sidx kept as in the prior art; describing the relationship in an independent box leaves the existing sidx syntax structure intact.
- library_frame_count: the number of reference frames required by the segment.
- the client obtains the MPD file and parses out the URL construction template of the reference frame, which describes how to construct the URL of a reference frame.
- the template contains the ID parameter of the reference frame, written as $Number$ in the template.
- the URL template specified in the existing MPD can be used directly.
- the client requests the index fragment according to the index fragment information in the MPD.
- the client parses the received index fragment (sidx box);
- the client obtains the number of reference frames required by the segment (library_frame_count); if the value is 0, the segment needs no reference frame for decoding; if the value is greater than 0, it is the number of reference frames required to decode the segment;
- the client parses the ID of the reference frame and constructs the URL of the reference frame from that ID and the reference frame URL template in the MPD; for example, if the template is http://example.com/example.mp4/$Number$.ref, the URL of the reference frame with ID=4 is http://example.com/example.mp4/4.ref.
- the method for obtaining the data fragmentation by the client can refer to the provisions in the existing DASH standard, and details are not described herein again.
- the method for obtaining media data is suited to live video broadcast scenarios.
- each reference frame is encoded and stored as a separate file.
- the name of each file contains the ID parameter carried in the sidx, and the MPD includes
- the template information SegmentTemplate describing the URL of the reference frame, which is an existing attribute of the representation; the dependency between the reference-frame code stream and the non-reference-frame code stream is described by the dependencyId attribute in DASH.
- in the embodiments above, whether decoding the frames of a segment requires reference frames is determined by whether library_frame_count is zero; alternatively, an identifier can be added to the sidx for this purpose: if the identifier is 0, decoding the segment requires no reference frame; if the identifier is not 0, decoding the segment requires reference frames.
- the client parses this identifier accordingly: if it is 0, no reference frame is needed to decode the segment; if it is not 0, the client goes on to parse the number of reference frames and the information of each reference frame,
- which is the same as described in the above embodiments.
- Another embodiment of the present invention is an extended embodiment of the above embodiment, which can be used with the above embodiment.
- the above embodiment describes the relationship between the reference frame and the segment, but the relationship between the frame and the reference frame in the specific segment needs to be obtained by parsing the frame information in the segment.
- in the client, a reference frame must be decoded before the video frames of the segment that need it, and the decoded reference frame is kept in the decoder's decoded picture management; storage space for decoding reference frames therefore has to be requested in advance when the decoder is initialized; this embodiment provides ways of carrying the number of reference frames required for decoding the frames in a segment;
- the index fragment in the first embodiment and the second embodiment carries the information about the number of reference frames required for frame decoding in the segment; for example, adding the attribute maxLibframeNumber to the sidx;
- maxLibframeNumber The maximum number of reference frames required for segment decoding.
- the MPD in the foregoing Embodiment 1 and Embodiment 2 carries the information about the number of reference frames required for frame decoding in the segment; for example, adding an attribute maxLibframeNumber to the MPD;
- maxLibframeNumber The maximum number of reference frames required for segment decoding.
- after the client obtains the maxLibframeNumber information from the sidx or the MPD, it passes the information to the decoder; the decoder requests and manages storage space according to maxLibframeNumber.
- the reference frame can be stored in the client. If the subsequent segment also needs to use the reference frame, then there is no need to re-request the server.
- parsing the index fragment yields the number of knowledge base frames (library_frame_count) referenced by the i-th segment; if the value is 0, the segment needs no reference frame for decoding; if the value is greater than 0, it is the number of reference frames required to decode the segment.
- the offset and the number of bytes of the reference frame are parsed and used to determine whether the client has already saved that reference frame;
- in one implementation, this is determined
- by comparing them with the offsets and byte counts of the reference frames already stored.
- if the reference frame is present, the client obtains it locally; otherwise it constructs the URL of the reference frame and requests the knowledge base frame data from the server. In a possible implementation, the URL of the reference frame may be constructed first and used to determine whether the reference frame has already been saved locally.
- the reference relationship between reference frames and segments covers not only which knowledge base frames a segment references, but also which image frame (sample) in the segment references each knowledge base frame; four description forms are given here, matching the description forms of the above embodiments;
- a sampleIndex syntax element is added, indicating that the knowledge base frame currently described is referenced by the sampleIndex-th image frame (sample) in the segment;
- after obtaining the segment and the knowledge base frame data, the client uses sampleIndex to determine before which sample of the segment the corresponding knowledge base frame must be sent to the decoder. For example, if sampleIndex is 50, the knowledge base frame must be sent to the decoder before the 50th sample of the segment;
- referenced_Times: the number of times the corresponding knowledge base frame is referenced.
- sampleIndex: the index of a sample in the segment that references the corresponding knowledge base frame.
- after parsing this information, the client can determine before which samples of the segment the corresponding knowledge base frame must be sent to the decoder.
- the reference relationship between the referenced knowledge base frames and the segments is described in an initialization segment: a uuid box (Universal Unique IDentifiers), which is defined in ISO/IEC 14496-12, is added to the initialization segment.
- the corresponding reference relationship between the reference knowledge base frame and the segment is carried in the uuid box; the specific syntax is as follows:
- reference_count, library_frame_count, library_frame_size and the previous embodiment have the same semantics.
- libUUIDsize: the total number of bytes of the knowledge base frames in the current representation's code stream.
- library_frame_offset: the offset of a single knowledge base frame within the whole knowledge base data.
- the library_frame_offset of a single knowledge base frame = a fixed offset + the sum of the byte counts of the knowledge base frames stored before it, where the fixed offset can be 0 or another integer, such as 16.
- the client constructs the URL of the initialization segment from the range attribute of the initialization element in the MPD, for example http://example/1.mp4/0-1000, and requests the initialization segment;
- after obtaining the initialization segment, the client parses the uuid box to get the reference relationship between the referenced knowledge base frames and the segments, together with the position of the knowledge base frames in the representation's code stream, and obtains the knowledge base frames from that position information;
- as in the foregoing embodiments, the client obtains the segment information by parsing the index fragment, constructs the segment request URL, obtains the segment data, sends the knowledge base frames and the frames in the segment to the decoder for decoding, and then renders them.
- the syntax of the MPD and the index fragment is not modified, so that the representation code stream can be backward compatible with the prior art, and in the actual network transmission, the compatibility change of the existing CDN is avoided.
- the information of the referenced knowledge base frames may be described in the MPD, in an adaptation set (AdaptationSet) element or a representation element; for example, a reference-frame description is added to the SegmentTemplate element of the AdaptationSet or representation.
- the referenceFrame describes the URL construction method of the knowledge base frame.
- as long as the MPD is not updated, the knowledge base frame information it carries covers the knowledge base frames referenced by all the segments described in the current MPD.
- the processing after obtaining the knowledge base frame is the same as in the other embodiments of the present invention.
- this implementation is better suited to live broadcast applications, where the reference relationship between the knowledge base frames and the segments described in the MPD can be kept current by continuously updating the MPD.
- an embodiment of the present invention discloses a media data acquiring apparatus 20, where the apparatus 20 includes: an obtaining module 21, configured to acquire a media presentation description file, where the media presentation description file includes index fragmentation information;
- the obtaining module 21 is further configured to obtain an index fragment according to the index fragmentation information;
- the parsing module 22 is configured to parse the index fragment to obtain reference frame information corresponding to the data fragment; the parsing module 22 is further configured to parse the index fragment to obtain the data fragmentation information.
- the obtaining module 21 is further configured to obtain the reference frame according to the reference frame information corresponding to the data fragment.
- the obtaining module 21 is further configured to obtain the data fragment according to the data fragmentation information.
- the acquisition module can be a receiver.
- the media data obtaining device 20 can be applied to a variety of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio phones, video teleconferencing devices, and the like.
- These devices can decompress and play video data compressed according to standards such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and H.265.
- the method for obtaining data fragments in the foregoing implementation of the present invention may be in any one of the existing DASH standards, and the embodiments of the present invention are not limited thereto, and are not described herein.
- with knowledge base coding, reference frames (knowledge base frames) are used for encoding, the code stream of the non-reference frames depends on the code stream of the reference frames, and different segments of the same non-reference-frame code stream may reference the same reference frame data when decoding.
- for these characteristics of code streams encoded with knowledge base technology, the embodiments propose a processing method based on DASH that supports knowledge base coding with small syntax changes within the framework of the DASH standard protocol,
- so that the client can flexibly switch and play code streams without wasting bandwidth.
- this content is based on the same concept as the method embodiments of the present invention;
- see the description in the method embodiments for details, which are not repeated here.
- the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Abstract
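A method and an apparatus for acquiring media data, relating to the field of media transmission. The method includes: acquiring a media presentation description file that includes index segment information; obtaining an index segment according to the index segment information; parsing the index segment to obtain reference frame information corresponding to a data segment; parsing the index segment to obtain data segment information; obtaining the reference frame according to the reference frame information corresponding to the data segment; and obtaining the data segment according to the data segment information. A DASH-based method is proposed for the characteristics of code streams encoded with knowledge base technology; within the framework of the DASH standard protocol, it supports knowledge base coding with small syntax changes, so that the client can switch and play code streams flexibly without wasting bandwidth.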
Description
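The present invention relates to the field of media transmission, and in particular to a method and an apparatus for acquiring media data.
Streaming media refers to a technology and process in which a series of media data is compressed and packaged, then sent over the network in segments so that the media data is transmitted on the network.
In November 2011, the Moving Picture Experts Group (MPEG) approved the Dynamic Adaptive Streaming over HTTP (DASH) standard, a technical specification for transmitting media streams over the HTTP protocol. The DASH specification consists of two main parts: the Media Presentation Description (MPD) and the media file format.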
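DASH media file format
In DASH, the server prepares multiple versions of the code stream for the same program content. Each version is called a media representation in the DASH standard, and different versions may differ in coding parameters such as bit rate and resolution. Each code stream is divided into many small files, each of which is called a segment. While requesting media segment data, the client can switch between media representations. As shown in FIG. 1, the server prepares three media representations rep1, rep2, and rep3 for a movie: rep1 is high-definition video at 4 Mbps (megabits per second), rep2 is standard-definition video at 2 Mbps, and rep3 is standard-definition video at 1 Mbps. The segments shaded in FIG. 1 are the ones the client requests for playback: the first three requested segments belong to rep3, the client then switches to rep2 for the fourth segment, and then switches to rep1 for the fifth and sixth segments, and so on. The segments of each media representation can be stored end to end in one file or stored independently as individual small files. A segment can be encapsulated in the format of ISO/IEC 14496-12 (ISO BMFF, Base Media File Format) or in the format of ISO/IEC 13818-1 (MPEG-2 TS).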
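DASH media presentation description
In the DASH standard, the media presentation description is called the MPD. The MPD is an XML file whose information is described hierarchically, as shown in FIG. 2 and FIG. 3; information at an upper level is fully inherited by the next level. The file describes media metadata that lets the client learn about the media content on the server and use it to construct the HTTP URLs for requesting segments.
In the DASH standard, a media presentation is a collection of structured data that presents media content; a media presentation description is a file that describes a media presentation in a normalized way and is used to provide streaming service; a period is one of a group of consecutive periods that make up the whole media presentation, and periods are contiguous and non-overlapping; a representation is a structured data set encapsulating one or more media components (individually encoded media types such as audio or video) with descriptive metadata; an adaptation set (AdaptationSet) is a set of interchangeable encoded versions of the same media content; a subset is a combination of adaptation sets such that, when the player plays all of them, the corresponding media content is obtained; segment information is the media unit referenced by an HTTP uniform resource locator in the media presentation description and describes the segments of the media data. Segments of the media data may be stored in one file or stored separately; in one possible form, the MPD itself stores segments of the media data.
In the DASH media file format, the segments of a media representation are stored in one of two ways: separately and independently, as shown in FIG. 4, or in a single file, as shown in FIG. 5. The MPD correspondingly describes segment URL information in two ways. When segments are stored independently, the MPD describes them through a template or a list; in one form, each segment is preceded by an index segment that describes the segment that follows. When the segments are stored in one file, the MPD describes one index segment (whose syntax is the sidx box shown in FIG. 5) that describes multiple segments; the index segment gives each segment's byte offset within the file, its size, its duration, and similar information.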
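Introduction to knowledge base coding
In conventional video coding, to let the encoded video file support random access, the file is divided by random access points into multiple video clips that each support random access, called random access segments for short. FIG. 6 shows random access points, non-random access points, and random access segments in the common IPPP coding structure. A random access segment contains one or more pictures; in video coding, a random access point is usually followed by at least one non-random access point. Random access segments are encoded independently of one another, so the encoded video stream supports random access as well as fast-forward and rewind playback. However, precisely because the video is split into independently encoded segments, the mutual information between random access segments is not fully exploited, which limits coding efficiency.
To improve coding efficiency, an existing patent application (Chinese patent application No. 201510150090.7, filed March 31, 2015) provides the video encoder with a knowledge base, giving it a long-term "memory". When encoding or decoding a picture in the video (especially a random access point picture), a picture with content similar to the current picture can be selected from the knowledge base as a reference picture, so that the current picture is encoded or decoded with inter prediction, as shown in FIG. 7. The pictures in the knowledge base may be reconstructed pictures of some pictures in the video. By referencing pictures in the knowledge base, the correlation between different random access segments is exploited: for example, two random access point pictures with similar scene content can both reference the same knowledge base picture and be encoded as inter-coded frames (P or B frames), instead of each being encoded as an intra-coded frame (I frame) in the conventional way. This knowledge-base-based coding method extracts similar content that appears repeatedly in the video into the knowledge base and improves coding efficiency by referencing it. A random access point picture can be encoded or decoded with reference to a knowledge base picture or with conventional intra coding; it does not depend on other pictures in the video sequence, so the random access segments remain independent of one another.
Encoding video with a knowledge base produces a knowledge base code stream and a non-knowledge-base code stream. The non-knowledge-base code stream must reference the knowledge base code stream for decoding, and several non-consecutive frames in the non-knowledge-base stream may reference the same knowledge base frame; as shown in FIG. 7, scene 1 and scene 3 both reference knowledge base frame 1 when encoded. When the non-knowledge-base code stream is segmented under DASH, if scene 1 and scene 3 fall into two different segments, the client needs the data of knowledge base frame 1 before decoding either scene. In other words, multiple segments correspond to the same knowledge base frame, and knowledge base frames and segments have no one-to-one correspondence in time, so the reference relationship cannot be derived from timing. The prior art cannot support transmission of code streams in which the reference relationship between segments is many-to-one: existing DASH technology has no system-layer scheme for knowledge base frames, and no existing system-layer technology can be applied directly to a reference coding scheme such as the knowledge base. With no system-layer protocol available, this efficient coding method cannot be matched to existing transmission mechanisms, which limits its application.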
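Summary of the invention
An embodiment of the present invention provides a method for acquiring media data, the method including: acquiring a media presentation description file, where the media presentation description file includes index segment information; obtaining an index segment according to the index segment information; parsing the index segment to obtain data segment information and reference frame information, where the data segment information describes a data segment and the reference frame information corresponds to the data segment; and obtaining the reference frame according to the reference frame information.
The media presentation description file may be structured like the MPD (media presentation description) in the Dynamic Adaptive Streaming over HTTP (DASH) standard specified by the Moving Picture Experts Group (MPEG), or syntax elements describing the attributes of the related knowledge base file may be added to that structure as appropriate.
In the embodiments of the present invention, the index segment can be obtained as in existing DASH schemes. In one possible form, the MPD contains the URL of the index segment and the client requests the index segment from that URL; in another, the MPD stores the index segment directly; in yet another, the MPD stores a URL template and attributes of the index segment (for example, a segment identifier or storage range), and the client builds the URL for requesting the index segment from the template and those attributes.
In the embodiments of the present invention, multiple reference frames may be stored in one file or in different files.
In the embodiments of the present invention, the reference frames may be stored in the same file as the data segments or stored separately. If the reference frames are stored in the data segment file, the media presentation description file can use the DASH MPD as-is, or syntax elements describing reference frame attributes can be added to the MPD, for example among the SegmentBase attributes at the representation level. If the reference frames and data segments are stored separately, the media presentation description file can use the DASH MPD, with the dependencyId attribute at the representation level describing the relationship between the representation holding the reference frames and the representation holding the data segments.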
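In one embodiment, an MPD sample that describes the byteRange storage position, within the code stream file, of the knowledge base (reference frame) code stream referenced by the non-knowledge-base code stream is as follows, with the other context levels of the MPD omitted;
LibarayFrame denotes the attribute element of the knowledge base, and range denotes the storage range attribute within the knowledge base file.
With the method for acquiring media data according to the embodiments of the present invention, the reference frame information corresponding to a data segment is obtained by parsing the index segment, so the client can conveniently obtain the relationship between data segments and reference frames.
In a possible implementation, the reference frame information includes the byte offset and the number of bytes of the reference frame; correspondingly, obtaining the reference frame according to the reference frame information includes: obtaining the reference frame according to its byte offset and number of bytes.
The solution of this embodiment is well suited to video-on-demand scenarios: the code stream of the reference frames (knowledge base frames) can be stored in one file, and the client can request a single reference frame by byte range.
In the embodiments of the present invention, by parsing the index segment the client can obtain the relationship between the segments and the reference frames for the whole on-demand program. After requesting a reference frame from the server, if that frame will be referenced by other segments later, the client can keep it, so it need not be requested from the server again, which saves transmission bandwidth.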
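In a possible implementation, the media presentation description file includes a uniform resource locator (URL) template, and obtaining the reference frame according to its byte offset and number of bytes includes: obtaining the byte range of the reference frame from its byte offset and number of bytes; obtaining the URL of the reference frame from the byte range and the URL template; and obtaining the reference frame according to its URL.
In a possible implementation, the media presentation description file includes storage location information of the reference frame; correspondingly, obtaining the URL of the reference frame from the byte range and the URL template includes: obtaining the URL of the reference frame from the storage location information of the reference frame, the byte range of the reference frame, and the URL template.
In a possible implementation, the storage location information of the reference frame includes the storage range of the reference frame; or
the storage location information of the reference frame includes identification information of the file storing the reference frame.
In a possible implementation, the reference frame information includes identification information of the reference frame; correspondingly, obtaining the reference frame according to the reference frame information includes: obtaining the reference frame according to its identification information.
This embodiment can be used in live video scenarios: each reference frame is stored as a separate file, and each file corresponds to the identification information of one reference frame.
In a possible implementation, the media presentation description file includes a uniform resource locator (URL) template, and obtaining the reference frame according to its identification information includes: obtaining the URL of the reference frame from the identification information and the URL template; and obtaining the reference frame according to its URL.
This embodiment can use the SegmentTemplate template information in the MPD, an existing attribute at the representation level; the dependency between the reference-frame code stream and the data-segment code stream is described with the existing DASH attribute dependencyId.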
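In a possible implementation, the method further includes: parsing the index segment to obtain the number of reference frames corresponding to the data segment.
In the embodiments of the present invention, when the client requests multiple data segments: if the number of reference frames for a data segment is 0, the data segment needs no reference frame; if it is 1, the corresponding reference frame can be obtained as in the above embodiments; if it is greater than 1, each reference frame can be obtained as in the above embodiments, repeating the steps until all reference frames for the data segment have been obtained.
In the embodiments of the present invention, after obtaining the reference frames and the data segment, the client decodes the data segment using the reference frames and plays the media content.
The embodiments of the present invention describe the correspondence between reference frames and segments, but the reference relationship between the frames in a segment and the reference frames has to be obtained by parsing the frame information in the segment. In the client, however, a reference frame must first be sent to the decoder, decoded, and stored there, so storage space has to be requested in advance, at decoder initialization, for the knowledge base to decode successfully. This embodiment provides ways of carrying the number of reference frames needed to decode the frames in a segment;
Carrying mode 1:
The number of reference frames needed to decode the frames in a segment is carried in the index segment; for example, an attribute maxLibframeNumber is added to the sidx;
Carrying mode 2:
The number of reference frames needed to decode the frames in a segment is carried in the MPD; for example, an attribute maxLibframeNumber is added to the MPD;
maxLibframeNumber: the maximum number of reference frames referenced when decoding a segment.
After the client obtains maxLibframeNumber from the index segment or the MPD, it passes the value to the decoder; the decoder requests and manages storage space according to maxLibframeNumber.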
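An embodiment of the second aspect of the present invention discloses an apparatus for acquiring media data, the apparatus including: an obtaining module, configured to acquire a media presentation description file that includes index segment information, and further configured to obtain an index segment according to the index segment information; and a parsing module, configured to parse the index segment to obtain reference frame information and data segment information, where the data segment information describes a data segment and the reference frame information corresponds to the data segment; the obtaining module is further configured to obtain the reference frame according to the reference frame information.
In a possible implementation, the reference frame information includes the byte offset and the number of bytes of the reference frame; the obtaining module is configured to obtain the reference frame according to its byte offset and number of bytes.
In a possible implementation, the media presentation description file includes a uniform resource locator (URL) template, and the obtaining module is configured to: obtain the byte range of the reference frame from its byte offset and number of bytes; obtain the URL of the reference frame from the byte range and the URL template; and obtain the reference frame according to its URL.
In a possible implementation, the media presentation description file includes storage location information of the reference frame; the obtaining module is configured to obtain the URL of the reference frame from the storage location information, the byte range of the reference frame, and the URL template.
In a possible implementation, the storage location information of the reference frame includes the storage range of the reference frame, or identification information of the file storing the reference frame.
In a possible implementation, the reference frame information includes identification information of the reference frame; the obtaining module is configured to obtain the reference frame according to its identification information.
In a possible implementation, the media presentation description file includes a uniform resource locator (URL) template, and the obtaining module is configured to: obtain the URL of the reference frame from the identification information of the reference frame and the URL template; and obtain the reference frame according to its URL.
In a possible implementation, the parsing module is further configured to parse the index segment to obtain the number of reference frames corresponding to the data segment.
It can be understood that, for implementations of the apparatus embodiments, reference may be made to the corresponding steps of the method embodiments; details are not repeated here.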
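An embodiment of the third aspect of the present invention discloses a file format for media data, the file format including correspondence information between reference frames and data segments.
The file format disclosed in the embodiments is applied within the DASH standard protocol framework with a few added syntax elements, so that by parsing the file format the client obtains the relationship between reference frames and data segments.
A file in the file format of the embodiments may be the index segment of the above implementations.
In a possible implementation, the file format further includes data segment information.
In a possible implementation, the correspondence information includes the byte offset and the number of bytes of the reference frame.
In one implementation, the syntax elements in the DASH-based file format are described as follows:
The syntax elements have the following meanings:
Flag=0x01: indicates that the sidx box describes the knowledge base frame information corresponding to the segment;
In the existing DASH specification the value of flag is 0; the embodiments assign a special value to the flag field to indicate that knowledge base syntax elements follow. It can be understood that flag=0x01 is only an example; an implementation may use any other non-zero value;
library_frame_count: the number of knowledge base frames the segment references;
library_frame_offset: the offset of the first byte of the knowledge base frame in the stored stream; in the embodiments of the present invention, the byte offset may be absolute or relative to a certain segment, and this syntax element may be 32 or 64 bits wide;
library_frame_size: the size of the knowledge base frame in bytes.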
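In a possible implementation, the correspondence information includes identification information of the reference frame.
In one implementation, the syntax elements in the DASH-based file format are described as follows:
Flag=0x01: indicates that the sidx describes the knowledge base frame information corresponding to the segment;
library_frame_count: the number of knowledge base frames referenced by the media segment concerned;
library_frame_id: the ID of the knowledge base frame.
In a possible implementation, the file format further includes the number of reference frames corresponding to the data segment.
An embodiment of the fourth aspect of the present invention discloses a client that includes the apparatus for acquiring media data of the second-aspect embodiment; the client is used to acquire and play media data.
In implementations of the present invention, the client may be a smartphone, laptop, desktop computer, television, or similar device.
An embodiment of the fifth aspect of the present invention discloses a server, used to produce or store media files encapsulated according to the third-aspect embodiment.
As can be seen from the above technical solutions, the embodiments of the present invention propose a DASH-based method tailored to the characteristics of code streams encoded with knowledge base technology. Within the framework of the DASH standard protocol, the method supports knowledge base coding with small syntax changes, so that the client can switch and play code streams flexibly without wasting bandwidth.
An embodiment of the sixth aspect of the present invention discloses a method for playing media data, the method including: obtaining the reference frames and data segments of the media data according to any of the preceding embodiments, and decoding the data segments according to the reference frames.
In a possible implementation, a data segment includes multiple video image frames, and the index segment includes correspondence information between video image frames and reference frames; decoding the data segment according to the reference frames includes: decoding the video image frames according to the reference frames and the correspondence information between video image frames and reference frames.
In a possible implementation, a data segment includes multiple video image frames, and the media presentation description (MPD) includes correspondence information between video image frames and reference frames; decoding the data segment according to the reference frames includes: decoding the video image frames according to the reference frames and the correspondence information between video image frames and reference frames.
In a possible implementation, the correspondence information between video image frames and reference frames includes the byte range of the reference frame corresponding to a video image frame.
In a possible implementation, the correspondence information between video image frames and reference frames includes the identification information of the reference frame corresponding to a video image frame.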
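To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a client requesting media data of different media representations.
FIG. 2 is a schematic diagram of the hierarchical data model of the Media Presentation Description (MPD) in the Dynamic Adaptive Streaming over HTTP (DASH) standard.
FIG. 3 is another schematic diagram of the hierarchical data structure of the MPD in the DASH standard.
FIG. 4 is a schematic diagram of the segments of a media representation stored independently.
FIG. 5 is a schematic diagram of the segments of a media representation stored in one file.
FIG. 6 is a schematic diagram of random access points and random access segments in video coding.
FIG. 7 is a schematic diagram of data reference relationships in knowledge-base-based video coding.
FIG. 8 is a schematic diagram of a storage mode of reference frames according to an embodiment of the present invention.
FIG. 9 is a schematic diagram of another storage mode of reference frames according to an embodiment of the present invention.
FIG. 10 is a schematic diagram of another storage mode of reference frames according to an embodiment of the present invention.
FIG. 11 is a flowchart of a method for acquiring media data according to an embodiment of the present invention.
FIG. 12 is a schematic structural diagram of an apparatus for acquiring media data according to an embodiment of the present invention.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.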
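In the Dynamic Adaptive Streaming over HTTP (DASH) specification, reference relationships between code streams are described in the Media Presentation Description (MPD). The representation-level syntax of the MPD has an attribute dependencyId, which gives the identity (ID) of another representation that must be relied on when decoding or presenting the data of this representation; every representation in the MPD has its own ID. When the client requests segment data for a representation that has a dependencyId attribute, it must also fetch the corresponding segments of the representation depended on. Segments of different representations correspond one-to-one in time, and the client can obtain the timing of each segment from the segment information described in the MPD, so it can find the corresponding segment of the representation depended on.
The description of the representations in the MPD is given below (information at levels above the representation is omitted):
This MPD describes segment URLs by describing an index segment, whose syntax is, for example, the sidx box in FIG. 5; the URL information of the index segment is given by the indexRange attribute; the syntax of the index segment is specified in ISO/IEC 14496-12 as follows:
The syntax elements have the following meanings:
reference_ID: the ID of the code stream;
timescale: the time unit;
earliest_presentation_time: the earliest presentation time of the code stream described in the sidx box, in units of timescale;
first_offset: the starting offset of the first segment after the sidx box;
reference_count: the number of segments described in the sidx box;
reference_type: 1 means the segment is an index segment; 0 means the segment is media content;
referenced_size: the size of the segment;
subsegment_duration: the duration of the segment in units of timescale;
starts_with_SAP: the stream access type of the segment;
SAP_delta_time: the earliest presentation time of the first stream access point;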
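The field list above matches the segment index box layout in ISO/IEC 14496-12. As a rough illustration, the following Python sketch parses the body of a standard sidx box. The function name `parse_sidx_payload` and the dictionary keys are illustrative choices, not part of this document, and the additional library_frame_* fields proposed here are not parsed because their exact syntax is not reproduced in this text.

```python
import struct


def parse_sidx_payload(payload: bytes) -> dict:
    """Parse a sidx box body (the bytes after the 8-byte size/type header)."""
    version = payload[0]
    flags = int.from_bytes(payload[1:4], "big")  # 24-bit flags field
    reference_id, timescale = struct.unpack_from(">II", payload, 4)
    pos = 12
    if version == 0:
        # 32-bit earliest_presentation_time and first_offset
        earliest, first_offset = struct.unpack_from(">II", payload, pos)
        pos += 8
    else:
        # 64-bit variants of the same two fields
        earliest, first_offset = struct.unpack_from(">QQ", payload, pos)
        pos += 16
    _reserved, reference_count = struct.unpack_from(">HH", payload, pos)
    pos += 4
    references = []
    for _ in range(reference_count):
        word_a, duration, word_b = struct.unpack_from(">III", payload, pos)
        pos += 12
        references.append({
            "reference_type": word_a >> 31,          # 1 bit
            "referenced_size": word_a & 0x7FFFFFFF,  # 31 bits
            "subsegment_duration": duration,
            "starts_with_SAP": word_b >> 31,         # 1 bit
            "SAP_type": (word_b >> 28) & 0x7,        # 3 bits
            "SAP_delta_time": word_b & 0x0FFFFFFF,   # 28 bits
        })
    return {
        "version": version,
        "flags": flags,
        "reference_ID": reference_id,
        "timescale": timescale,
        "earliest_presentation_time": earliest,
        "first_offset": first_offset,
        "references": references,
    }
```

The returned `referenced_size` values are what a client would accumulate to derive the byte range of each segment.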
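For the file format above, the client processes media data as follows:
The client receives the MPD and parses it to obtain the dependency information of the representations and the index segment information;
The client selects the representation to request according to network bandwidth or other factors (for example, personal preference or display resolution); for example, the client requests the representation with id="tag5";
After deciding which representation to request, the client constructs the URL for requesting the index segment from the indexRange information in the MPD, for example http://example.com/video-512k.mp4/0-4332, and requests the index segment with this URL;
The client obtains the index segment, parses the sidx box in it to get the segment information, constructs the segment URL from that information, and requests the segment with the constructed URL;
When the client needs to request segments of the representation with id="tag6", it similarly requests the index segment of that representation and obtains the segment information;
Using the time point at which the code stream is to be switched (from the representation with id="tag5" to the one with id="tag6"), the client obtains the information of the i-th segment of each of the two representations and determines the URLs of the i-th segments to download, where i is a positive integer such as 2, 3, or 10. For example, if the switch point is the first minute of playback, and the range of the i-th segment of the id="tag5" representation at that time point is 10000-10500, the URL of that segment is http://example.com/video-512k.mp4/10000-10500; if the range of the i-th segment of the id="tag6" representation is 9000-9400, the URL of that segment is http://example.com/video-768k.mp4/9000-9400. When decoding, the tag6 segment depends on the data of the tag5 segment;
The client requests the segments from the server with the URLs http://example.com/video-512k.mp4/10000-10500 and http://example.com/video-768k.mp4/9000-9400;
The client receives the segments sent by the server.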
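The dependency handling in this flow can be sketched in Python as follows. This is only an illustration under assumptions: the `representations` dictionary layout and the function name `urls_for_segment` are hypothetical, and the byte range is appended to the URL path as in the examples of this text (a real DASH client would typically send it in an HTTP Range header).

```python
def urls_for_segment(representations: dict, rep_id: str, index: int) -> list:
    """Return the URLs for segment `index` of `rep_id`, dependencies first.

    `representations` maps a representation id to a dict holding
    "base_url", "ranges" (a list of (first_byte, last_byte) tuples) and an
    optional "dependencyId". This layout is a hypothetical stand-in for a
    parsed MPD plus parsed index segments.
    """
    urls = []
    visited = set()

    def visit(current_id: str) -> None:
        if current_id in visited:
            return
        visited.add(current_id)
        rep = representations[current_id]
        dependency = rep.get("dependencyId")
        if dependency is not None:
            # The representation depended on must be fetched as well.
            visit(dependency)
        first, last = rep["ranges"][index]
        urls.append(f"{rep['base_url']}/{first}-{last}")

    visit(rep_id)
    return urls
```

Because segments of different representations correspond one-to-one in time, the same segment index is used for the representation and for the one it depends on.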
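As shown in FIG. 11, an embodiment of the present invention discloses a method for acquiring media data, the method including:
S101: acquire a media presentation description file, where the media presentation description file includes index segment information;
S102: obtain an index segment according to the index segment information;
S103: parse the index segment to obtain the reference frame information corresponding to a data segment;
S104: parse the index segment to obtain data segment information;
S105: obtain the reference frame according to the reference frame information corresponding to the data segment;
S106: obtain the data segment according to the data segment information.
In an embodiment of the present invention, the index segment includes the reference frame (knowledge base frame) information corresponding to the data segments. The index segment can be used when a user plays video on demand, or in other scenarios; the data segments of one media representation can then be stored in one file or in different files.
The syntax elements have the following meanings (elements identical to those of the preceding embodiments are not described again):
Flag=0x01: indicates that the sidx box describes the reference frame information corresponding to the segment;
In the existing DASH specification the value of flag is 0; the embodiments assign a special value to the flag field to indicate that reference frame syntax elements follow. It can be understood that flag=0x01 is only an example; an implementation may use any other non-zero value;
library_frame_count: the number of reference frames the segment needs;
library_frame_offset: the offset of the first byte of the reference frame in the stored stream; in the embodiments of the present invention, the byte offset may be absolute or relative to a certain segment;
library_frame_size: the number of bytes of the reference frame.
In the embodiments of the present invention, the client obtains the MPD file and parses it to get the indexRange information. From indexRange the client constructs the URL of the index segment and requests the index segment from the server. On receiving it, the client parses the sidx box and reads the information of the i-th segment, where i ranges from 1 to reference_count, to get the size of the i-th segment. Segments are normally stored contiguously in the file, so from the segment sizes the byteRange of a segment can be derived and its URL constructed. For example, if the sizes of all segments before the i-th segment sum to 20000 and the size of the i-th segment is 500, the byteRange of the i-th segment is "20000-20499" and the URL of the segment is http://example.com/example.mp4/20000-20499.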
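The byteRange derivation described above can be sketched in Python as follows, assuming contiguous storage. The function names are illustrative, and the range is appended to the URL path as in this text's examples; an actual HTTP client would usually place it in a Range header instead.

```python
def segment_byte_ranges(first_byte: int, referenced_sizes: list) -> list:
    """Derive inclusive (first, last) byte ranges for contiguous segments."""
    ranges = []
    start = first_byte
    for size in referenced_sizes:
        # The last byte is inclusive, hence the "- 1".
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def range_url(base_url: str, byte_range: tuple) -> str:
    """Append a byte range to a base URL, following the text's convention."""
    return f"{base_url}/{byte_range[0]}-{byte_range[1]}"
```

With earlier segments totalling 20000 bytes and a 500-byte i-th segment, the last range produced is (20000, 20499), matching the example in the text.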
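In an embodiment of the present invention, optionally, the client obtains the number of reference frames (library_frame_count) needed by the i-th segment: if library_frame_count is 0, the segment needs no reference frame for decoding; if it is greater than 0, its value is the number of reference frames needed to decode the segment.
The client parses the offset and size of the reference frame and computes its byteRange from them, thereby constructing the URL needed to request the reference frame. For example, if the first byte of the reference frame is at offset 100 in the storage file and the frame is 200 bytes long, the byteRange in the URL is "100-299" and the URL of the reference frame is http://example.com/example2.mp4/100-299;
The client obtains the reference frame according to its URL;
The client obtains the segment according to its URL.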
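A minimal Python sketch of the reference frame URL construction follows. The `LibraryFrameRef` type and the function name are hypothetical; the offset is treated as absolute, which is only one of the two options the text allows.

```python
from typing import NamedTuple


class LibraryFrameRef(NamedTuple):
    """One (library_frame_offset, library_frame_size) pair for a segment."""
    offset: int
    size: int


def library_frame_urls(base_url: str, frames: list) -> list:
    """Build one request URL per reference frame listed for a segment.

    An empty list corresponds to library_frame_count == 0, meaning the
    segment needs no reference frame for decoding.
    """
    urls = []
    for frame in frames:
        last_byte = frame.offset + frame.size - 1  # inclusive end of range
        urls.append(f"{base_url}/{frame.offset}-{last_byte}")
    return urls
```

For an offset of 100 and a size of 200 this yields the range 100-299 from the example above.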
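This embodiment is well suited to video-on-demand scenarios: the code stream of the reference frames can be stored in one file, and the client can request a single reference frame by byteRange. In this embodiment the reference-frame code stream can be stored in the same file as the non-reference-frame code stream or in a file of its own. If it is stored in the file of the non-reference-frame code stream, the existing MPD can be used, or an attribute for the reference frames can be added to the existing MPD describing the byteRange position of the reference-frame code stream in the storage file; this information can be described in the SegmentBase attribute at the representation level;
In an embodiment of the present invention, the reference relationship between reference frames and segments can be described independently in a box other than the sidx, with the sidx described as in the prior art; using an independent box leaves the existing sidx syntax structure intact. The syntax of the added description is as follows:
reference_count: the number of segments;
library_frame_count: the number of reference frames the segment needs;
library_frame_offset: the offset of the first byte of the reference frame in the stored stream; in the embodiments of the present invention, the byte offset may be absolute or relative to a certain segment;
library_frame_size: the number of bytes of the reference frame.
In an embodiment of the present invention, the related attribute of the reference frames is the storage information of the reference-frame code stream. For example, for a 3-minute video in which the non-reference-frame code stream occupies 10000 bytes and the 5 reference frames occupy 500 bytes in total, the reference-frame data follows the first 10000 bytes of storage, and the related attribute of the reference frames is "10000-10499";
In an embodiment of the present invention, even if the MPD is not modified at all, every reference frame can still be located directly from the information in the sidx.
In an embodiment of the present invention, if the reference-frame code stream and the non-reference-frame code stream are stored separately, the MPD can follow the existing MPD scheme, using the dependencyId attribute at the representation level to describe the reference relationship between representations.
A sample that describes, in the MPD, the byteRange storage position of the reference-frame code stream is as follows, with the other context levels of the MPD omitted;
LibarayFrame denotes the attribute element of the reference frames; range denotes the storage range attribute of the reference frames, or the range in the file of the description information (slid box) of the reference frames corresponding to the segments.
In the embodiments of the present invention, by parsing the sidx the client can obtain the relationship between the segments and reference frames involved in the on-demand program. In one embodiment, the client can maintain a storage file that records the reference frame information corresponding to each data segment. After the client has requested a reference frame from the server, if later segments still need it, the frame can stay on the client and need not be requested from the server again when used, which saves transmission bandwidth. The storage file can hold the IDs of the reference frames already received or the URLs used to request them.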
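A second embodiment of the present invention provides a method for acquiring media data in which the index segment includes the reference frame information corresponding to the data segments, expressed by means of identification information:
Flag=0x01: indicates that the sidx describes the reference frame information corresponding to the segment;
library_frame_count: the number of reference frames the segment needs;
library_frame_id: the ID of the reference frame.
In an embodiment of the present invention, the reference relationship between reference frames and segments can be described independently in a box other than the sidx, with the sidx described as in the prior art; using an independent box leaves the existing sidx syntax structure intact. The syntax of the added description is as follows:
Box describing the reference information corresponding to a segment:
library_frame_count: the number of reference frames the segment needs;
library_frame_id: the ID of the reference frame.
In the embodiments of the present invention, the client obtains the MPD file and parses out the URL construction template of the reference frames. The template describes how the URL of a reference frame is constructed and contains the reference frame ID parameter, written as $Number$ in the template. In a possible implementation, the URL template specified in the existing MPD can be used directly.
The client requests the index segment according to the index segment information in the MPD, and parses the received index segment (sidx box);
In an embodiment of the present invention, optionally, the client obtains the number of reference frames (library_frame_count) needed by the segment: if the value is 0, the segment needs no reference frame for decoding; if it is greater than 0, the value is the number of reference frames needed to decode the segment;
The client parses the ID of the reference frame and constructs the URL of the reference frame from the ID and the reference frame URL template in the MPD. For example, if the template is http://example.com/example.mp4/$Number$.ref, the URL of the reference frame with ID=4 is http://example.com/example.mp4/4.ref. The client then obtains the reference frame according to its URL.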
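The template substitution can be sketched in Python as below. The function name is illustrative, and only the plain `$Number$` placeholder used in this text is handled; width-formatted identifiers such as those allowed by DASH SegmentTemplate are deliberately out of scope.

```python
def library_frame_url(template: str, frame_id: int) -> str:
    """Fill the $Number$ placeholder of a URL template with a frame ID."""
    if "$Number$" not in template:
        raise ValueError("template has no $Number$ placeholder")
    return template.replace("$Number$", str(frame_id))
```

For example, `library_frame_url("http://example.com/example.mp4/$Number$.ref", 4)` gives the URL shown in the text for the reference frame with ID=4.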
客户端获取数据分片的方法可以参考现有的DASH标准中的规定,在此不再赘述。
本发明的实施例中,获取媒体数据的方法适用于视频直播的场景,每个参考帧编码后以单独的文件存储,每个文件的命名中含有上述sidx所对应的ID参数;在MPD中包括描述参考帧的URL的模板信息SegmentTemplate,该属性是representation的已有属性;参考帧的码流和非参考帧的码流采用DASH中的属性dependencyId描述。
在上述实施例中,判断segment中的帧解码是否需要参考帧是通过library_frame_count是否为零来进行的,在使用中也可以通过在sidx中增加一个标识来判断segment是否需要参考帧,如果标识为0,表示segment的解码不需要参考帧;如果标识不为0,则segment的解码需要参考帧。相应的客户端也对该标识的解析,如果该标识为0,表示解析segment不需要参考帧;如果标识不为0,表示需要解析参考帧,后续解析参考帧的个数和参考帧的信息,参考帧的信息和上述实施例所描述的一致。
本发明的另一个实施例是上述实施例的扩展实施例,可以和上述实施例一起使用。
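The foregoing embodiments describe the relationship between reference frames and segments, but the relationship between individual frames in a segment and the reference frames can only be obtained by parsing the frame information in the segment. On the client, a reference frame must be decoded before the video frames in the segment that depend on it, and the decoded reference frame is stored in the decoder's decoded picture management. Storage space for decoded reference frames therefore has to be allocated in advance when the decoder is initialized. This embodiment gives ways of carrying the number of reference frames required to decode the frames in a segment;
Carrying mode 1:
The number of reference frames required to decode the frames in a segment is carried in the index segment of Embodiment 1 and Embodiment 2 above, for example by adding the attribute maxLibframeNumber to the sidx;
maxLibframeNumber: the maximum number of reference frames required to decode the segment.
Carrying mode 2:
The number of reference frames required to decode the frames in a segment is carried in the MPD of Embodiment 1 and Embodiment 2 above, for example by adding the attribute maxLibframeNumber to the MPD;
maxLibframeNumber: the maximum number of reference frames required to decode the segment.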
After the client obtains maxLibframeNumber from the sidx or from the MPD, it passes the value to the decoder, which allocates and manages storage space accordingly.
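As a rough illustration of how a decoder might use maxLibframeNumber, the sketch below reserves a fixed number of slots for decoded reference frames at initialization. The class LibraryFrameStore and its methods are hypothetical and are not part of any decoder API; a real decoder would manage picture buffers rather than a dictionary.

```python
class LibraryFrameStore:
    """Fixed-capacity store for decoded reference frames.

    Hypothetical sketch: capacity is set once from maxLibframeNumber,
    mirroring the up-front allocation described in the text.
    """

    def __init__(self, max_libframe_number: int):
        if max_libframe_number < 0:
            raise ValueError("maxLibframeNumber must not be negative")
        self.capacity = max_libframe_number
        self._frames = {}

    def put(self, frame_id: int, picture: bytes) -> None:
        # Storing a new frame beyond the signalled maximum is an error.
        if frame_id not in self._frames and len(self._frames) >= self.capacity:
            raise RuntimeError("more reference frames than maxLibframeNumber")
        self._frames[frame_id] = picture

    def get(self, frame_id: int) -> bytes:
        return self._frames[frame_id]

    def release(self, frame_id: int) -> None:
        # Free a slot once no later sample references the frame.
        self._frames.pop(frame_id, None)
```

Sizing the store from the signalled maximum means the decoder never has to grow its reference storage in the middle of a segment.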
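In another embodiment of the present invention, because different segments in the non-reference frame bitstream may reference the same reference frame, the client may store a reference frame locally after obtaining it and feeding it to the decoder. If a subsequent segment also needs that reference frame, it does not have to be requested from the server again.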
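In one implementation, the client obtains the MPD file and parses it to obtain the indexRange information. The client constructs the URL of the index segment from the indexRange information and requests the index segment from the server. The client parses the index segment it receives and obtains the information of the i-th segment, where i runs from 1 to reference_count. The client obtains the size of the i-th segment, derives the byteRange of the segment, and from that constructs the segment URL. For example, if the sizes of all segments before the i-th segment sum to 20000 and the i-th segment has a size of 500, the byteRange of the i-th segment is "20000-20499" and the URL of the segment is http://example.com/example.mp4/20000-20499;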
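The byteRange arithmetic in this example can be written out as a short Python sketch. The function names are hypothetical, and the sketch assumes, as the example in the text does, that the first segment starts at byte 0 and that segments are stored back to back; a real sidx may also carry a starting offset that would have to be added.

```python
from typing import List, Tuple


def segment_byte_range(sizes: List[int], i: int) -> Tuple[int, int]:
    """Return the inclusive (first, last) byte positions of the i-th segment.

    i is 1-based, matching "i = 1 to reference_count" in the text.
    Assumes segments are contiguous and start at byte 0.
    """
    if not 1 <= i <= len(sizes):
        raise IndexError("segment index out of range")
    first = sum(sizes[: i - 1])      # total size of all earlier segments
    last = first + sizes[i - 1] - 1  # ranges are inclusive
    return first, last


def segment_url(base_url: str, sizes: List[int], i: int) -> str:
    """Append the byte range to the base URL, as in the example."""
    first, last = segment_byte_range(sizes, i)
    return f"{base_url}/{first}-{last}"


# Earlier segments total 20000 bytes; the third segment is 500 bytes.
print(segment_url("http://example.com/example.mp4", [12000, 8000, 500], 3))
```

The sizes 12000 and 8000 are made up so that the earlier segments total 20000 bytes; the call then yields the range 20000-20499 from the text.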
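In one possible implementation, optionally, the client parses the index segment to obtain the number of knowledge base frames referenced by the i-th segment (library_frame_count). If the value is 0, the segment can be decoded without reference frames; if the value is greater than 0, it is the number of reference frames required to decode the segment.
The client parses out the offset and byte count of the reference frame and uses them to determine whether it has already stored that reference frame. In one implementation, this is done by comparing them with the offsets and byte counts of the reference frames already stored.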
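If the reference frame is present, the client obtains it locally; otherwise, the client constructs the URL of the reference frame and requests the knowledge base frame data from the server. In one possible implementation, the client may instead construct the reference frame URL first and use the URL to determine whether the reference frame is already stored locally.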
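A minimal sketch of this look-up-then-request logic follows. The class ReferenceFrameCache is hypothetical, and the fetch callable stands in for whatever HTTP request the client actually issues; frames are keyed by their (offset, byte count) pair, which is the comparison the text describes.

```python
from typing import Callable, Dict, Tuple


class ReferenceFrameCache:
    """Client-side cache of reference frames keyed by (offset, size).

    Hypothetical sketch: fetch(offset, size) represents the request
    to the server and is supplied by the caller.
    """

    def __init__(self, fetch: Callable[[int, int], bytes]):
        self._fetch = fetch
        self._frames: Dict[Tuple[int, int], bytes] = {}
        self.requests = 0  # number of server requests actually made

    def get(self, offset: int, size: int) -> bytes:
        key = (offset, size)
        if key not in self._frames:
            # Not stored locally: request it once and keep it.
            self._frames[key] = self._fetch(offset, size)
            self.requests += 1
        return self._frames[key]


# Stand-in for a server: returns `size` zero bytes.
cache = ReferenceFrameCache(lambda offset, size: bytes(size))
cache.get(1000, 200)  # first use: fetched from the "server"
cache.get(1000, 200)  # reused by a later segment: served locally
print(cache.requests)
```

Keying on the URL instead, as the alternative in the text suggests, would only change the dictionary key.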
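In this embodiment, the reference relationship between reference frames and segments covers not only which knowledge base frames a segment references, but also which picture frame (sample) in the segment references a given knowledge base frame. Corresponding to the description modes in the foregoing embodiments, four description modes are given here;
Mode 1:
Mode 2:
Mode 3:
Mode 4:
In the four modes above, the sampleIndex syntax element is added. It indicates that the knowledge base frame being described is referenced by the sampleIndex-th picture frame (sample) in the segment;
For the meaning of the other syntax elements in the four modes listed above, refer to the foregoing embodiments; details are not repeated here.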
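After obtaining the segment and the knowledge base frame data, the client uses sampleIndex to determine before which sample in the segment the corresponding knowledge base frame must be fed to the decoder. For example, a sampleIndex of 50 means that the knowledge base frame must be fed to the decoder before the 50th sample of the segment;
Because a knowledge base frame may also be referenced by multiple frames in a segment, the syntax at the sampleIndex position in the four modes above may be replaced with:
referenced_Times: the number of times the corresponding knowledge base frame is referenced
sampleIndex: the index of a sample in the segment that references the corresponding knowledge base frame
After parsing the above information, the client can determine before which samples in the segment the corresponding knowledge base frame must be fed to the decoder.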
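One way a client might turn the sampleIndex information into a decoder feed order is sketched below. The function decoder_feed_order and its input format are hypothetical. The sketch assumes that sampleIndex counts from 1 (consistent with "the 50th sample") and that a knowledge base frame only needs to be fed once, before the first sample that references it.

```python
from typing import Dict, List, Tuple


def decoder_feed_order(
    sample_count: int, references: Dict[int, List[int]]
) -> List[Tuple[str, int]]:
    """Interleave knowledge base frames with the samples of one segment.

    references maps a knowledge base frame ID to the 1-based sampleIndex
    values that reference it (referenced_Times is the list length).
    """
    first_use: Dict[int, List[int]] = {}
    for frame_id, indices in references.items():
        if not indices:
            continue
        if min(indices) < 1 or max(indices) > sample_count:
            raise ValueError("sampleIndex outside the segment")
        first_use.setdefault(min(indices), []).append(frame_id)

    order: List[Tuple[str, int]] = []
    for sample in range(1, sample_count + 1):
        # Feed each frame just before the first sample that needs it.
        for frame_id in first_use.get(sample, []):
            order.append(("library", frame_id))
        order.append(("sample", sample))
    return order


# Knowledge base frame 7 is referenced by samples 2 and 3.
print(decoder_feed_order(3, {7: [2, 3]}))
```

Here frame 7 is placed between sample 1 and sample 2, which satisfies both of its references.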
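In another embodiment of the present invention, the reference relationship between referenced knowledge base frames and segments is described in the initialization segment. A uuid box (Universal Unique IDentifiers), which is defined in the standard ISO/IEC 14496-12, is added to the initialization segment, and the reference relationship between the referenced knowledge base frames and the segments is carried in that uuid box. The specific syntax is as follows:
In this embodiment, reference_count, library_frame_count and library_frame_size have the same semantics as in the foregoing embodiments.
libUUIDsize: the total number of bytes of the knowledge base frames in the current representation bitstream;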
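library_frame_offset: the offset of a single knowledge base frame within the whole knowledge base data. For a single knowledge base frame, library_frame_offset = a fixed offset + the total number of bytes of the knowledge base frames stored before it. The fixed offset may be 0 or another integer, for example 16.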
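The offset rule amounts to a running sum, as the short sketch below shows. The function name library_frame_offsets and the frame sizes are hypothetical; the fixed offset of 16 is the example value from the text.

```python
from typing import List


def library_frame_offsets(frame_sizes: List[int], fixed_offset: int = 0) -> List[int]:
    """Compute library_frame_offset for each knowledge base frame.

    Each offset is the fixed offset plus the total size in bytes of
    the knowledge base frames stored before that frame.
    """
    offsets = []
    running = fixed_offset
    for size in frame_sizes:
        offsets.append(running)
        running += size
    return offsets


# Three frames of 100, 250 and 40 bytes after a fixed offset of 16.
print(library_frame_offsets([100, 250, 40], fixed_offset=16))
```

For these sizes the offsets are 16, 116 and 366.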
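In this embodiment, the client constructs the URL of the initialization segment from the range attribute of initialization in the MPD, for example http://example/1.mp4/0-1000, and requests the initialization segment. After obtaining it, the client parses the uuid box to obtain the reference relationship between the referenced knowledge base frames and the segments, together with the position of the knowledge base frames in the representation bitstream, and obtains the knowledge base frames according to that position information. As in the foregoing embodiments of the present invention, the client may obtain the segment information by parsing the index segment, construct the request URL of the segment, and obtain the segment data; it then feeds the knowledge base frames and the frames in the segment to the decoder for decoding, and presents the result.
In this embodiment the syntax of the MPD and of the index segment is not modified, so the representation bitstream remains backward compatible with the existing technology, and in actual network transmission no compatibility changes to existing CDNs are needed.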
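In another embodiment of the present invention, the information of the referenced knowledge base frame may be described in the MPD, in the adaptation set (AdaptationSet) element or the representation element. For example, the way of constructing the URL of the referenced knowledge base frame is added to the SegmentTemplate element of the AdaptationSet or representation. A sample MPD is as follows:
In this embodiment, referenceFrame describes how the URL of the knowledge base frame is constructed. For example, the URL of the knowledge base frame for Representation id="v0" is http://example/250000/ref.mp4v. The client obtains the knowledge base frame through this URL. As long as the MPD is not updated, this knowledge base frame is the one referenced by all segments described in the current MPD. The processing after the knowledge base frame is obtained is the same as in the other embodiments of the present invention.
This embodiment is better suited to live streaming applications: by continually updating the MPD, the reference relationship between knowledge base frames and the segments described in the MPD can be maintained.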
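As shown in FIG. 12, an embodiment of the present invention discloses a media data acquisition apparatus 20. The apparatus 20 includes: an acquisition module 21, configured to obtain a media presentation description file, where the media presentation description file includes index segment information; the acquisition module 21 is further configured to obtain an index segment according to the index segment information; a parsing module 22, configured to parse the index segment to obtain reference frame information corresponding to a data segment; the parsing module 22 is further configured to parse the index segment to obtain data segment information; the acquisition module 21 is further configured to obtain the reference frame according to the reference frame information corresponding to the data segment; and the acquisition module 21 is further configured to obtain the data segment according to the data segment information.
In one implementation, the acquisition module may be a receiver.
In the embodiments of the present invention, the media data acquisition apparatus 20 may be used in a variety of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones, video teleconferencing devices, and similar devices. These devices can decompress and play video data using, for example, the techniques described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and H.265, and in extensions of those standards.
For the specific implementation of the media data acquisition apparatus 20 in the embodiments of the present invention, refer to the specific implementation of the corresponding steps in the foregoing embodiments; details are not repeated here.
In the foregoing embodiments of the present invention, data segments may be obtained in any of the ways defined in the existing DASH standard; the embodiments of the present invention impose no limitation on this, and details are not repeated here.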
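When encoding uses reference frames (knowledge base frames), a reference relationship exists between the reference frame bitstream and the non-reference frame bitstream, and different segments of the same non-reference frame bitstream reference the same reference frame data for decoding. In view of these characteristics of bitstreams encoded with the knowledge base technique, the present invention proposes a DASH-based processing method. Within the framework of the DASH standard, the method supports knowledge base coding with only minor syntax changes, so that the client can switch between and play bitstreams flexibly without wasting bandwidth.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of actions. Persons skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Persons skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
The information exchange between the modules of the foregoing apparatus and system, their execution processes, and similar content are based on the same concept as the method embodiments of the present invention. For details, refer to the description of the method embodiments; they are not repeated here.
Persons of ordinary skill in the art will understand that all or part of the processes of the foregoing method embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments; the relevant hardware includes a processor. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
Specific examples are used in this document to explain the principles and implementations of the present invention. The description of the foregoing embodiments is intended only to help in understanding the method of the present invention and its ideas. Persons of ordinary skill in the art may, based on the present invention, make changes to the specific implementations and the scope of application. In conclusion, the content of this specification should not be construed as limiting the present invention.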
Claims (17)
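- A method for acquiring media data, characterized in that the method comprises: obtaining a media presentation description file, wherein the media presentation description file includes index segment information and a uniform resource locator (URL) template; obtaining an index segment according to the index segment information; parsing the index segment to obtain data segment information and reference frame information, wherein the data segment information is used to describe a data segment, the reference frame information corresponds to the data segment, and the reference frame information includes a byte offset of a reference frame and a byte count of the reference frame; obtaining a byte range of the reference frame according to the byte offset of the reference frame and the byte count of the reference frame; obtaining a URL of the reference frame according to the byte range of the reference frame and the URL template; and obtaining the reference frame according to the URL of the reference frame.
- The method for acquiring media data according to claim 1, characterized in that the media presentation description file includes storage location information of the reference frame; correspondingly, the obtaining a URL of the reference frame according to the byte range of the reference frame and the URL template comprises: obtaining the URL of the reference frame according to the storage location information of the reference frame, the byte range of the reference frame, and the URL template.
- The method for acquiring media data according to claim 2, characterized in that the storage location information of the reference frame includes a storage range of the reference frame; or the storage location information of the reference frame includes storage file identification information of the reference frame.
- The method for acquiring media data according to claim 1, characterized in that the reference frame and the data segment are stored in the same file.
- The method for acquiring media data according to any one of claims 1 to 4, characterized in that the obtaining an index segment according to the index segment information comprises: obtaining a URL of the index segment according to the index segment information and the URL template; sending an index segment acquisition request according to the URL of the index segment; and receiving the index segment.
- A method for acquiring media data, characterized in that the method comprises: obtaining a media presentation description file, wherein the media presentation description file includes index segment information and a uniform resource locator (URL) template; obtaining an index segment according to the index segment information; parsing the index segment to obtain data segment information and reference frame information, wherein the data segment information is used to describe a data segment, the reference frame information corresponds to the data segment, and the reference frame information includes identification information of a reference frame; and obtaining the reference frame according to the identification information of the reference frame.
- The method for acquiring media data according to claim 6, characterized in that the media presentation description file includes a uniform resource locator (URL) template, and the obtaining the reference frame according to the identification information of the reference frame comprises: obtaining a URL of the reference frame according to the identification information of the reference frame and the URL template; and obtaining the reference frame according to the URL of the reference frame.
- The method for acquiring media data according to claim 7, characterized in that the media presentation description file includes storage location information of the reference frame; correspondingly, the obtaining a URL of the reference frame according to the identification information of the reference frame and the URL template comprises: obtaining the URL of the reference frame according to the storage location information of the reference frame, the identification information of the reference frame, and the URL template.
- The method for acquiring media data according to any one of claims 6 to 8, characterized in that the obtaining an index segment according to the index segment information comprises: obtaining a URL of the index segment according to the index segment information and the URL template; sending an index segment acquisition request according to the URL of the index segment; and receiving the index segment.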
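- An apparatus for acquiring media data, characterized in that the apparatus comprises: an acquisition module, configured to obtain a media presentation description file, wherein the media presentation description file includes index segment information; the acquisition module is further configured to obtain an index segment according to the index segment information; a parsing module, configured to parse the index segment to obtain data segment information and reference frame information, wherein the data segment information is used to describe a data segment, and the reference frame information corresponds to the data segment; and the acquisition module is further configured to obtain the reference frame according to the reference frame information.
- The apparatus for acquiring media data according to claim 10, characterized in that the reference frame information includes a byte offset of a reference frame and a byte count of the reference frame; and the acquisition module is configured to obtain the reference frame according to the byte offset of the reference frame and the byte count of the reference frame.
- The apparatus for acquiring media data according to claim 11, wherein the media presentation description file includes a uniform resource locator (URL) template, characterized in that the acquisition module is configured to: obtain a byte range of the reference frame according to the byte offset of the reference frame and the byte count of the reference frame; obtain a URL of the reference frame according to the byte range of the reference frame and the URL template; and obtain the reference frame according to the URL of the reference frame.
- The apparatus for acquiring media data according to claim 12, characterized in that the media presentation description file includes storage location information of the reference frame; and the acquisition module is configured to obtain the URL of the reference frame according to the storage location information of the reference frame, the byte range of the reference frame, and the URL template.
- The apparatus for acquiring media data according to claim 13, characterized in that the storage location information of the reference frame includes a storage range of the reference frame; or the storage location information of the reference frame includes storage file identification information of the reference frame.
- The apparatus for acquiring media data according to claim 10, characterized in that the reference frame information includes identification information of a reference frame; and the acquisition module is configured to obtain the reference frame according to the identification information of the reference frame.
- The apparatus for acquiring media data according to claim 15, wherein the media presentation description file includes a uniform resource locator (URL) template, characterized in that the acquisition module is configured to: obtain a URL of the reference frame according to the identification information of the reference frame and the URL template; and obtain the reference frame according to the URL of the reference frame.
- The apparatus for acquiring media data according to any one of claims 10 to 16, characterized in that the parsing module is further configured to parse the index segment to obtain the number of reference frames corresponding to the data segment.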
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610570310.6 | 2016-07-18 | ||
CN201610570310.6A CN107634930B (zh) | 2016-07-18 | 2016-07-18 | Method and apparatus for acquiring media data |
PCT/CN2017/070994 WO2018014523A1 (zh) | 2016-07-18 | 2017-01-12 | Method and apparatus for acquiring media data |
CNPCT/CN2017/070994 | 2017-01-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018014691A1 (zh) | 2018-01-25 |
Family
ID=60991705
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/070994 WO2018014523A1 (zh) | 2016-07-18 | 2017-01-12 | Method and apparatus for acquiring media data |
PCT/CN2017/089161 WO2018014691A1 (zh) | 2016-07-18 | 2017-06-20 | Method and apparatus for acquiring media data |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/070994 WO2018014523A1 (zh) | 2016-07-18 | 2017-01-12 | Method and apparatus for acquiring media data |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107634930B (zh) |
WO (2) | WO2018014523A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
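CN109905479A (zh) * | 2019-03-04 | 2019-06-18 | Tencent Technology (Shenzhen) Co., Ltd. | File transmission method and apparatus |
CN114501166A (zh) * | 2021-11-18 | 2022-05-13 | Wuhan Fonsview Technologies Co., Ltd. | DASH on-demand fast-forward and rewind method and system |
CN118250494A (zh) * | 2024-05-27 | 2024-06-25 | Hunan Happly Sunshine Interactive Entertainment Media Co., Ltd. | Video copyright protection method and apparatus |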
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
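WO2019227366A1 (zh) * | 2018-05-31 | 2019-12-05 | Hytera Communications Corp., Ltd. | Slice-based RTP streaming media storage and reading method and apparatus |
CN110858916B (zh) * | 2018-08-24 | 2020-11-24 | Shanghai Jiao Tong University | Identification method and system supporting encoding of large-span correlation information |
CN110876083B (zh) * | 2018-08-29 | 2021-09-21 | Zhejiang University | Method and apparatus for specifying a reference picture, and method and apparatus for processing a reference picture request |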
US11716505B2 (en) | 2018-08-29 | 2023-08-01 | Zhejiang University | Methods and apparatus for media data processing and transmitting and reference picture specifying |
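CN109274696A (zh) * | 2018-09-20 | 2019-01-25 | Qingdao Hisense Electric Co., Ltd. | Streaming media playback method and apparatus based on the DASH protocol |
CN111083573A (zh) * | 2018-10-22 | 2020-04-28 | Hangzhou Hikvision System Technology Co., Ltd. | Video file processing method, apparatus, and storage node |
CN111405291B (zh) * | 2019-01-02 | 2021-10-19 | Zhejiang University | Video encoding and decoding method and apparatus |
CN109960731B (zh) * | 2019-03-28 | 2022-11-18 | Tencent Music Entertainment Technology (Shenzhen) Co., Ltd. | Data processing method, device, and storage medium |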
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012162995A1 (zh) * | 2011-09-30 | 2012-12-06 | Huawei Technologies Co., Ltd. | Method and device for transmitting streaming media |
US20120317303A1 (en) * | 2011-06-08 | 2012-12-13 | Futurewei Technologies, Inc. | System and Method of Media Content Streaming with a Multiplexed Representation |
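CN103053159A (zh) * | 2010-08-05 | 2013-04-17 | Qualcomm Incorporated | Signaling attributes for network-streamed video data |
CN104768011A (zh) * | 2015-03-31 | 2015-07-08 | Zhejiang University | Image encoding and decoding method and related apparatus |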
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130170561A1 (en) * | 2011-07-05 | 2013-07-04 | Nokia Corporation | Method and apparatus for video coding and decoding |
EP2805463A1 (en) * | 2012-01-17 | 2014-11-26 | Telefonaktiebolaget L M Ericsson (publ) | Method for sending respectively receiving a media stream |
KR101741484B1 (ko) * | 2012-04-26 | 2017-05-30 | Qualcomm Incorporated | Improved block-request streaming system for handling low-latency streams |
-
2016
- 2016-07-18 CN CN201610570310.6A patent/CN107634930B/zh active Active
-
2017
- 2017-01-12 WO PCT/CN2017/070994 patent/WO2018014523A1/zh active Application Filing
- 2017-06-20 WO PCT/CN2017/089161 patent/WO2018014691A1/zh active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103053159A (zh) * | 2010-08-05 | 2013-04-17 | Qualcomm Incorporated | Signaling attributes for network-streamed video data |
US20120317303A1 (en) * | 2011-06-08 | 2012-12-13 | Futurewei Technologies, Inc. | System and Method of Media Content Streaming with a Multiplexed Representation |
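WO2012162995A1 (zh) * | 2011-09-30 | 2012-12-06 | Huawei Technologies Co., Ltd. | Method and device for transmitting streaming media |
CN104768011A (zh) * | 2015-03-31 | 2015-07-08 | Zhejiang University | Image encoding and decoding method and related apparatus |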
Also Published As
Publication number | Publication date |
---|---|
CN107634930B (zh) | 2020-04-03 |
WO2018014523A1 (zh) | 2018-01-25 |
CN107634930A (zh) | 2018-01-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018014691A1 (zh) | Method and apparatus for acquiring media data | |
US10110654B2 (en) | Client, a content creator entity and methods thereof for media streaming | |
CN114503599B (zh) | Supporting video and audio data using extensions in glTF2 scene descriptions | |
US10863211B1 (en) | Manifest data for server-side media fragment insertion | |
CA2965484C (en) | Adaptive bitrate streaming latency reduction | |
US10432690B1 (en) | Manifest partitioning | |
WO2016138844A1 (zh) | Audio/video file live streaming method, system, and server | |
US11722711B2 (en) | System and method for data stream fragmentation | |
US11665219B2 (en) | Processing media data using a generic descriptor for file format boxes | |
US10104143B1 (en) | Manifest segmentation | |
US10116719B1 (en) | Customized dash manifest | |
US11438645B2 (en) | Media information processing method, related device, and computer storage medium | |
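WO2019128668A1 (zh) | Video bitstream processing method, apparatus, network device, and readable storage medium | |
TW202236856A (zh) | Background data traffic distribution of media data | |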
US11825136B2 (en) | Video transcoding method and apparatus | |
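WO2024114519A1 (zh) | Point cloud encapsulation and decapsulation method, apparatus, medium, and electronic device | |
TWI574558B (zh) | Method and player for playing composite condensed streams | |
WO2022100742A1 (zh) | Video encoding and video playback method, apparatus, and system |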
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17830325; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 17830325; Country of ref document: EP; Kind code of ref document: A1 |