WO2018014545A1 - Method and apparatus for processing code stream data - Google Patents

Method and apparatus for processing code stream data

Info

Publication number
WO2018014545A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer segment
knowledge layer
target knowledge
segment
target
Application number
PCT/CN2017/073623
Other languages
English (en)
French (fr)
Inventor
邸佩云
范宇群
刘欣
赵寅
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2018014545A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/65 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using error resilience

Definitions

  • the present invention relates to the field of streaming media data processing, and in particular, to a method and an apparatus for processing code stream data.
  • some random access points are inserted in the encoded video.
  • the video is divided into a plurality of video segments with random access functions by a random access point, which is simply referred to as a random access segment.
  • In the conventional technology, an image in a random access segment can only serve as a reference picture/reference frame for other images in the same random access segment; that is, inter prediction (English: Inter prediction) across random access points is not allowed, which greatly limits the efficiency of video encoding/decoding.
  • An encoder or decoder may select from a database an image whose texture content is similar to that of the current encoded image (or decoded image) as a reference image. Such a reference image is referred to as a knowledge base image, and the database storing the set of such reference images is referred to as a knowledge base.
  • the method of encoding and decoding at least one image with reference to at least one knowledge base image is called a library-based video coding (LBVC).
  • Encoding a video sequence using LBVC produces a knowledge layer code stream containing the knowledge base images and a sequence layer code stream containing the frames of the video sequence, encoded with reference to the knowledge base images.
  • Multiple discontinuous segments in the sequence layer code stream may refer to the same knowledge layer segment, and the client needs to refer to the same knowledge layer segment to decode multiple discontinuous sequence layer segments.
  • In the conventional technology, the client decodes the data of different sequence layer segments independently: the knowledge layer segment referenced by the previous sequence layer segment is cleared once the next sequence layer segment starts decoding, and the client has no means of managing how long a knowledge layer segment is kept. If a subsequent sequence layer segment references the same knowledge layer segment as a previous one, the client must request that knowledge layer segment again. Repeatedly requesting the same knowledge layer segment wastes data transmission bandwidth, and repeatedly storing it wastes the client's storage space.
  • the application provides a method and a device for processing code stream data, which can enhance the controllability of the storage time of the knowledge layer segment and improve the applicability of the code stream data management.
  • the first aspect provides a method for processing code stream data, which may include:
  • the client acquires management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments included in the code stream, and the management data is used to determine a preset effective time;
  • the client parses the management data to obtain the preset effective time of the target knowledge layer segment, where the at least two discontinuous sequence layer segments are decoded within the preset effective time;
  • the client determines the deleted moment of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment;
  • the client deletes the target knowledge layer segment at the deleted moment of the target knowledge layer segment.
  • The client manages the target knowledge layer segment by means of its preset effective time and deletes it at its deleted moment. Determining the deleted moment from the preset effective time ensures that the sequence layer segments that depend on the target knowledge layer segment can be decoded within the preset effective time without reloading it, which avoids wasting data transmission bandwidth. Deleting the target knowledge layer segment at the deleted moment also reduces the client's local storage space occupied by the target knowledge layer segment, and enhances the applicability of code stream data management.
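  • The store-then-delete behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class name `KnowledgeSegmentCache`, its methods, and the use of plain seconds for every time value are hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class KnowledgeSegmentCache:
    """Hypothetical client-side store for knowledge layer segments."""

    # segment id -> (payload, deleted moment in seconds)
    _store: Dict[str, Tuple[bytes, float]] = field(default_factory=dict)

    def put(self, seg_id: str, payload: bytes, deleted_moment: float) -> None:
        """Keep a segment together with the moment at which it must be deleted."""
        self._store[seg_id] = (payload, deleted_moment)

    def purge(self, now: float) -> None:
        """Delete every segment whose deleted moment has been reached."""
        expired = [k for k, (_, t) in self._store.items() if now >= t]
        for k in expired:
            del self._store[k]

    def get(self, seg_id: str, now: float) -> Optional[bytes]:
        """Return the payload, or None if it was never stored or was deleted."""
        self.purge(now)
        entry = self._store.get(seg_id)
        return entry[0] if entry is not None else None
```

  • With this sketch, a segment stored with a deleted moment of 30 s can be reused by every sequence layer segment decoded before 30 s without a new download, and is no longer held afterwards.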
  • the management data of the target knowledge layer segment is an initialization segment of the code stream or a media presentation description (MPD) of the code stream;
  • the client parsing the management data to obtain the preset effective time of the target knowledge layer segment includes:
  • the client determining the deleted moment of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment includes:
  • the client may obtain the effective duration of the target knowledge layer segment from the initialization segment of the code stream, and manage the local storage time of the target knowledge layer segment on the client according to the effective duration of the target knowledge layer segment. It is ensured that the decoding of the sequence layer segment dependent on the target knowledge layer segment is completed within the effective duration of the target knowledge layer segment, and no repeated loading is required, thereby avoiding waste of bandwidth caused by repeated downloading of the knowledge layer segment.
  • In this application, the effective duration of the target knowledge layer segment can also be obtained from the MPD of the code stream, which simplifies data transmission, saves data transmission resources, and enhances the applicability of knowledge layer segment management.
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
  • the client parsing the management data to obtain the preset effective time of the target knowledge layer segment includes:
  • the client determining the deleted moment of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment includes:
  • The client may determine the deleted moment of the target knowledge layer segment according to the initial effective time and the effective duration, which improves the accuracy with which the client manages the local storage time of the target knowledge layer segment.
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
  • the client parsing the management data to obtain the preset effective time of the target knowledge layer segment includes:
  • the client determining a failure time T5 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, where T5 is the termination time of the preset effective time of the target knowledge layer segment;
  • the client determining the deleted moment of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment includes:
  • the client determining T5 as the deleted moment of the target knowledge layer segment.
  • In this application, the client can obtain the failure time of the target knowledge layer segment from the segment information of the target knowledge layer segment and use it directly as the deleted moment. The operation is simple and improves the accuracy with which the retention time of the knowledge layer segment is managed. Deleting the target knowledge layer segment at the deleted moment also reduces the client's memory waste in managing knowledge layer segments and enhances the applicability of knowledge layer segment management.
  • the at least two discontinuous sequence layer segments are decoded before T5.
  • This application sets the deleted moment of the target knowledge layer segment after the decoding of the sequence layer segments that depend on it, so that the target knowledge layer segment is deleted only once those sequence layer segments have been decoded. This ensures correct decoding of the sequence layer segments and enhances the applicability of code stream data processing.
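  • The two segment-information variants above (a failure time T5, or an initial effective time plus an effective duration) can be sketched as one helper. This is an illustrative sketch only: the function and parameter names are hypothetical, all times are assumed to be in seconds on a common clock, and adding the duration to the initial effective time is an assumed reading of "according to the initial effective time and the effective duration".

```python
from typing import Iterable, Optional


def deleted_moment_from_segment_info(
    expiry: Optional[float] = None,
    start: Optional[float] = None,
    duration: Optional[float] = None,
) -> float:
    """Return the deleted moment from whichever fields the segment info carries."""
    if expiry is not None:
        # Variant with a failure time T5: T5 itself is the deleted moment.
        return expiry
    if start is None or duration is None:
        raise ValueError("need either expiry, or both start and duration")
    # Variant with an initial effective time and an effective duration
    # (assumption: the deleted moment is their sum).
    return start + duration


def decoded_in_time(deleted_moment: float, decode_end_times: Iterable[float]) -> bool:
    """Check that every dependent sequence layer segment finishes decoding
    no later than the deleted moment."""
    return all(t <= deleted_moment for t in decode_end_times)
```

  • For example, `deleted_moment_from_segment_info(start=20.0, duration=100.0)` and `deleted_moment_from_segment_info(expiry=120.0)` both give 120.0, and `decoded_in_time` can be used to confirm that the dependent sequence layer segments are decoded before that moment.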
  • the second aspect provides a processing device for code stream data, which may include:
  • an acquiring unit configured to acquire management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments included in the code stream, and the management data is used to determine a preset effective time;
  • a parsing unit configured to parse the management data acquired by the acquiring unit to obtain the preset effective time of the target knowledge layer segment, where the at least two discontinuous sequence layer segments are decoded within the preset effective time;
  • a determining unit configured to determine, according to a preset effective time of the target knowledge layer segment acquired by the parsing unit, a deleted moment of the target knowledge layer segment;
  • a deleting unit configured to delete the target knowledge layer segment at a deleted moment of the target knowledge layer segment determined by the determining unit.
  • the management data of the target knowledge layer segment is an initialization segment of the code stream or a media presentation description (MPD) of the code stream;
  • the parsing unit is specifically configured to:
  • the determining unit is specifically configured to:
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment
  • the parsing unit is specifically configured to:
  • the determining unit is specifically configured to:
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment
  • the parsing unit is specifically configured to:
  • the determining unit is specifically configured to:
  • the T5 acquired by the parsing unit is determined as the deleted moment of the knowledge layer segment.
  • the at least two discontinuous sequence layer segments are decoded before the T5 acquired by the parsing unit.
  • The processing device for code stream data manages the target knowledge layer segment by means of its preset effective time and deletes it at its deleted moment. Determining the deleted moment from the preset effective time of the target knowledge layer segment ensures that the sequence layer segments that depend on the target knowledge layer segment can be decoded within the preset effective time without reloading it, which avoids wasting data transmission bandwidth. Deleting the target knowledge layer segment at the deleted moment also reduces the client's local storage space occupied by the target knowledge layer segment, and enhances the applicability of code stream data management.
  • a third aspect provides a client, which can include: a memory and a processor, the memory being coupled to the processor;
  • the memory is for storing a set of program codes
  • the processor is configured to invoke a program code stored in the memory to execute a processing method of the code stream data provided by the first aspect.
  • FIG. 1 is a schematic diagram of an example of a framework for DASH standard transmission used in system layer video streaming media transmission
  • FIG. 2 is a schematic structural diagram of an MPD transmitted by a DASH standard used for system layer video streaming media transmission
  • FIG. 3 is a schematic diagram of a plurality of mutually independent random access segments
  • FIG. 4 is a schematic diagram of a knowledge base providing an encoding reference for a random access segment
  • FIG. 5 is a schematic flowchart of a method for processing code stream data according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a device for processing code stream data according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of a client according to an embodiment of the present invention.
  • the current client-side system layer video streaming media transmission scheme may adopt a dynamic adaptive streaming over HTTP (DASH) standard framework based on a hypertext transfer protocol (HTTP).
  • FIG. 1 is a schematic diagram of an example of a framework for DASH standard transmission used in system layer video streaming media transmission.
  • The data transmission process of the system layer video streaming media transmission scheme includes two processes: the process in which the server side (such as an HTTP server, hereinafter referred to as the server) generates media data for video content, and the process in which the client (such as an HTTP streaming media client) requests and obtains the media data from the server.
  • the media expression on the server includes multiple description layers, and each description layer describes multiple segments.
  • The HTTP streaming request control module of the client obtains the media presentation description (MPD) sent by the server, analyzes the MPD to determine the segment to be requested, requests the corresponding segment from the server through the HTTP request receiving end, and passes it to the media player for decoding and playback.
  • the media data generated by the server for the video content includes different versions of the video code stream of the same video content, and the MPD of the code stream.
  • For example, for the video content of the same episode, the server generates a low-resolution, low-bit-rate, low-frame-rate code stream (such as 360p resolution, 300 kbps bit rate, 15 fps frame rate), a medium-resolution, medium-bit-rate, high-frame-rate code stream (such as 720p resolution, 1200 kbps bit rate, 25 fps frame rate), and a high-resolution, high-bit-rate, high-frame-rate code stream (such as 1080p resolution, 3000 kbps bit rate, 25 fps frame rate).
  • FIG. 2 is a schematic structural diagram of an MPD of a system transmission scheme DASH standard.
  • the MPD of the above code stream includes a plurality of description layers (English: Representation).
  • Each description layer describes one or more segments of the above code stream.
  • the description layers included in the MPD in the foregoing code stream may be independent of each other or may depend on each other.
  • The description layers being independent of each other means that the encoding/decoding of each description layer does not refer to other description layers (for example, a description layer describing a knowledge layer segment, where the encoding/decoding of the knowledge layer segment does not refer to other segments). The description layers depending on each other means that the encoding/decoding of a description layer needs to refer to other description layers (for example, a description layer describing a sequence layer segment, where the encoding/decoding of the sequence layer segment needs to refer to the knowledge layer segment).
  • Each description layer describes information of several segments (English: Segment) in time order, such as an initialization segment (English: Initialization segment), Media Segment1, Media Segment2, ..., Media Segment20, and so on; all segments are connected end to end in time.
  • Each segment contains the video code stream within a time period, and the description of the segment in the description layer includes segment information such as a playback start time, a playback duration, and a network storage address (for example, a network storage address in the form of a Uniform Resource Locator (URL)).
  • A segment may be further subdivided into a plurality of sub-segments (English: Subsegment), each of which contains a part of the segment. The information of a sub-segment includes its playback start time, its playback duration, the byte range (English: Byte Range) of the sub-segment within the code stream of the segment to which it belongs, and so on.
  • The information of the sub-segments is described by a segment index (English: Segment Index); each segment index describes the information of all sub-segments in a segment. The segment index may be merged with the segment and stored at the beginning of the segment, or may be stored separately as an index segment (English: Index Segment).
  • In the process of the client requesting and obtaining the media data from the server, when the user selects a video to play, the client obtains the corresponding MPD according to the video content requested by the user, and generates a segment list according to the segment information of the video content described in the MPD.
  • the above fragment list records the playback period of each clip and the network storage address of each clip.
  • The client obtains, from the segment list, the network storage address of the one or more segments corresponding to the playback time requested by the user, and sends the server a request to download the video segment data at that network storage address; the server sends the video segment data to the client according to the received request.
  • After the client obtains the video segment data sent by the server, it can perform operations such as decoding and playback through the media player.
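  • The lookup in the segment list by playback time can be sketched as follows. This is only an illustration: the `SegmentEntry` type, the function name, and the sample URLs are hypothetical, and the list is assumed to be sorted by start time with times in seconds.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class SegmentEntry(NamedTuple):
    """One row of a hypothetical segment list built from the MPD."""
    start: float      # playback start time, seconds
    duration: float   # playback duration, seconds
    url: str          # network storage address


def find_segment(segments: List[SegmentEntry], t: float) -> Optional[SegmentEntry]:
    """Return the entry whose playback period contains time t, if any."""
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1   # last entry starting at or before t
    if i < 0:
        return None
    seg = segments[i]
    return seg if t < seg.start + seg.duration else None


# Hypothetical segment list: three 10-second segments connected end to end.
SEGMENT_LIST = [
    SegmentEntry(0.0, 10.0, "http://example.com/seg1"),
    SegmentEntry(10.0, 10.0, "http://example.com/seg2"),
    SegmentEntry(20.0, 10.0, "http://example.com/seg3"),
]
```

  • A request for playback time 12.5 s resolves to the second entry, whose URL the client would then request from the server; a time past the end of the list resolves to no entry.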
  • the system layer video streaming media transmission scheme adopts the DASH standard, and realizes the transmission of video data by analyzing the MPD by the client, requesting the video data to the server as needed, and receiving the data sent by the server.
  • The system layer video streaming media transmission scheme adopting the DASH standard is mainly applied to video code streams generated by conventional video encoding (for example, encoding standards such as H.264 and HEVC (High Efficiency Video Coding)).
  • FIG. 3 is a schematic diagram of a plurality of mutually independent random access segments.
  • In FIG. 3, the dot represents a random access point, the square represents the random access segment following the random access point, and the dotted arrow marked with an x symbol indicates that the random access segment the arrow points to cannot, when being encoded, refer to information of the random access segment from which the dotted line starts. That is, in conventional video codec technology, an image in one random access segment can only serve as the reference image/reference frame of other images in the same random access segment; inter prediction across random access points is not allowed, which limits the efficiency of video encoding/decoding.
  • FIG. 4 is a schematic diagram of a knowledge base providing an encoding reference for random access segments in knowledge base-based video coding.
  • In FIG. 4, the dot represents a random access point, the square represents the random access segment following the random access point, and the arrow indicates that a plurality of random access segments refer, when being encoded, to the information provided by the knowledge base (English: Library).
  • This knowledge base-based coding method extracts similar content that appears multiple times in the video into the knowledge base, and improves the coding efficiency of the video by referring to the image in the knowledge base.
  • the random access point image can be encoded/decoded with reference to the image in the knowledge base, or the conventional intra coding method can be directly used.
  • the random access point image is not encoded/decoded depending on other images in the video sequence, and the random access segments are still independent of each other.
  • each description layer has a separate ID.
  • In the syntax of the description layer level of the MPD (that is, the representation level), the syntax used to describe the attribute information of the description layer includes an attribute dependencyId, which indicates the ID of another description layer that this description layer relies on for decoding or presentation.
  • When the client requests the data of a segment (assumed to be segment1) of a description layer carrying the dependencyId attribute, it also needs to obtain the segment (assumed to be segment2) on which segment1 depends in order to correctly decode or present segment1.
  • the following describes the dependency of the segments in each representation in combination with the description of the partial representation in the MPD. The information at the upper level of the representation is not described here:
  • The above description indicates that the representation whose id is "tag6" depends on the representation whose id is "tag5"; that is, decoding the segments described in the representation "tag6" depends on the segments described in the representation "tag5".
  • The above description gives the URL of a segment by means of an index segment.
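  • How a client might read such a dependency out of an MPD can be sketched as follows. The MPD fragment in `SAMPLE_MPD` is a hypothetical stand-in written for this example, not the listing from the original description; only the ids "tag5" and "tag6" and the `dependencyId` attribute are taken from the text. The function name is likewise an assumption.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace of the DASH MPD schema.
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical MPD fragment: representation "tag6" depends on "tag5".
SAMPLE_MPD = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet>
      <Representation id="tag5" bandwidth="512000"/>
      <Representation id="tag6" dependencyId="tag5" bandwidth="768000"/>
    </AdaptationSet>
  </Period>
</MPD>
"""


def dependency_map(mpd_xml: str) -> Dict[str, List[str]]:
    """Map each Representation id to the ids listed in its dependencyId."""
    root = ET.fromstring(mpd_xml.strip())
    deps: Dict[str, List[str]] = {}
    for rep in root.iterfind(".//mpd:Representation", NS):
        # dependencyId is a whitespace-separated list of Representation ids.
        deps[rep.get("id")] = rep.get("dependencyId", "").split()
    return deps
```

  • For the sample fragment, `dependency_map(SAMPLE_MPD)` reports that "tag6" depends on "tag5" and that "tag5" depends on nothing, so a client requesting a segment of "tag6" would know it must also hold the corresponding segment of "tag5".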
  • reference_ID: the ID of the code stream;
  • timescale: the time unit;
  • earliest_presentation_time: the earliest rendering time of the code stream described in the index segment, in units of timescale;
  • first_offset: the starting offset of the first segment after the index segment;
  • reference_count: the number of segments described in the index segment;
  • reference_type: 1 indicates that the segment is an index segment, and 0 indicates that the segment is media content;
  • referenced_size: the size of the segment;
  • subsegment_duration: the duration of the segment, in units of timescale;
  • starts_with_SAP: the stream access type of the segment;
  • SAP_delta_time: the earliest rendering time of the first stream access point.
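  • The fields listed above can be read from the binary index segment roughly as follows. This is a sketch under stated assumptions: it handles only the body of a version-0 `sidx` box as laid out in the ISO base media file format (32-bit `earliest_presentation_time` and `first_offset`), it expects the 8-byte size/type box header to have been stripped already, and the type and function names are hypothetical.

```python
import struct
from typing import List, NamedTuple


class SidxRef(NamedTuple):
    reference_type: int       # 1 = index segment, 0 = media content
    referenced_size: int      # size in bytes
    subsegment_duration: int  # in units of timescale
    starts_with_sap: int
    sap_type: int
    sap_delta_time: int


class Sidx(NamedTuple):
    reference_id: int
    timescale: int
    earliest_presentation_time: int
    first_offset: int
    references: List[SidxRef]


def parse_sidx_body(buf: bytes) -> Sidx:
    """Parse a version-0 sidx box body (box header already removed)."""
    if buf[0] != 0:
        raise ValueError("only version 0 is handled in this sketch")
    # Skip version (1 byte) + flags (3 bytes), then read the fixed fields:
    # four 32-bit values, 16 reserved bits, 16-bit reference_count.
    ref_id, timescale, ept, first_offset, _reserved, count = struct.unpack_from(
        ">IIIIHH", buf, 4
    )
    refs = []
    pos = 4 + 20
    for _ in range(count):
        a, duration, b = struct.unpack_from(">III", buf, pos)
        refs.append(
            SidxRef(
                reference_type=a >> 31,        # top bit
                referenced_size=a & 0x7FFFFFFF,  # low 31 bits
                subsegment_duration=duration,
                starts_with_sap=b >> 31,       # top bit
                sap_type=(b >> 28) & 0x7,      # next 3 bits
                sap_delta_time=b & 0x0FFFFFFF,   # low 28 bits
            )
        )
        pos += 12
    return Sidx(ref_id, timescale, ept, first_offset, refs)
```

  • Each parsed reference gives the size and duration of one described segment, which is the information the client needs to locate the segment in the stream.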
  • the process of the client acquiring the code stream data is as follows:
  • The client receives the MPD, sent by the server, that contains the above information, and parses it to obtain the dependency relationships between the representations and the information of the index segment.
  • The client constructs the URL for requesting the index segment according to the indexRange information in the MPD, such as http://example.com/video-512k.mp4/0-4332, and then requests the index segment according to that URL.
  • the client obtains the index segment, parses the sidx box information of the index segment, obtains the segment information, and further constructs the segment URL according to the segment information, and requests the segment according to the segment URL.
  • The sidx box is a specific syntax box contained in the index segment.
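  • Constructing segment URLs from the index information can be sketched as follows. The sketch copies the `<base>/<first byte>-<last byte>` URL shape of the example URLs in this description; treating the last byte as inclusive, and measuring `first_offset` from the byte immediately after the index segment, are assumptions, as are the function and parameter names.

```python
from typing import List


def segment_urls(
    base_url: str, index_end: int, first_offset: int, sizes: List[int]
) -> List[str]:
    """Build one byte-range URL per segment described in the index segment.

    index_end    -- last byte of the index segment (e.g. 4332 for range 0-4332)
    first_offset -- gap between the end of the index segment and the first segment
    sizes        -- referenced_size of each segment, in bytes
    """
    start = index_end + 1 + first_offset
    urls = []
    for size in sizes:
        end = start + size - 1          # inclusive last byte (assumption)
        urls.append(f"{base_url}/{start}-{end}")
        start = end + 1                 # segments are stored back to back
    return urls
```

  • For instance, with an index segment occupying bytes 0-4332, a `first_offset` of 0, and two segments of 501 and 400 bytes, the sketch yields ranges 4333-4833 and 4834-5233 under the base URL.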
  • For example, suppose the code stream time point at which the client switches is the 1-minute point on the progress timeline of the client player:
  • the URL of one segment is http://example.com/video-512k.mp4/10000-10500;
  • the URL of the other segment is http://example.com/video-768k.mp4/9000-9400.
  • When the client decodes, the segment of tag6 depends on the data of the segment of tag5.
  • After the client determines the URLs of the two segments, it can send the segment requests to the server.
  • The URLs of the two segments are http://example.com/video-512k.mp4/10000-10500 and http://example.com/video-768k.mp4/9000-9400, respectively.
  • After the server receives the requests from the client, it can send the data of the two segments to the client.
  • The client sends the received segment data to the decoder for decoding.
  • The client decodes the data of different segments independently. After the client decodes segment1, the segment2 on which segment1 depends is cleared before the client decodes the next segment (assumed to be segment3). If a subsequent segment (assumed to be segment4) also depends on segment2, segment2 must be requested again. The client cannot determine how long to keep segment2 from its dependency state (that is, the fact that segment2 is depended on by multiple other segments), which results in repeated requests for and downloads of segment2 and in turn wastes bandwidth on the client side.
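  • The cost of the behaviour just described can be illustrated with a small count of downloads. This is a toy model, not a measurement: the function name and the `retain` switch are hypothetical, and clearing everything held at the start of each segment is a simplification of the conventional behaviour.

```python
from typing import Dict, List, Optional


def count_downloads(
    decode_order: List[str],
    depends_on: Dict[str, Optional[str]],
    retain: bool,
) -> int:
    """Count how many times depended-on segments must be downloaded.

    retain=False models the conventional client, which drops the depended-on
    segment once the next segment starts decoding; retain=True models a
    client that keeps it until all dependents are decoded.
    """
    downloads = 0
    held = set()
    for seg in decode_order:
        if not retain:
            held.clear()
        dep = depends_on.get(seg)
        if dep is not None and dep not in held:
            downloads += 1
            held.add(dep)
    return downloads


# segment1 and segment4 both depend on segment2; segment3 depends on nothing.
ORDER = ["segment1", "segment3", "segment4"]
DEPS = {"segment1": "segment2", "segment3": None, "segment4": "segment2"}
```

  • In this model the conventional client downloads segment2 twice, while a client that retains segment2 downloads it once.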
  • The embodiment of the present invention provides a method for processing code stream data, which can store and manage a knowledge layer segment according to the save time information of the knowledge layer segment in the knowledge layer code stream, reducing the number of repeated requests for the knowledge layer segment and saving the client's data transmission bandwidth.
  • FIG. 5 is a schematic flowchart diagram of a method for processing code stream data according to an embodiment of the present invention.
  • the method provided by the embodiment of the present invention includes the following steps:
  • S101. The client acquires management data of the target knowledge layer segment.
  • the preset valid time of the target knowledge layer segment may be set in advance according to the coded period of the sequence layer segment that depends on the target knowledge layer segment.
  • the sequence layer segment that depends on the target knowledge layer segment is encoded within a preset effective time of the target knowledge layer segment.
  • When the client decodes the video data, the sequence layer segment that depends on the target knowledge layer segment is decoded within the preset effective time of the target knowledge layer segment.
  • The target knowledge layer segment is one of multiple knowledge layer segments obtained by segmenting the knowledge layer code stream, and the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments.
  • the plurality of segments obtained by segmentation of the sequence layer code stream are referred to as sequence layer segments.
  • The sequence layer segments include consecutive sequence layer segments and discontinuous sequence layer segments, that is, temporally consecutive sequence layer segments and temporally discontinuous sequence layer segments, and at least one sequence layer segment is encoded using one or more knowledge layer segments as reference segments.
  • The server may determine a knowledge layer segment that is depended on by at least two discontinuous sequence layer segments as the target knowledge layer segment.
  • For example, the target knowledge layer segment is depended on by sequence layer segment 1, sequence layer segment 2, sequence layer segment 4, and sequence layer segment 5. Sequence layer segments 1 and 2 are temporally consecutive, as are sequence layer segments 4 and 5. Sequence layer segments 1 and 4, and sequence layer segments 2 and 4, are temporally discontinuous, as are sequence layer segments 1 and 5, and sequence layer segments 2 and 5.
  • The target knowledge layer segment may also be a segment that is depended on by sequence layer segment 2 and sequence layer segment 4; that is, the sequence layer segments that depend on the target knowledge layer segment include at least two discontinuous segments.
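  • The selection criterion for a target knowledge layer segment can be sketched as a simple check. The sketch assumes, for illustration only, that sequence layer segments are numbered in time order so that adjacent numbers are temporally consecutive; the function name is hypothetical.

```python
from typing import Iterable


def has_discontinuous_dependents(dependent_indices: Iterable[int]) -> bool:
    """Return True if at least two of the dependent sequence layer segments
    are temporally discontinuous, i.e. their numbers leave a gap."""
    ordered = sorted(set(dependent_indices))
    return any(b - a > 1 for a, b in zip(ordered, ordered[1:]))
```

  • Under this numbering assumption, a knowledge layer segment depended on by sequence layer segments 1, 2, 4 and 5, or by segments 2 and 4, qualifies as a target knowledge layer segment, whereas one depended on only by segments 1 and 2 does not.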
  • There may be multiple knowledge layer segments that are depended on by at least two discontinuous sequence layer segments.
  • the embodiment of the present invention will be described by taking one of the knowledge layer segments as the target knowledge layer segment as an example.
  • the server may encapsulate information such as the preset effective time of the target knowledge layer segment in the media data of the code stream, and may feed back the foregoing code stream and its media data to the client when the client sends the request for obtaining the media data.
  • the client may obtain the code stream sent by the server and the media data thereof, and obtain the management data of the target knowledge layer segment by parsing the code stream and the media data thereof.
  • the management data of the target knowledge layer segment is used to determine a preset effective time of the knowledge layer segment.
  • the management data of the target knowledge layer segment may include an initialization segment of the code stream, an MPD of the code stream, or a knowledge layer segment of the code stream, and may be determined according to an actual application scenario, and is not limited herein.
  • S102. The client parses the management data to obtain the preset effective time of the target knowledge layer segment.
  • the preset effective time of the target knowledge layer segment may be an effective duration of the target knowledge layer segment carried in the initialization segment of the code stream, or an effective duration of the target knowledge layer segment described in the MPD of the code stream. Further, the preset effective time of the target knowledge layer segment may also be information such as an initial effective time and an effective duration carried in the segment information of the target knowledge layer segment, or an invalidation time. It can be determined according to the actual application scenario, and no limitation is imposed here. The preset effective time determination of the target knowledge layer segment will be described below in conjunction with step S103.
  • S103. The client determines the deleted moment of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment.
  • the management data of the target knowledge layer segment may be an initialization segment of the code stream, where the initialization segment carries an effective duration of the target knowledge layer segment.
  • the knowledge layer segment obtained by the above-mentioned knowledge layer code stream segmentation may be one frame of data in the code stream, and one knowledge layer segment is a video frame.
  • the client may parse the foregoing management data to obtain an effective duration included in the initialization segment.
  • the effective duration of the target knowledge layer segment may be added in the initialization segment of the code stream in the following syntax format, and the client may parse the initialization segment described by the syntax format to obtain the effective duration.
  • timescale: the time unit or time scale
  • duration: the effective duration, in units of timescale.
  • the client can request the server to obtain an initialization fragment of the code stream.
  • the client can parse it, and obtain the effective duration of the target knowledge layer segment from the initialization segment (set to L). Further, the client may request the server to obtain the target knowledge layer segment according to the actual application scenario requirements such as the user's on-demand or the decoding requirements of the sequence layer segment.
  • the server may return the target knowledge layer segment to the client in response to the client's request. After the client obtains the target knowledge layer segment sent by the server, it may record the time at which the target knowledge layer segment is first referenced.
  • the time at which the target knowledge layer segment is first referenced may specifically be the dependent time (denoted T1) at which the target knowledge layer segment is depended on by the target sequence layer segment, where the target sequence layer segment is the first to be decoded of the at least two sequence layer segments that depend on the target knowledge layer segment.
  • the dependent time of the target knowledge layer segment may specifically be the time at which the target knowledge layer segment is sent to the decoder.
  • the client may start a timer when the knowledge layer segment is sent to the decoder, stop the timer when the elapsed time equals the effective duration L of the knowledge layer segment recorded in the syntax element, and determine the time at which the timer stops as the deletion time of the target knowledge layer segment.
  • after the client obtains the initialization segment of the code stream, the effective duration of the target knowledge layer segment may be determined.
  • after the target knowledge layer segment is downloaded to the client's local storage, the client can manage it there by the maximum effective time (i.e., the effective duration of the target knowledge layer segment).
  • managing the target knowledge layer segment by the maximum effective time ensures that, within the effective duration of the target knowledge layer segment, if decoding a non-knowledge layer segment (i.e., a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage without requesting it again.
  • the locally managed knowledge layer segment can be reused, which avoids the bandwidth waste caused by repeatedly downloading the knowledge layer segment.
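The reuse behaviour described above can be sketched as a small local store. This is only an illustration under stated assumptions: the class names, the `fetch` callback standing in for the HTTP download, and the use of seconds on a single client clock are all hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class CachedSegment:
    data: bytes
    delete_at: Optional[float] = None  # T2; unknown until the first reference


@dataclass
class KnowledgeSegmentCache:
    """Hypothetical client-side store for knowledge layer segments."""
    effective_duration: float          # L, in seconds
    fetch: Callable[[str], bytes]      # stand-in for the HTTP download
    store: Dict[str, CachedSegment] = field(default_factory=dict)

    def get(self, url: str, now: float) -> bytes:
        """Return the segment, downloading it only if it is not stored locally."""
        self.evict(now)
        entry = self.store.get(url)
        if entry is None:              # not stored (or already deleted): download
            entry = CachedSegment(self.fetch(url))
            self.store[url] = entry
        if entry.delete_at is None:    # first reference: T1 = now, T2 = T1 + L
            entry.delete_at = now + self.effective_duration
        return entry.data

    def evict(self, now: float) -> None:
        """Delete every segment whose deletion time has been reached."""
        expired = [u for u, e in self.store.items()
                   if e.delete_at is not None and now >= e.delete_at]
        for url in expired:
            del self.store[url]
```

Calling `get` once per dependent sequence layer segment triggers a single download as long as all calls fall before T2; a call at or after T2 downloads the segment again.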
  • the effective duration of the target knowledge layer segment may also be carried in the MPD of the code stream.
  • the server When the server generates the MPD of the code stream, the effective duration of the target knowledge layer segment can be added to the MPD of the code stream.
  • the server may add a new syntax element in the description layer in the MPD, such as @EffectiveDuration.
  • the above syntax element @EffectiveDuration indicates that the effective duration of the knowledge layer segment described by the description layer in which it is located is the value of EffectiveDuration, wherein the value of the above EffectiveDuration is in units of the timescale attribute in the MPD. For example, assuming that the value of the above EffectiveDuration is 100000 and the value of the timescale in the MPD is 1000, the effective duration of the knowledge layer segment described by the description layer of the EffectiveDuration is 100 seconds.
  • the client may request the MPD of the code stream from the server, and then parse the acquired MPD to obtain the effective duration of the target knowledge layer segment carried in the MPD (for example, the value of EffectiveDuration, in units of the timescale attribute).
  • the time at which the target knowledge layer segment is first referenced may be recorded.
  • the time at which the target knowledge layer segment is first referenced may specifically be the dependent time (denoted T1) at which the target knowledge layer segment is depended on by the target sequence layer segment, where the target sequence layer segment is the first to be decoded of the at least two sequence layer segments that depend on the target knowledge layer segment.
  • by describing the effective duration of the target knowledge layer segment in the MPD, the above implementation simplifies data transmission, saves data transmission resources, and enhances the applicability of knowledge layer segment management.
  • the effective duration of each knowledge layer segment may be set to the same value.
  • the knowledge layer segments may, however, be in use for different lengths of time. For example, suppose the effective use time of knowledge layer segment 2 (the period during which it is decoded or depended on) is 5 seconds.
  • if the effective duration that the server adds in the initialization segment for every knowledge layer segment is 50 seconds, knowledge layer segment 2 is no longer used 5 seconds after it is sent to the decoder, yet it remains stored on the client for a further 45 seconds, which is likely to waste the client's local storage space.
  • the server in the encapsulation of the knowledge layer segment, may separately encapsulate each knowledge layer segment, and carry the preset effective time of the knowledge layer segment in the segment information of each knowledge layer segment. .
  • the preset effective time of the target knowledge layer segment may be added to the segment information of the target knowledge layer segment.
  • the preset effective time of the target knowledge layer segment may include an initial effective time of the target knowledge layer segment (set to T3) and an effective duration of the target knowledge layer segment (set to L1).
  • the server may encapsulate the target knowledge layer segment by using the following encapsulation syntax format, and carry the initial effective time and the effective duration in the target knowledge layer segment.
  • timescale: the time unit or time scale
  • start_time: the initial effective time, in units of timescale
  • duration: the effective duration, in units of timescale.
  • in a specific implementation, the timescale syntax element may be absent.
  • the client may request the server to acquire the target knowledge layer segment according to the actual application scenario requirements such as the user's on-demand or the decoding requirements of the sequence layer segment.
  • the client may parse the efdu box information in the target knowledge layer segment, and obtain information such as start_time and duration included in the segment information of the target knowledge layer segment. Further, the client may determine the preset effective time of the knowledge layer segment according to the foregoing information such as start_time and duration.
  • the start_time may be the time at which the target knowledge layer segment is first referenced, for example, the time at which the target knowledge layer segment is sent to the decoder; this may be determined according to the actual application scenario and is not limited herein.
  • the value of the duration data may be determined as the effective duration of the knowledge layer segment, and the current media data processing time may be determined as the start_time of the knowledge layer segment. This can be determined according to the actual application scenario, and no limitation is imposed here.
  • the client may locally manage the target knowledge layer segment according to the initial effective time (or the current media data processing time) and the effective duration, which improves the accuracy of the management of the target knowledge layer segment, further reduces the waste of client bandwidth, reduces the client memory wasted on managing the knowledge layer segment, and enhances the applicability of knowledge layer segment management.
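As a rough sketch of this per-segment mode, the deletion time can be computed as T4 = T3 + L1, falling back to the current media data processing time when no start_time is available. The function name and the assumption that start_time and the processing time share one timeline (in seconds after division by timescale) are illustrative only.

```python
from typing import Optional


def deletion_time_from_segment_info(duration: int, timescale: int,
                                    start_time: Optional[int] = None,
                                    now_seconds: float = 0.0) -> float:
    """Return T4 = T3 + L1 in seconds.

    duration and start_time are in units of timescale, as in the efdu box
    fields described above. If start_time is missing, the current media data
    processing time (now_seconds) is used as T3.
    """
    if timescale <= 0:
        raise ValueError("timescale must be positive")
    l1 = duration / timescale
    t3 = start_time / timescale if start_time is not None else now_seconds
    return t3 + l1
```

With this per-segment value, the knowledge layer segment 2 of the example above could carry a 5-second duration of its own instead of inheriting a shared 50-second one.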
  • the server may separately encapsulate each knowledge layer segment and carry the failure time (also called the timeout time) of the knowledge layer segment in the segment information of each knowledge layer segment, instructing the client to manage the knowledge layer segment by its failure time.
  • the failure time of the target knowledge layer segment may be carried in the segment information of the target knowledge layer segment.
  • the server may encapsulate the target knowledge layer segment by using the following encapsulation syntax format, and carry the failure time of the target knowledge layer segment in the target knowledge layer segment.
  • timescale: the time unit or time scale
  • in a specific implementation, the timescale syntax element may be absent.
  • the client may parse the expd box information in the knowledge layer segment and obtain information such as the Expiredate of the knowledge layer segment. Further, the client may determine the time indicated by Expiredate (i.e., the failure time T5) as the deletion time of the knowledge layer segment. In a specific implementation, the at least two sequence layer segments that depend on the target knowledge layer segment are decoded before the failure time of the target knowledge layer segment.
  • the client obtains the failure time of each knowledge layer segment and determines it as the deletion time. This is simple to operate, improves the accuracy of knowledge layer segment management, further reduces the waste of client bandwidth, reduces the client memory wasted on managing knowledge layer segments, and enhances the applicability of knowledge layer segment management.
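The failure-time mode needs no addition at all: T5 is used directly as the deletion time. The sketch below assumes, without confirmation from the text, that Expiredate is expressed in units of timescale; the helper that checks the "decoded before T5" condition is likewise hypothetical.

```python
from typing import Iterable


def deletion_time_from_expiredate(expiredate: int, timescale: int) -> float:
    """Return T5 in seconds, assuming Expiredate is in units of timescale."""
    if timescale <= 0:
        raise ValueError("timescale must be positive")
    return expiredate / timescale


def dependents_decoded_before(decode_times: Iterable[float], t5: float) -> bool:
    """True if every dependent sequence layer segment is decoded before T5."""
    return all(t < t5 for t in decode_times)
```

A server could run a check like `dependents_decoded_before` when choosing T5, so that no dependent sequence layer segment is scheduled for decoding after the knowledge layer segment has been deleted.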
  • the client may select one or more of the foregoing implementations according to actual application requirements; no limitation is imposed here.
  • the client deletes the target knowledge layer segment at the deletion time of the target knowledge layer segment.
  • the client may delete the knowledge layer segment at the deletion time, thereby reducing waste of the client's storage space.
  • the client may determine the preset effective time of the target knowledge layer segment in the knowledge layer code stream according to the initialization segment or the MPD of the code stream, or according to the information carried in the target knowledge layer segment.
  • further, the last effective time of the target knowledge layer segment, that is, its deletion time, may be determined according to the preset effective time of the target knowledge layer segment, and the target knowledge layer segment may be deleted at the deletion time; otherwise, the target knowledge layer segment is kept in the client's local storage.
  • managing the target knowledge layer segment by its preset effective time ensures that, within the preset effective time, if decoding a non-knowledge layer segment (i.e., a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage and reuse the locally managed knowledge layer segment without requesting it again, which avoids repeated downloads of the knowledge layer segment and the resulting bandwidth waste.
  • deleting the knowledge layer segment at its deletion time also reduces the client local storage space that the knowledge layer segment occupies.
  • FIG. 6 is a schematic structural diagram of a device for processing code stream data according to an embodiment of the present invention.
  • the processing device provided by the embodiment of the present invention includes:
  • the acquiring unit 61 is configured to acquire management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments included in the code stream, and the management data is used to determine a preset effective time.
  • the parsing unit 62 is configured to parse the management data acquired by the acquiring unit 61 and obtain the preset effective time of the target knowledge layer segment, where the at least two discontinuous sequence layer segments are decoded within the preset effective time.
  • the determining unit 63 is configured to determine the deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment obtained by the parsing unit 62.
  • the deleting unit 64 is configured to delete the target knowledge layer segment at the deletion time of the target knowledge layer segment determined by the determining unit 63.
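One way to picture how the determining unit and the deleting unit fit together is the sketch below. It assumes the parsing unit has already reduced the management data to a small record with all times in seconds; the record type, field names, and function names are hypothetical and only mirror the three modes described in this document.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ManagementData:
    """Hypothetical output of the parsing unit; all times in seconds."""
    effective_duration: Optional[float] = None  # L or L1
    start_time: Optional[float] = None          # T3
    failure_time: Optional[float] = None        # T5


def determine_deletion_time(md: ManagementData, first_reference_time: float) -> float:
    """Role of the determining unit: map the preset effective time to a deletion time."""
    if md.failure_time is not None:             # T5 is used directly
        return md.failure_time
    if md.effective_duration is None:
        raise ValueError("management data carries no preset effective time")
    if md.start_time is not None:               # T4 = T3 + L1
        return md.start_time + md.effective_duration
    return first_reference_time + md.effective_duration  # T2 = T1 + L


def delete_expired(storage: Dict[str, bytes], delete_at: Dict[str, float],
                   now: float) -> None:
    """Role of the deleting unit: drop segments whose deletion time has arrived."""
    for seg_id in [s for s, t in delete_at.items() if now >= t]:
        storage.pop(seg_id, None)
        del delete_at[seg_id]
```

Keeping the three modes behind one function means the deleting step does not need to know which kind of management data produced the deletion time.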
  • the management data of the target knowledge layer segment is an initialization segment of the code stream or a media presentation description (MPD) of the code stream;
  • the parsing unit 62 is specifically configured to:
  • the determining unit 63 is specifically configured to:
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment
  • the parsing unit 62 is specifically configured to:
  • the determining unit 63 is specifically configured to:
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment
  • the parsing unit 62 is specifically configured to:
  • the management data of the target knowledge layer segment acquired by the acquiring unit 61 is parsed, and the failure time T5 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment is acquired, where T5 serves as the end time of the preset effective time of the target knowledge layer segment.
  • the determining unit 63 is specifically configured to:
  • the T5 acquired by the parsing unit 62 is determined as the deletion time of the knowledge layer segment.
  • the at least two discontinuous sequence layer fragments are decoded before the T5 acquired by the parsing unit 62.
  • the processing device of the code stream data provided by the embodiment of the present invention may be specifically the client provided by the foregoing embodiment.
  • the acquiring unit 61, the parsing unit 62, the determining unit 63, and the deleting unit 64 included in the processing device may be function modules of the client, for example, an HTTP streaming request control module in an HTTP streaming client; this is determined according to the requirements of the actual application scenario and is not described here.
  • the foregoing processing device can perform the implementation manner of the client in the processing method of the foregoing code stream data by using the built-in units, and details are not described herein again.
  • the client may determine the preset effective time of the target knowledge layer segment in the knowledge layer code stream according to the initialization segment or the MPD of the code stream, or according to the information carried in the target knowledge layer segment.
  • further, the last effective time of the target knowledge layer segment, that is, its deletion time, may be determined according to the preset effective time of the target knowledge layer segment, and the target knowledge layer segment may be deleted at the deletion time; otherwise, the target knowledge layer segment is kept in the client's local storage.
  • managing the target knowledge layer segment by its preset effective time ensures that, within the preset effective time, if decoding a non-knowledge layer segment (i.e., a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage and reuse the locally managed knowledge layer segment without requesting it again, which avoids repeated downloads of the knowledge layer segment and the resulting bandwidth waste.
  • deleting the knowledge layer segment at its deletion time also reduces the client local storage space that the knowledge layer segment occupies.
  • FIG. 7 is a schematic structural diagram of a client provided by an embodiment of the present invention.
  • the client provided by the embodiment of the present invention may include a memory 71 and a processor 72, and the memory 71 is connected to the processor 72.
  • the above memory 71 is used to store a set of program codes.
  • the processor 72 is configured to invoke the program code stored in the memory 71 to perform the following operations:
  • acquiring management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments included in the code stream, and the management data is used to determine a preset effective time;
  • deleting the target knowledge layer segment at the deletion time of the target knowledge layer segment.
  • the management data of the target knowledge layer segment is an initialization segment of the code stream or a media presentation description (MPD) of the code stream;
  • the processor 72 is specifically configured to:
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment
  • the processor 72 is specifically configured to:
  • the management data of the target knowledge layer segment is segment information of the target knowledge layer segment
  • the processor 72 is specifically configured to:
  • acquiring a failure time T5 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, where T5 is the end time of the preset effective time of the target knowledge layer segment;
  • determining T5 as the deletion time of the knowledge layer segment.
  • the at least two discontinuous sequence layer segments are decoded prior to the T5.
  • the client may perform the implementation manner of the client in the processing method of the code stream data provided in the foregoing embodiment by using the processor 72, and details are not described herein again.
  • the client may determine the preset effective time of the target knowledge layer segment in the knowledge layer code stream according to the initialization segment or the MPD of the code stream, or according to the information carried in the target knowledge layer segment.
  • further, the last effective time of the target knowledge layer segment, that is, its deletion time, may be determined according to the preset effective time of the target knowledge layer segment, and the target knowledge layer segment may be deleted at the deletion time; otherwise, the target knowledge layer segment is kept in the client's local storage.
  • managing the target knowledge layer segment by its preset effective time ensures that, within the preset effective time, if decoding a non-knowledge layer segment (i.e., a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage and reuse the locally managed knowledge layer segment without requesting it again, which avoids repeated downloads of the knowledge layer segment and the resulting bandwidth waste.
  • deleting the knowledge layer segment at its deletion time also reduces the client local storage space that the knowledge layer segment occupies.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

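Embodiments of the present invention disclose a method and an apparatus for processing code stream data. The method includes: a client acquires management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment included in a code stream, the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments included in the code stream, and the management data is used to determine a preset effective time; the client parses the management data to obtain the preset effective time of the target knowledge layer segment, where the at least two discontinuous sequence layer segments are decoded within the preset effective time; the client determines a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment; and the client deletes the target knowledge layer segment at the deletion time of the target knowledge layer segment. The embodiments of the present invention make the storage time of knowledge layer segments more controllable and improve the applicability of code stream data management.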

Description

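Method and apparatus for processing code stream data
Technical Field
The present invention relates to the field of streaming media data processing, and in particular, to a method and an apparatus for processing code stream data.
Background
In conventional video coding, to enable the encoded video to support random access, random access points are inserted into the encoded video. The random access points divide the video into multiple video segments that support random access, referred to as random access segments. In the conventional technology, an image in a random access segment can serve as a reference picture (reference frame) only for other images in the same random access segment; that is, inter prediction across random access points is not allowed, which considerably limits the efficiency of video encoding and decoding.
To exploit the information that images in different random access segments can reference from one another during encoding (mutual information for short), when encoding (or decoding) an image, an encoder (or decoder) may select from a database an image whose texture content is similar to that of the image currently being encoded (or decoded) as a reference image. Such a reference image is called a knowledge base image, the database that stores the set of such reference images is called a knowledge base, and a method in which at least one image in a video is encoded and decoded with reference to at least one knowledge base image is called library-based video coding (LBVC). Encoding a video sequence with LBVC produces a knowledge layer code stream that contains the knowledge base images and a sequence layer code stream that contains the images obtained by encoding each frame of the video sequence with reference to the knowledge base images. Multiple discontinuous segments of the sequence layer code stream may reference the same knowledge layer segment, so a client needs that same knowledge layer segment to decode each of these discontinuous sequence layer segments.
In the prior art, a client decodes the data of different sequence layer segments independently. The knowledge layer segment referenced by an earlier sequence layer segment is cleared once decoding of the next sequence layer segment starts, and the client cannot retain knowledge layer segments as needed. If a later sequence layer segment references the same knowledge layer segment as an earlier one, the client has to request that knowledge layer segment again. Repeatedly requesting the same knowledge layer segment wastes data transmission bandwidth, and repeatedly storing it wastes the client's storage space.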
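Summary
This application provides a method and an apparatus for processing code stream data, which make the storage time of knowledge layer segments more controllable and improve the applicability of code stream data management.
A first aspect provides a method for processing code stream data, which may include:
acquiring, by a client, management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment included in a code stream, the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments included in the code stream, and the management data is used to determine a preset effective time;
parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment, where the at least two discontinuous sequence layer segments are decoded within the preset effective time;
determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment; and
deleting, by the client, the target knowledge layer segment at the deletion time of the target knowledge layer segment.
In this application, the client manages the target knowledge layer segment by its preset effective time and deletes it at its deletion time. Determining the deletion time from the preset effective time ensures that, within the preset effective time, the sequence layer segments that depend on the target knowledge layer segment can be decoded without reloading it, which avoids wasting data transmission bandwidth. Deleting the target knowledge layer segment at its deletion time also reduces the local storage space it occupies on the client and improves the applicability of code stream data management.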
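With reference to the first aspect, in a first possible implementation, the management data of the target knowledge layer segment is an initialization segment of the code stream or a media presentation description (MPD) of the code stream;
the parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment includes:
acquiring, by the client, the effective duration of the target knowledge layer segment carried in the initialization segment, or the effective duration of the target knowledge layer segment described in the MPD, as the preset effective time L of the target knowledge layer segment;
the determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment includes:
acquiring, by the client, a dependent time T1 at which the target knowledge layer segment is depended on by a target sequence layer segment, where the target sequence layer segment is the first of the at least two discontinuous sequence layer segments; and
calculating T2, where T2=T1+L, and determining T2 as the deletion time of the target knowledge layer segment.
In this application, the client may obtain the effective duration of the target knowledge layer segment from the initialization segment of the code stream and use it to manage how long the target knowledge layer segment is kept in the client's local storage. This ensures that the sequence layer segments that depend on the target knowledge layer segment are decoded within its effective duration without reloading it, which avoids the bandwidth waste caused by repeatedly downloading knowledge layer segments. The effective duration of the target knowledge layer segment may also be obtained from the MPD of the code stream, which simplifies data transmission, saves data transmission resources, and improves the applicability of knowledge layer segment management.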
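With reference to the first aspect, in a second possible implementation, the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
the parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment includes:
acquiring, by the client, an initial effective time T3 of the target knowledge layer segment and an effective duration L1 of the target knowledge layer segment that are carried in the segment information of the target knowledge layer segment, where T3 and L1 serve as the preset effective time of the target knowledge layer segment;
the determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment includes:
calculating, by the client, T4, where T4=T3+L1, and determining T4 as the deletion time of the knowledge layer segment.
In this application, after obtaining the initial effective time and the effective duration of the target knowledge layer segment from the target knowledge layer segment, the client may determine the deletion time of the target knowledge layer segment from them. This improves the precision with which the local storage time of the target knowledge layer segment is managed, reduces the client memory wasted on managing the target knowledge layer segment, and improves the applicability of knowledge layer segment management.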
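With reference to the first aspect, in a third possible implementation, the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
the parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment includes:
acquiring, by the client, a failure time T5 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, where T5 serves as the end time of the preset effective time of the target knowledge layer segment;
the determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment includes:
determining, by the client, T5 as the deletion time of the knowledge layer segment.
In this application, the client may obtain the failure time of the target knowledge layer segment from its segment information and determine the failure time as its deletion time. This is simple to operate and improves the precision with which the storage time of knowledge layer segments is managed. Further, the target knowledge layer segment may be deleted at the deletion time, which reduces the client memory wasted on managing knowledge layer segments and improves the applicability of knowledge layer segment management.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation, the at least two discontinuous sequence layer segments are decoded before T5.
This application sets the deletion time of the target knowledge layer segment after decoding of the sequence layer segments that depend on it is complete, and deletes the target knowledge layer segment only after those sequence layer segments are decoded. This both ensures correct decoding of the sequence layer segments and improves the applicability of code stream data processing.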
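A second aspect provides an apparatus for processing code stream data, which may include:
an acquiring unit, configured to acquire management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment included in a code stream, the target knowledge layer segment is depended on by at least two discontinuous sequence layer segments included in the code stream, and the management data is used to determine a preset effective time;
a parsing unit, configured to parse the management data acquired by the acquiring unit to obtain the preset effective time of the target knowledge layer segment, where the at least two discontinuous sequence layer segments are decoded within the preset effective time;
a determining unit, configured to determine a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment obtained by the parsing unit; and
a deleting unit, configured to delete the target knowledge layer segment at the deletion time of the target knowledge layer segment determined by the determining unit.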
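With reference to the second aspect, in a first possible implementation, the management data of the target knowledge layer segment is an initialization segment of the code stream or a media presentation description (MPD) of the code stream;
the parsing unit is specifically configured to:
parse the management data of the target knowledge layer segment acquired by the acquiring unit, and acquire the effective duration of the target knowledge layer segment carried in the initialization segment, or the effective duration of the target knowledge layer segment described in the MPD, as the preset effective time L of the target knowledge layer segment;
the determining unit is specifically configured to:
acquire a dependent time T1 at which the target knowledge layer segment is depended on by a target sequence layer segment, where the target sequence layer segment is the first of the at least two discontinuous sequence layer segments; and
calculate T2 using the L obtained by the parsing unit, where T2=T1+L, and determine T2 as the deletion time of the target knowledge layer segment.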
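With reference to the second aspect, in a second possible implementation, the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
the parsing unit is specifically configured to:
parse the management data of the target knowledge layer segment acquired by the acquiring unit, and acquire the initial effective time T3 of the target knowledge layer segment and the effective duration L1 of the target knowledge layer segment that are carried in the segment information of the target knowledge layer segment, where T3 and L1 serve as the preset effective time of the target knowledge layer segment;
the determining unit is specifically configured to:
calculate T4 from the T3 and the L1 obtained by the parsing unit, where T4=T3+L1, and determine T4 as the deletion time of the knowledge layer segment.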
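With reference to the second aspect, in a third possible implementation, the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
the parsing unit is specifically configured to:
parse the management data of the target knowledge layer segment acquired by the acquiring unit, and acquire the failure time T5 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, where T5 serves as the end time of the preset effective time of the target knowledge layer segment;
the determining unit is specifically configured to:
determine the T5 obtained by the parsing unit as the deletion time of the knowledge layer segment.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation, the at least two discontinuous sequence layer segments are decoded before the T5 obtained by the parsing unit.
In this application, the apparatus for processing code stream data manages the target knowledge layer segment by its preset effective time and deletes it at its deletion time. Determining the deletion time from the preset effective time ensures that, within the preset effective time, the sequence layer segments that depend on the target knowledge layer segment can be decoded without reloading it, which avoids wasting data transmission bandwidth. Deleting the target knowledge layer segment at its deletion time also reduces the local storage space it occupies on the client and improves the applicability of code stream data management.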
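A third aspect provides a client, which may include a memory and a processor, where the memory is connected to the processor;
the memory is configured to store a set of program code; and
the processor is configured to invoke the program code stored in the memory to perform the method for processing code stream data provided in the first aspect.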
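Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative effort.
FIG. 1 is a schematic diagram of an example framework of DASH-standard transmission used for system-layer video streaming media transmission;
FIG. 2 is a schematic structural diagram of an MPD in DASH-standard transmission used for system-layer video streaming media transmission;
FIG. 3 is a schematic diagram of multiple mutually independent random access segments;
FIG. 4 is a schematic diagram of a knowledge base providing an encoding reference for random access segments;
FIG. 5 is a schematic flowchart of a method for processing code stream data according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an apparatus for processing code stream data according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a client according to an embodiment of the present invention.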
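Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Current client-driven system-layer video streaming media transmission solutions may use the standard framework of dynamic adaptive streaming over HTTP (DASH), where HTTP is the hypertext transfer protocol, as shown in FIG. 1, which is a schematic diagram of an example framework of DASH-standard transmission used for system-layer video streaming media transmission. The data transmission process of the system-layer video streaming media transmission solution includes two processes: a process in which a server side (for example, an HTTP server, hereinafter referred to as the server) generates media data for video content, and a process in which a client (for example, an HTTP streaming media client) requests and obtains the media data from the server. The media presentation on the server includes multiple description layers, and each description layer describes multiple segments. The HTTP streaming request control module of the client obtains the media presentation description (MPD) sent by the server, analyzes the MPD to determine the segments to request, requests the corresponding segments from the server through its HTTP request and receiving end, and decodes and plays them with a media player.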
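1) In the process in which the server generates media data for video content, the media data generated by the server includes video code streams of different versions of the same video content and the MPD of the code streams. For example, for the video content of the same episode of a TV series, the server generates a code stream with low resolution, low bit rate, and low frame rate (such as 360p resolution, 300 kbps bit rate, and 15 fps frame rate), a code stream with medium resolution, medium bit rate, and high frame rate (such as 720p resolution, 1200 kbps bit rate, and 25 fps frame rate), and a code stream with high resolution, high bit rate, and high frame rate (such as 1080p resolution, 3000 kbps bit rate, and 25 fps frame rate).
In addition, the server may generate an MPD of the code stream for the video content of the episode. FIG. 2 is a schematic structural diagram of an MPD in the DASH standard of the system transmission solution. The MPD of the code stream contains multiple description layers (Representation). For example, the period start=100s part of the media presentation (Media Presentation) in FIG. 2 may contain multiple description layers such as Representation1, Representation2, and so on. Each description layer describes one or more segments of the code stream. The description layers contained in the MPD may be independent of one another or may depend on one another. Description layers are independent when their encoding and decoding do not reference other description layers (for example, a description layer that describes knowledge layer segments, whose encoding and decoding do not reference other segments); description layers are interdependent when their encoding and decoding need to reference other description layers (for example, a description layer that describes sequence layer segments, whose encoding and decoding need to reference knowledge layer segments). Each description layer describes, in time order, the information of several segments (Segment), for example an initialization segment (Initialization segment), Media Segment1, Media Segment2, …, Media Segment20, and all segments are contiguous in time. Each segment contains the video code stream for one time period, and the description of a segment in the description layer includes segment information such as its playback start time, playback duration, and network storage address (for example, a network storage address expressed as a uniform resource locator (URL)).
Further, a segment may be subdivided into multiple subsegments (Subsegment), each containing part of the segment. The information of a subsegment includes its playback start time, playback duration, and byte range (Byte Range) within the code stream of the segment to which it belongs. Subsegment information is described by a segment index (Segment Index), and each segment index describes the information of all subsegments in one segment. A segment index may be merged with the segment and stored at the start of the segment, or stored independently in an index segment (Index Segment). For more on subsegments, see the DASH standard of the system transmission solution; no limitation is imposed here.
2) In the process in which the client requests and obtains media data from the server, when a user chooses to play a video, the client obtains the corresponding MPD from the server according to the video content requested by the user, and then generates a segment list according to the segment information of the video content described in the MPD. The segment list records the playback period and the network storage address of each segment. According to the on-demand time selected by the user, the client obtains from the segment list the network storage addresses of one or more segments whose playback period covers that on-demand time, and sends the server a request to download the video segment data at those addresses. The server sends the video segment data to the client according to the received request. After obtaining the video segment data sent by the server, the client can decode and play it with a media player.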
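The segment-list lookup in this step can be sketched as follows. The record type, its field names, and the example URLs are assumptions made for illustration; the text only states that the list records each segment's playback period and network storage address.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentEntry:
    start: float     # playback start time, in seconds
    duration: float  # playback duration, in seconds
    url: str         # network storage address


def segments_covering(segment_list: List[SegmentEntry], on_demand_time: float) -> List[str]:
    """Return the addresses of segments whose playback period covers the given time."""
    return [s.url for s in segment_list
            if s.start <= on_demand_time < s.start + s.duration]


# Hypothetical segment list built from an MPD: three contiguous 10-second segments.
SEGMENTS = [
    SegmentEntry(0.0, 10.0, "http://example.com/seg1.m4s"),
    SegmentEntry(10.0, 10.0, "http://example.com/seg2.m4s"),
    SegmentEntry(20.0, 10.0, "http://example.com/seg3.m4s"),
]
```

The half-open interval test means a time that falls exactly on a boundary selects only the segment that starts there.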
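The system-layer video streaming media transmission solution uses the DASH standard and transmits video data by having the client analyze the MPD, request video data from the server on demand, and receive the data sent by the server. This DASH-based solution is mainly applicable to video code streams produced by conventional video coding (for example, coding standards such as H.264 and High Efficiency Video Coding (HEVC)). FIG. 3 is a schematic diagram of multiple mutually independent random access segments. A dot represents a random access point, a square represents the random access segment that follows a random access point, and a dashed arrow marked with an x indicates that the random access segment the arrow points to cannot reference, during encoding, the information of the random access segment at which the dashed line starts. That is, in conventional video encoding and decoding, an image in a random access segment can serve as a reference picture (reference frame) only for other images in the same random access segment; inter prediction across random access points is not allowed, which considerably limits the efficiency of video encoding and decoding.
LBVC extracts the common image information in multiple random access segments (which contains the mutual information between random access segments, that is, the information that images in different random access segments reference from one another during encoding and decoding) and organizes it into a knowledge base. This common image information is encoded only once, and the images in each random access segment are allowed to reference it for encoding (and decoding). The encoder (or decoder) can thus exploit the mutual information between random access segments to further remove redundancy from the video sequence, which improves the coding efficiency of the entire video sequence, reduces storage space, and saves transmission bandwidth. FIG. 4 is a schematic diagram, in library-based video coding, of one knowledge base providing an encoding reference for other random access segments. A dot represents a random access point, a square represents the random access segment that follows a random access point, and an arrow indicates that multiple random access segments use the information provided by the knowledge base (Library) as a reference during encoding.
This library-based coding method extracts similar content that appears multiple times in a video and places it in the knowledge base, and improves coding efficiency by referencing images in the knowledge base. In this case, a random access point image may be encoded or decoded with reference to an image in the knowledge base, or may directly use the conventional intra coding method. A random access point image does not depend on other images in the video sequence for encoding or decoding, so the random access segments remain independent of one another.
In the existing DASH standard, the dependency (or reference relationship) between knowledge layer segments and sequence layer segments is described in the MPD. In the MPD, each description layer has its own ID. The syntax at the description layer level (that is, the representation level) of the MPD (the syntax used to describe the attribute information of a description layer) contains an attribute dependencyId, which indicates the ID of another description layer on which the decoding or presentation of the description layer containing the attribute depends. When requesting data of a segment (say segment1) of a representation that carries the dependencyId attribute, the client must also obtain the segment (say segment2) on which segment1 depends in order to decode or present segment1 correctly. The following describes the dependency among segments in representations with reference to the description of some representations in an MPD; information at levels above the representation is not detailed: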
Figure PCTCN2017073623-appb-000001
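The above description shows that the representation whose Id is "tag6" depends on the representation whose Id is "tag5"; that is, decoding the segments described in the representation with Id "tag6" depends on the segments described in the representation with Id "tag5". The above description uses an index segment to describe the URLs of the segments.
The syntax format of the index segment is described in ISO/IEC 14496-12 as follows: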
Figure PCTCN2017073623-appb-000002
Figure PCTCN2017073623-appb-000003
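The syntax elements in the above description have the following meanings:
reference_ID: the ID of the code stream;
timescale: the time unit;
earliest_presentation_time: the earliest presentation time of the code stream described in the index segment, in units of timescale;
first_offset: the starting offset of the first segment after the index segment;
reference_count: the number of segments described in the index segment;
reference_type: 1 indicates that the segment is an index segment, and 0 indicates that the segment is media content;
referenced_size: the size of the segment;
subsegment_duration: the duration of the segment, in units of timescale;
starts_with_SAP: the stream access type of the segment;
SAP_delta_time: the earliest presentation time of the first stream access point.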
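Because the syntax itself appears only as a figure here, the following is a sketch of how the listed fields could be read from a sidx box body, based on the Segment Index Box layout as I understand it from ISO/IEC 14496-12 (a FullBox header, 32-bit or 64-bit time and offset fields depending on version, then 12 bytes per reference). The function name and the dictionary output are illustrative choices, and the layout should be checked against the standard before use.

```python
import struct


def parse_sidx_payload(buf: bytes) -> dict:
    """Parse a sidx box body, i.e. the bytes after the 8-byte size/type header."""
    version = buf[0]                     # FullBox: 1-byte version + 3-byte flags
    pos = 4
    reference_id, timescale = struct.unpack_from(">II", buf, pos)
    pos += 8
    if version == 0:                     # 32-bit fields in version 0
        earliest, first_offset = struct.unpack_from(">II", buf, pos)
        pos += 8
    else:                                # 64-bit fields otherwise
        earliest, first_offset = struct.unpack_from(">QQ", buf, pos)
        pos += 16
    _reserved, reference_count = struct.unpack_from(">HH", buf, pos)
    pos += 4
    references = []
    for _ in range(reference_count):
        word_a, subsegment_duration, word_b = struct.unpack_from(">III", buf, pos)
        pos += 12
        references.append({
            "reference_type": word_a >> 31,          # top bit
            "referenced_size": word_a & 0x7FFFFFFF,  # low 31 bits
            "subsegment_duration": subsegment_duration,
            "starts_with_SAP": word_b >> 31,
            "SAP_type": (word_b >> 28) & 0x7,
            "SAP_delta_time": word_b & 0x0FFFFFFF,
        })
    return {
        "reference_ID": reference_id,
        "timescale": timescale,
        "earliest_presentation_time": earliest,
        "first_offset": first_offset,
        "references": references,
    }
```

Summing `referenced_size` values from `first_offset` onward gives the byte ranges that the client later uses to address individual segments.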
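For the information described in the above MPD, the client obtains the code stream data in the following steps:
1. The client receives the MPD containing the above information from the server, and parses the information in the MPD to obtain the dependencies between representations and the information about the index segment.
2. The client selects the representation to request according to information such as the on-demand time selected when the user requests the video, for example, the representation with id="tag5".
3. After determining the representation to request, the client constructs the URL for requesting the index segment according to the indexRange information in the MPD, for example http://example.com/video-512k.mp4/0-4332, and then requests the index segment using that URL.
4. The client obtains the index segment, parses the sidx box information of the index segment to obtain the segment information, constructs the URL of a segment from the segment information, and requests the segment using that URL. The sidx box is the specific syntax box in the segment named index segment.
5. The client requests a segment of the representation with id="tag6". Specifically, the client first requests the index segment of the representation with id="tag6", parses the index segment to obtain the segment information, constructs the URL of the segment from the segment information, and requests the segment using that URL.
Specifically, according to the information about the time point at which the code stream is to be switched, including the information of the i-th segment of the representation with id="tag5" and the information of the i-th segment of the representation with id="tag6", the client determines the URLs of the i-th segment in the representation with id="tag5" and of the i-th segment in the representation with id="tag6" to download. For example, suppose the switching time point is the 1-minute mark on the playback progress timeline of the client player. If the range information of the i-th segment of the representation with id="tag5" at that time point is 10000-10500, the URL of that segment is http://example.com/video-512k.mp4/10000-10500; if the range information of the i-th segment of the representation with id="tag6" at that time point is 9000-9400, the URL of that segment is http://example.com/video-768k.mp4/9000-9400. When the client decodes, the tag6 segment depends on the data of the tag5 segment.
After determining the URLs of the two segments, the client sends segment requests to the server for the URLs http://example.com/video-512k.mp4/10000-10500 and http://example.com/video-768k.mp4/9000-9400. After receiving the client's requests, the server sends the data of the two segments to the client.
6. The client sends the received segment data to the decoder for decoding.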
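The URL construction used in steps 3 to 5 follows one pattern: the base URL of the representation followed by a byte range. A minimal sketch is shown below; the function name is hypothetical, and it simply mirrors the URL form used in the examples above.

```python
def range_url(base_url: str, first_byte: int, last_byte: int) -> str:
    """Build a segment URL of the form <base>/<first>-<last>, as in the examples above."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    return f"{base_url}/{first_byte}-{last_byte}"
```

Since the tag6 segment depends on the tag5 segment, a client would build and request the tag5 URL first and the tag6 URL second.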
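In the above implementation, the client decodes the data of different segments independently. After the client decodes segment1, segment2, on which segment1 depends, is cleared before the client decodes the next segment (say segment3). If a subsequent segment (say segment4) also depends on segment2, segment2 has to be requested again. The client cannot determine how long to keep segment2 from its dependency status (that is, the fact that segment2 is depended on by multiple other segments), which leads to repeated requests for and downloads of segment2 and wastes the client's bandwidth. To address this, the embodiments of the present invention provide a method for processing code stream data that stores or manages knowledge layer segments according to storage time information of the knowledge layer segments in the knowledge layer code stream, which reduces the number of repeated requests for knowledge layer segments and saves the client's data transmission bandwidth.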
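The cost of the behaviour described here can be made concrete with a small count of downloads. The model below is a simplification chosen for illustration: without retention every dependent decode triggers a download, and with retention a depended-on segment is downloaded once (its effective time is assumed to cover all dependents). The function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def count_downloads(deps: Sequence[Optional[str]], retain: bool) -> int:
    """Count downloads of depended-on segments.

    deps lists, in decode order, the segment each decoded segment depends on
    (None if it depends on nothing).
    """
    downloads = 0
    kept = set()
    for needed in deps:
        if needed is None:
            continue
        if needed not in kept:
            downloads += 1
        if retain:               # keep the segment for later dependents
            kept.add(needed)
    return downloads


# segment1 depends on segment2, segment3 on nothing, segment4 on segment2 again.
DEPS = ["segment2", None, "segment2"]
```

For the scenario in the text, `count_downloads(DEPS, retain=False)` yields two downloads of segment2, while `retain=True` yields one.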
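FIG. 5 is a schematic flowchart of a method for processing code stream data according to an embodiment of the present invention. The method provided in this embodiment includes the following steps:
S101: The client acquires management data of a target knowledge layer segment.
In some feasible implementations, when generating the media data of a video code stream, the server may set the preset effective time of the target knowledge layer segment in advance according to the period during which the sequence layer segments that depend on the target knowledge layer segment are encoded. The sequence layer segments that depend on the target knowledge layer segment are encoded within the preset effective time of the target knowledge layer segment. When decoding the video data, the client decodes the sequence layer segments that depend on the target knowledge layer segment within the preset effective time of the target knowledge layer segment.
In a specific implementation, the target knowledge layer segment is one of multiple knowledge layer segments obtained by splitting the knowledge layer code stream, and is depended on by at least two discontinuous sequence layer segments. The multiple segments obtained by splitting the sequence layer code stream are called sequence layer segments. The sequence layer segments include continuous and discontinuous sequence layer segments, that is, segments that are continuous in time and segments that are discontinuous in time, and at least one sequence layer segment is encoded with one or more knowledge layer segments as reference segments. The server may determine a knowledge layer segment that is depended on by at least two discontinuous sequence layer segments as the target knowledge layer segment. For example, suppose the target knowledge layer segment is depended on by sequence layer segments 1, 2, 4, and 5. Sequence layer segments 1 and 2 are continuous in time, as are sequence layer segments 4 and 5; sequence layer segments 1 and 4, and 2 and 4, are discontinuous in time, as are sequence layer segments 1 and 5, and 2 and 5. The target knowledge layer segment may also be a segment depended on only by sequence layer segments 2 and 4; that is, the sequence layer segments that depend on the target knowledge layer segment include at least two discontinuous segments.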
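The selection rule in this paragraph can be sketched with segment indices. The sketch assumes that sequence layer segments are numbered in time order and that two segments are continuous exactly when their indices differ by one; both the numbering convention and the function names are assumptions for illustration.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def discontinuous_pairs(dependent_indices: Iterable[int]) -> List[Tuple[int, int]]:
    """List the pairs of dependent sequence layer segments that are discontinuous in time."""
    ordered = sorted(set(dependent_indices))
    return [(a, b) for a, b in combinations(ordered, 2) if b - a > 1]


def is_target_knowledge_segment(dependent_indices: Iterable[int]) -> bool:
    """True if at least two dependent sequence layer segments are discontinuous."""
    return len(discontinuous_pairs(dependent_indices)) > 0
```

For the dependents 1, 2, 4, and 5 in the example, the discontinuous pairs are (1, 4), (1, 5), (2, 4), and (2, 5), matching the pairs named in the text.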
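In specific implementations, there may be multiple knowledge layer segments that are depended on by at least two non-consecutive sequence layer segments; the embodiments of the present invention take any one of them as the target knowledge layer segment for illustration. The server may encapsulate information such as the preset effective time of the target knowledge layer segment in the media data of the bitstream, and return the bitstream and its media data to the client when the client sends a request for the media data.
In some feasible implementations, the client may obtain the bitstream and its media data sent by the server and parse them to obtain the management data of the target knowledge layer segment. The management data is used to determine the preset effective time of the knowledge layer segment. In specific implementations, the management data of the target knowledge layer segment may include the initialization segment of the bitstream, the MPD of the bitstream, or a knowledge layer segment of the bitstream; the choice depends on the actual application scenario and is not limited here.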
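S102: The client parses the management data to obtain the preset effective time of the target knowledge layer segment.
In specific implementations, the preset effective time of the target knowledge layer segment may be the effective duration of the target knowledge layer segment carried in the initialization segment of the bitstream or described in the MPD of the bitstream. It may also be the effective start time and effective duration, or the expiration time, carried in the segment information of the target knowledge layer segment. The choice depends on the actual application scenario and is not limited here. How the preset effective time is determined is described below together with step S103.
S103: The client determines the deletion time of the target knowledge layer segment according to its preset effective time.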
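In some feasible implementations, the management data of the target knowledge layer segment may be the initialization segment of the bitstream, which carries the effective duration of the target knowledge layer segment. In specific implementations, a knowledge layer segment obtained by splitting the knowledge layer bitstream may be one frame of data in the bitstream, that is, one knowledge layer segment is one video frame. Specifically, the client may parse the management data to obtain the effective duration contained in the initialization segment. The effective duration of the target knowledge layer segment may be added to the initialization segment of the bitstream in the following syntax format, and the client may parse an initialization segment described in this format to obtain the effective duration.
Syntax format:
[Figure PCTCN2017073623-appb-000004: syntax of the box (the edur box) that carries timescale and duration in the initialization segment; image not reproduced]
The syntax elements in the above syntax format have the following meanings:
timescale: the time unit or time scale;
duration: the effective duration, in units of timescale.
The effective duration of the knowledge layer segment expressed by these syntax elements is the ratio duration/timescale. For example, if timescale=1000 and duration=100000, the effective duration of the knowledge layer segment is 100000/1000 seconds, that is, 100 seconds. In specific applications, timescale may be absent from the edur box, in which case another timescale in the file may be used.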
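The client may request the initialization segment of the bitstream from the server. On obtaining the initialization segment, the client may parse it to obtain the effective duration of the target knowledge layer segment (denoted L). The client may then request the target knowledge layer segment from the server as required by the actual application scenario, such as a user's on-demand request or the decoding needs of a sequence layer segment, and the server returns the target knowledge layer segment in response. After obtaining the target knowledge layer segment sent by the server, the client may record the moment at which it is first referenced. This moment may be the moment at which the target knowledge layer segment is depended on by the target sequence layer segment (denoted T1), where the target sequence layer segment is the first to be decoded of the at least two sequence layer segments that depend on the target knowledge layer segment. In practice, this moment may be the moment at which the target knowledge layer segment is fed into the decoder. The client may then calculate T2, where T2=T1+L, and take T2 as the deletion time of the target knowledge layer segment. For example, the client may start a timer when the knowledge layer segment is fed into the decoder, stop it when the elapsed time equals the effective duration L of the knowledge layer segment recorded in the above syntax elements, and take the moment at which the timer stops as the deletion time of the target knowledge layer segment.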
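The calculation T2=T1+L, with L derived from duration/timescale, can be sketched as follows. The function names are hypothetical, and times are assumed to be expressed in seconds on the client's own clock.

```python
def effective_duration_seconds(duration: int, timescale: int) -> float:
    """Convert a duration in timescale units to seconds (L = duration / timescale)."""
    if timescale <= 0:
        raise ValueError("timescale must be positive")
    return duration / timescale


def deletion_time(t1_seconds: float, duration: int, timescale: int) -> float:
    """Return T2 = T1 + L, where T1 is when the segment was fed to the decoder."""
    return t1_seconds + effective_duration_seconds(duration, timescale)
```

For the values used in the text (timescale=1000, duration=100000), L is 100 seconds, so a segment fed into the decoder at T1 = 12.5 s would be deleted at T2 = 112.5 s.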
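In the above implementation, once the client obtains the initialization segment of the bitstream, it can determine the effective duration of the target knowledge layer segment. After the target knowledge layer segment has been downloaded to the client's local storage, the client can manage it there using the maximum effective time (that is, the effective duration of the target knowledge layer segment). Managing the target knowledge layer segment in this way ensures that, within its effective duration, whenever decoding a non-knowledge-layer segment (that is, a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage instead of requesting it again. Locally managed knowledge layer segments are reused, which avoids the bandwidth waste caused by downloading them repeatedly.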
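The lookup-before-request behaviour described above can be sketched as a small local store. This is an illustrative sketch only: the class `KnowledgeSegmentStore`, the function `obtain_segment`, and the `download` callback are hypothetical names, and the lifetime is assumed to start at the moment the segment is first obtained.

```python
from typing import Callable, Dict, Optional, Tuple


class KnowledgeSegmentStore:
    """Local storage mapping a segment id to (data, deletion_time)."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[bytes, float]] = {}

    def put(self, seg_id: str, data: bytes, deletion_time: float) -> None:
        self._items[seg_id] = (data, deletion_time)

    def get(self, seg_id: str, now: float) -> Optional[bytes]:
        entry = self._items.get(seg_id)
        # A segment is usable only before its deletion time.
        if entry is None or now >= entry[1]:
            return None
        return entry[0]


def obtain_segment(store: KnowledgeSegmentStore, seg_id: str, now: float,
                   lifetime: float, download: Callable[[str], bytes]) -> bytes:
    """Look in local storage first; download only when the segment is missing."""
    data = store.get(seg_id, now)
    if data is None:
        data = download(seg_id)
        store.put(seg_id, data, now + lifetime)
    return data
```

Within the effective duration, a second sequence layer segment that needs the same knowledge layer segment is served from local storage, so only one download takes place.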
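Further, in some feasible implementations, the effective duration of the target knowledge layer segment may instead be carried in the MPD of the bitstream. When generating the MPD, the server may add the effective duration of the target knowledge layer segment to it. Specifically, the server may add a new syntax element, for example @EffectiveDuration, to a description layer in the MPD. The syntax element @EffectiveDuration indicates that the effective duration of the knowledge layer segment described by the description layer in which it appears is the value of EffectiveDuration, expressed in units of the timescale attribute in the MPD. For example, if the value of EffectiveDuration is 100000 and the value of timescale in the MPD is 1000, the effective duration of the knowledge layer segment described by that description layer is 100 seconds.
The client may request the MPD of the bitstream from the server and parse it to obtain the effective duration of the target knowledge layer segment carried in the MPD (for example, the value of EffectiveDuration in units of the timescale attribute); this effective duration is again denoted L. On obtaining the target knowledge layer segment sent by the server, the client may record the moment at which it is first referenced. This moment may be the moment at which the target knowledge layer segment is depended on by the target sequence layer segment (denoted T1), where the target sequence layer segment is the first to be decoded of the at least two sequence layer segments that depend on the target knowledge layer segment. In practice, this moment may be the moment at which the target knowledge layer segment is fed into the decoder. The client may then calculate T2, where T2=T1+L, and take T2 as the deletion time of the target knowledge layer segment. For example, the client may start a timer when the knowledge layer segment is fed into the decoder, stop it when the elapsed time is greater than or equal to the effective duration given by the value of EffectiveDuration in units of the timescale attribute, and take the moment at which the timer stops as the deletion time of the knowledge layer segment.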
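Reading such an attribute from an MPD can be sketched as follows. The MPD fragment is a made-up, namespace-free example: placing EffectiveDuration and timescale as attributes of a Representation element is an assumption for illustration, since the text only states that the element is added to a description layer and uses the timescale attribute in the MPD.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical MPD fragment; attribute placement is an assumption.
SAMPLE_MPD = """<MPD>
  <Period>
    <AdaptationSet>
      <Representation id="knowledge" timescale="1000" EffectiveDuration="100000"/>
      <Representation id="tag5" bandwidth="512000"/>
    </AdaptationSet>
  </Period>
</MPD>"""


def effective_durations(mpd_xml: str) -> Dict[str, float]:
    """Map representation id to effective duration in seconds."""
    root = ET.fromstring(mpd_xml)
    result: Dict[str, float] = {}
    for rep in root.iter("Representation"):
        raw = rep.get("EffectiveDuration")
        if raw is None:
            continue  # not a knowledge layer description
        timescale = int(rep.get("timescale", "1"))
        result[rep.get("id", "")] = int(raw) / timescale
    return result
```

The resulting value plays the role of L in T2=T1+L.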
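Describing the effective duration of the target knowledge layer segment in the MPD makes data transmission simpler, saves data transmission resources, and makes the management of knowledge layer segments more widely applicable.
Further, in the above implementations, when adding the effective duration of knowledge layer segments to the initialization segment, the server may set the same effective duration for every knowledge layer segment. In practice, however, different knowledge layer segments are used for different lengths of time. For example, suppose the effective usage time of knowledge layer segment 2 (the period during which it is decoded or depended on) is 5 seconds, while the effective duration that the server adds to the initialization segment for every knowledge layer segment is 50 seconds. Knowledge layer segment 2 is no longer used 5 seconds after it is fed into the decoder, yet it is kept on the client for another 45 seconds, which tends to waste the client's local storage space. Therefore, in some feasible implementations, the server may encapsulate each knowledge layer segment separately and carry the preset effective time of each knowledge layer segment in its own segment information. Specifically, the preset effective time of the target knowledge layer segment may be added to the segment information of the target knowledge layer segment. The preset effective time may include the effective start time of the target knowledge layer segment (denoted T3) and its effective duration (denoted L1). Specifically, the server may encapsulate the target knowledge layer segment in the following syntax format, carrying the effective start time and the effective duration in the target knowledge layer segment.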
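Syntax format:
-Effective duration box(‘efdu’)
aligned(8)class EffectiveDurationbox extends FullBox(‘efdu’,version,flag){
unsigned int(32)timescale;
unsigned int(32)start_time;
unsigned int(32)duration;
}
The syntax elements in the above syntax format have the following meanings:
timescale: the time unit or time scale;
start_time: the effective start time, in units of timescale;
duration: the effective duration, in units of timescale.
In specific implementations, the timescale syntax element may be omitted.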
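Parsing a box with this layout can be sketched as follows. The sketch assumes the usual ISO base media file format FullBox header (32-bit size, 4-character type, 8-bit version, 24-bit flags, all big-endian) and assumes that all three fields are present; the function name `parse_efdu` is hypothetical.

```python
import struct
from typing import Dict


def parse_efdu(box: bytes) -> Dict[str, int]:
    """Parse an 'efdu' box: FullBox header followed by three 32-bit fields."""
    if len(box) < 24:
        raise ValueError("efdu box too short")
    # size (4 bytes), type (4 bytes), version (1 byte) + flags (3 bytes)
    _size, box_type, version_flags = struct.unpack(">I4sI", box[:12])
    if box_type != b"efdu":
        raise ValueError("not an efdu box")
    timescale, start_time, duration = struct.unpack(">III", box[12:24])
    return {
        "version": version_flags >> 24,
        "timescale": timescale,
        "start_time": start_time,
        "duration": duration,
    }
```

A client could call `parse_efdu` on the box bytes extracted from the target knowledge layer segment and then divide start_time and duration by timescale to obtain seconds.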
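In specific implementations, the client may request the target knowledge layer segment from the server as required by the actual application scenario, such as a user's on-demand request or the decoding needs of a sequence layer segment. After obtaining the target knowledge layer segment sent by the server, the client may parse the efdu box in it to obtain information such as the start_time and duration contained in the segment information of the target knowledge layer segment, and determine the preset effective time of the knowledge layer segment from that information. The start_time may be the moment at which the target knowledge layer segment is first referenced, for example the moment at which it is fed into the decoder; the choice depends on the actual application scenario and is not limited here. The client may also calculate T4, the sum of the effective start time T3 and the effective duration L1 of the target knowledge layer segment, that is, T4=T3+L1, and take T4 as the deletion time of the knowledge layer segment. For example, the client may directly calculate the last effective moment of the knowledge layer segment as start_time+duration and take that moment as the deletion time of the knowledge layer segment. In addition, the client may calculate the remaining effective duration L11 of the target knowledge layer segment from its effective start time start_time (also denoted T3) and the current media data processing time T31 (for example, the decoding time of the media data), where L11=duration-(T31-start_time). All times in the above formulas are expressed in the same time unit.
It should be noted that if, in a specific implementation, the efdu box in the knowledge layer segment contains only the duration of the knowledge layer segment, the value of duration may be taken as the effective duration of the knowledge layer segment, and the current media data processing time may be taken as its start_time. The choice depends on the actual application scenario and is not limited here.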
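The formulas T4=T3+L1 and L11=duration-(T31-start_time), together with the fallback for a missing start_time, can be sketched as follows. All values are assumed to be in the same time unit, as the text requires; the function names are hypothetical.

```python
from typing import Optional


def deletion_time(start_time: Optional[int], duration: int, now: int) -> int:
    """Return T4 = T3 + L1; if start_time is absent, use the current time as T3."""
    t3 = now if start_time is None else start_time
    return t3 + duration


def remaining_duration(start_time: int, duration: int, now: int) -> int:
    """Return L11 = duration - (T31 - start_time), where now plays the role of T31."""
    return duration - (now - start_time)
```

For instance, with start_time=5000, duration=100000, and a current processing time of 20000 (all in timescale units), the deletion time is 105000 and the remaining effective duration is 85000; if only duration were present, the deletion time would be 120000.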
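In the above implementation, after obtaining the effective start time and effective duration of the target knowledge layer segment, the client can manage the locally stored target knowledge layer segment according to the effective start time or the current media data processing time, together with the effective duration. This improves the precision with which the target knowledge layer segment is managed, further reduces wasted client bandwidth, reduces the memory the client wastes on managing knowledge layer segments, and makes the management of knowledge layer segments more widely applicable.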
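Further, in some feasible implementations, the server may also encapsulate each knowledge layer segment separately and carry the expiration time (also called the timeout) of each knowledge layer segment in its segment information, using the expiration time to instruct the client how to manage the knowledge layer segment. Specifically, the expiration time of the target knowledge layer segment may be carried in the segment information of the target knowledge layer segment. The server may encapsulate the target knowledge layer segment in the following syntax format, carrying its expiration time in the target knowledge layer segment.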
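Syntax format:
-Expire date box(‘expd’)
aligned(8)class Expiredatebox extends FullBox(‘expd’,version,flag){
unsigned int(32)timescale;
unsigned int(32)Expiredate;
}
The syntax elements in the above syntax format have the following meanings:
timescale: the time unit or time scale;
Expiredate: the expiration time.
In specific implementations, the timescale syntax element may be omitted.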
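In specific implementations, after obtaining the target knowledge layer segment, the client may parse the expd box in it to obtain information such as the Expiredate of the knowledge layer segment. The client may then take the moment indicated by Expiredate (that is, the expiration time T5) as the deletion time of the knowledge layer segment. In specific implementations, the at least two sequence layer segments that depend on the target knowledge layer segment are decoded before the expiration time of the target knowledge layer segment.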
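Using the expd box to decide when to delete can be sketched as follows. The sketch assumes the usual FullBox header layout (32-bit size, 4-character type, 8-bit version, 24-bit flags, big-endian), assumes that timescale is present, and assumes that Expiredate is expressed in units of timescale; the text does not state the unit, and the function names are hypothetical.

```python
import struct


def parse_expd(box: bytes) -> float:
    """Return the expiration time T5 in seconds from an 'expd' box."""
    if len(box) < 20:
        raise ValueError("expd box too short")
    _size, box_type, _version_flags, timescale, expire_date = struct.unpack(
        ">I4sIII", box[:20]
    )
    if box_type != b"expd":
        raise ValueError("not an expd box")
    if timescale == 0:
        raise ValueError("timescale must be non-zero")
    return expire_date / timescale  # assumed unit: timescale ticks


def should_delete(box: bytes, now_seconds: float) -> bool:
    """The deletion time equals T5, so delete once the current time reaches it."""
    return now_seconds >= parse_expd(box)
```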
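In the above implementation, the client obtains the expiration time of each knowledge layer segment and takes it as that segment's deletion time. This is simple to operate, improves the precision with which knowledge layer segments are managed, further reduces wasted client bandwidth, reduces the memory the client wastes on managing knowledge layer segments, and makes the management of knowledge layer segments more widely applicable.
It should be noted that, in specific implementations, the client may choose any one or more of the above implementations according to actual application requirements, which is not limited here.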
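S104: The client deletes the target knowledge layer segment at its deletion time.
In some feasible implementations, after determining the deletion time of the knowledge layer segment according to any of the above implementations, the client may delete the knowledge layer segment at that time, reducing wasted storage space on the client.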
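Step S104 can be sketched as a purge over the client's local storage. The storage layout (a dictionary from segment id to a pair of data and deletion time) and the function name `purge_expired` are assumptions made for illustration; a real client might instead schedule a timer per segment.

```python
from typing import Dict, List, Tuple


def purge_expired(store: Dict[str, Tuple[bytes, float]], now: float) -> List[str]:
    """Delete every segment whose deletion time has been reached; return their ids."""
    expired = [seg_id for seg_id, (_data, t_delete) in store.items() if now >= t_delete]
    for seg_id in expired:
        del store[seg_id]
    return expired
```

Calling `purge_expired` periodically, or whenever a new segment is about to be decoded, keeps only the knowledge layer segments that are still within their preset effective time.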
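In the embodiments of the present invention, the client may determine the preset effective time of the target knowledge layer segment in the knowledge layer bitstream from the initialization segment or the MPD of the bitstream, or from information carried in the target knowledge layer segment itself. The client may then determine from the preset effective time the last effective moment of the target knowledge layer segment, that is, its deletion time, and delete the target knowledge layer segment at that time; until then, the segment is kept in the client's local storage. By managing the target knowledge layer segment according to its preset effective time, the client ensures that, within that time, whenever decoding a non-knowledge-layer segment (that is, a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage instead of requesting it again. Locally managed knowledge layer segments are thus reused, which avoids repeated downloads and the resulting bandwidth waste. Deleting a knowledge layer segment at its deletion time also reduces the local storage space it occupies on the client.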
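Refer to FIG. 6, which is a schematic structural diagram of the apparatus for processing bitstream data provided by an embodiment of the present invention. The processing apparatus includes:
an obtaining unit 61, configured to obtain the management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment contained in the bitstream, the target knowledge layer segment is depended on by at least two non-consecutive sequence layer segments contained in the bitstream, and the management data is used to determine a preset effective time;
a parsing unit 62, configured to parse the management data obtained by the obtaining unit 61 to obtain the preset effective time of the target knowledge layer segment, where the at least two non-consecutive sequence layer segments are decoded within the preset effective time;
a determining unit 63, configured to determine the deletion time of the target knowledge layer segment according to the preset effective time obtained by the parsing unit 62; and
a deleting unit 64, configured to delete the target knowledge layer segment at the deletion time determined by the determining unit 63.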
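In some feasible implementations, the management data of the target knowledge layer segment is the initialization segment of the bitstream or the media presentation description (MPD) of the bitstream;
the parsing unit 62 is specifically configured to:
parse the management data of the target knowledge layer segment obtained by the obtaining unit 61, and obtain the effective duration of the target knowledge layer segment carried in the initialization segment, or the effective duration of the target knowledge layer segment described in the MPD, as the preset effective time L of the target knowledge layer segment;
the determining unit 63 is specifically configured to:
obtain the moment T1 at which the target knowledge layer segment is depended on by a target sequence layer segment, where the target sequence layer segment is the first of the at least two non-consecutive sequence layer segments; and
calculate T2 using the L obtained by the parsing unit 62, where T2=T1+L, and determine T2 as the deletion time of the target knowledge layer segment.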
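In some feasible implementations, the management data of the target knowledge layer segment is the segment information of the target knowledge layer segment;
the parsing unit 62 is specifically configured to:
parse the management data of the target knowledge layer segment obtained by the obtaining unit 61, and obtain the effective start time T3 and the effective duration L1 of the target knowledge layer segment carried in its segment information, where T3 and L1 serve as the preset effective time of the target knowledge layer segment;
the determining unit 63 is specifically configured to:
calculate T4 from the T3 and L1 obtained by the parsing unit 62, where T4=T3+L1, and determine T4 as the deletion time of the knowledge layer segment.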
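In some feasible implementations, the management data of the target knowledge layer segment is the segment information of the target knowledge layer segment;
the parsing unit 62 is specifically configured to:
parse the management data of the target knowledge layer segment obtained by the obtaining unit 61, and obtain the expiration time T5 of the target knowledge layer segment carried in its segment information, where T5 serves as the end of the preset effective time of the target knowledge layer segment;
the determining unit 63 is specifically configured to:
determine the T5 obtained by the parsing unit 62 as the deletion time of the knowledge layer segment.
In some feasible implementations, the at least two non-consecutive sequence layer segments are decoded before the T5 obtained by the parsing unit 62.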
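In specific implementations, the apparatus for processing bitstream data provided by the embodiments of the present invention may be the client described in the above embodiments. The obtaining unit 61, parsing unit 62, determining unit 63, and deleting unit 64 of the processing apparatus may be functional modules of the client, for example the HTTP streaming request control module of an HTTP streaming client; the choice depends on the requirements of the actual application scenario and is not described again here. Through its built-in units, the processing apparatus can perform the implementations carried out by the client in the above method for processing bitstream data, which are not described again here.
In the embodiments of the present invention, the client may determine the preset effective time of the target knowledge layer segment in the knowledge layer bitstream from the initialization segment or the MPD of the bitstream, or from information carried in the target knowledge layer segment itself. The client may then determine from the preset effective time the last effective moment of the target knowledge layer segment, that is, its deletion time, and delete the target knowledge layer segment at that time; until then, the segment is kept in the client's local storage. By managing the target knowledge layer segment according to its preset effective time, the client ensures that, within that time, whenever decoding a non-knowledge-layer segment (that is, a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage instead of requesting it again. Locally managed knowledge layer segments are thus reused, which avoids repeated downloads and the resulting bandwidth waste. Deleting a knowledge layer segment at its deletion time also reduces the local storage space it occupies on the client.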
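Refer to FIG. 7, which is a schematic structural diagram of the client provided by an embodiment of the present invention. The client may include a memory 71 and a processor 72, which are connected to each other.
The memory 71 is configured to store a set of program code.
The processor 72 is configured to invoke the program code stored in the memory 71 to perform the following operations:
obtaining the management data of a target knowledge layer segment, where the target knowledge layer segment is one of at least one knowledge layer segment contained in the bitstream, the target knowledge layer segment is depended on by at least two non-consecutive sequence layer segments contained in the bitstream, and the management data is used to determine a preset effective time;
parsing the management data to obtain the preset effective time of the target knowledge layer segment, where the at least two non-consecutive sequence layer segments are decoded within the preset effective time;
determining the deletion time of the target knowledge layer segment according to its preset effective time; and
deleting the target knowledge layer segment at its deletion time.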
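In some feasible implementations, the management data of the target knowledge layer segment is the initialization segment of the bitstream or the media presentation description (MPD) of the bitstream;
the processor 72 is specifically configured to:
obtain the effective duration of the target knowledge layer segment carried in the initialization segment, or the effective duration of the target knowledge layer segment described in the MPD, as the preset effective time L of the target knowledge layer segment;
obtain the moment T1 at which the target knowledge layer segment is depended on by a target sequence layer segment, where the target sequence layer segment is the first of the at least two non-consecutive sequence layer segments; and
calculate T2, where T2=T1+L, and determine T2 as the deletion time of the target knowledge layer segment.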
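In some feasible implementations, the management data of the target knowledge layer segment is the segment information of the target knowledge layer segment;
the processor 72 is specifically configured to:
obtain the effective start time T3 and the effective duration L1 of the target knowledge layer segment carried in its segment information, where T3 and L1 serve as the preset effective time of the target knowledge layer segment; and
calculate T4, where T4=T3+L1, and determine T4 as the deletion time of the knowledge layer segment.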
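In some feasible implementations, the management data of the target knowledge layer segment is the segment information of the target knowledge layer segment;
the processor 72 is specifically configured to:
obtain the expiration time T5 of the target knowledge layer segment carried in its segment information, where T5 serves as the end of the preset effective time of the target knowledge layer segment; and
determine T5 as the deletion time of the knowledge layer segment.
In some feasible implementations, the at least two non-consecutive sequence layer segments are decoded before T5.
In specific implementations, the client may, through its processor 72, perform the implementations carried out by the client in the method for processing bitstream data provided by the above embodiments, which are not described again here.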
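In the embodiments of the present invention, the client may determine the preset effective time of the target knowledge layer segment in the knowledge layer bitstream from the initialization segment or the MPD of the bitstream, or from information carried in the target knowledge layer segment itself. The client may then determine from the preset effective time the last effective moment of the target knowledge layer segment, that is, its deletion time, and delete the target knowledge layer segment at that time; until then, the segment is kept in the client's local storage. By managing the target knowledge layer segment according to its preset effective time, the client ensures that, within that time, whenever decoding a non-knowledge-layer segment (that is, a sequence layer segment) requires the target knowledge layer segment, the client can first look it up in local storage instead of requesting it again. Locally managed knowledge layer segments are thus reused, which avoids repeated downloads and the resulting bandwidth waste. Deleting a knowledge layer segment at its deletion time also reduces the local storage space it occupies on the client.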
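The terms "first", "second", "third", "fourth", and so on in the specification, claims, and drawings of the present invention are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, system, product, or device.
A person of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of the rights of the present invention; equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope covered by the present invention.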

Claims (10)

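  1. A method for processing bitstream data, characterized by comprising:
    obtaining, by a client, management data of a target knowledge layer segment, wherein the target knowledge layer segment is one of at least one knowledge layer segment contained in a bitstream, the target knowledge layer segment is depended on by at least two non-consecutive sequence layer segments contained in the bitstream, and the management data is used to determine a preset effective time;
    parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment, wherein the at least two non-consecutive sequence layer segments are decoded within the preset effective time;
    determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment; and
    deleting, by the client, the target knowledge layer segment at the deletion time of the target knowledge layer segment.
  2. The method according to claim 1, characterized in that the management data of the target knowledge layer segment is an initialization segment of the bitstream or a media presentation description (MPD) of the bitstream;
    the parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment comprises:
    obtaining, by the client, the effective duration of the target knowledge layer segment carried in the initialization segment, or the effective duration of the target knowledge layer segment described in the MPD, as the preset effective time L of the target knowledge layer segment;
    the determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment comprises:
    obtaining, by the client, the moment T1 at which the target knowledge layer segment is depended on by a target sequence layer segment, wherein the target sequence layer segment is the first of the at least two non-consecutive sequence layer segments; and
    calculating T2, wherein T2=T1+L, and determining T2 as the deletion time of the target knowledge layer segment.
  3. The method according to claim 1, characterized in that the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
    the parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment comprises:
    obtaining, by the client, the effective start time T3 of the target knowledge layer segment and the effective duration L1 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, wherein T3 and L1 serve as the preset effective time of the target knowledge layer segment;
    the determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment comprises:
    calculating, by the client, T4, wherein T4=T3+L1, and determining T4 as the deletion time of the knowledge layer segment.
  4. The method according to claim 1, characterized in that the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
    the parsing, by the client, the management data to obtain the preset effective time of the target knowledge layer segment comprises:
    obtaining, by the client, the expiration time T5 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, wherein T5 serves as the end of the preset effective time of the target knowledge layer segment;
    the determining, by the client, a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment comprises:
    determining, by the client, T5 as the deletion time of the knowledge layer segment.
  5. The method according to claim 4, characterized in that the at least two non-consecutive sequence layer segments are decoded before T5.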
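  6. An apparatus for processing bitstream data, characterized by comprising:
    an obtaining unit, configured to obtain management data of a target knowledge layer segment, wherein the target knowledge layer segment is one of at least one knowledge layer segment contained in a bitstream, the target knowledge layer segment is depended on by at least two non-consecutive sequence layer segments contained in the bitstream, and the management data is used to determine a preset effective time;
    a parsing unit, configured to parse the management data obtained by the obtaining unit to obtain the preset effective time of the target knowledge layer segment, wherein the at least two non-consecutive sequence layer segments are decoded within the preset effective time;
    a determining unit, configured to determine a deletion time of the target knowledge layer segment according to the preset effective time of the target knowledge layer segment obtained by the parsing unit; and
    a deleting unit, configured to delete the target knowledge layer segment at the deletion time of the target knowledge layer segment determined by the determining unit.
  7. The processing apparatus according to claim 6, characterized in that the management data of the target knowledge layer segment is an initialization segment of the bitstream or a media presentation description (MPD) of the bitstream;
    the parsing unit is specifically configured to:
    parse the management data of the target knowledge layer segment obtained by the obtaining unit, and obtain the effective duration of the target knowledge layer segment carried in the initialization segment, or the effective duration of the target knowledge layer segment described in the MPD, as the preset effective time L of the target knowledge layer segment;
    the determining unit is specifically configured to:
    obtain the moment T1 at which the target knowledge layer segment is depended on by a target sequence layer segment, wherein the target sequence layer segment is the first of the at least two non-consecutive sequence layer segments; and
    calculate T2 using the L obtained by the parsing unit, wherein T2=T1+L, and determine T2 as the deletion time of the target knowledge layer segment.
  8. The processing apparatus according to claim 6, characterized in that the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
    the parsing unit is specifically configured to:
    parse the management data of the target knowledge layer segment obtained by the obtaining unit, and obtain the effective start time T3 of the target knowledge layer segment and the effective duration L1 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, wherein T3 and L1 serve as the preset effective time of the target knowledge layer segment;
    the determining unit is specifically configured to:
    calculate T4 from the T3 and L1 obtained by the parsing unit, wherein T4=T3+L1, and determine T4 as the deletion time of the knowledge layer segment.
  9. The processing apparatus according to claim 6, characterized in that the management data of the target knowledge layer segment is segment information of the target knowledge layer segment;
    the parsing unit is specifically configured to:
    parse the management data of the target knowledge layer segment obtained by the obtaining unit, and obtain the expiration time T5 of the target knowledge layer segment carried in the segment information of the target knowledge layer segment, wherein T5 serves as the end of the preset effective time of the target knowledge layer segment;
    the determining unit is specifically configured to:
    determine the T5 obtained by the parsing unit as the deletion time of the knowledge layer segment.
  10. The processing apparatus according to claim 9, characterized in that the at least two non-consecutive sequence layer segments are decoded before the T5 obtained by the parsing unit.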
PCT/CN2017/073623 2016-07-18 2017-02-15 一种码流数据的处理方法及装置 WO2018014545A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610567461.6 2016-07-18
CN201610567461.6A CN107634928B (zh) 2016-07-18 2016-07-18 一种码流数据的处理方法及装置

Publications (1)

Publication Number Publication Date
WO2018014545A1 true WO2018014545A1 (zh) 2018-01-25

Family

ID=60991728

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/073623 WO2018014545A1 (zh) 2016-07-18 2017-02-15 一种码流数据的处理方法及装置

Country Status (2)

Country Link
CN (1) CN107634928B (zh)
WO (1) WO2018014545A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110876083A (zh) * 2018-08-29 2020-03-10 浙江大学 指定参考图像的方法及装置及处理参考图像请求的方法及装置

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11716505B2 (en) 2018-08-29 2023-08-01 Zhejiang University Methods and apparatus for media data processing and transmitting and reference picture specifying
WO2020156054A1 (zh) * 2019-02-03 2020-08-06 华为技术有限公司 视频解码方法、视频编码方法、装置、设备及存储介质
CN115396691A (zh) * 2021-05-21 2022-11-25 北京金山云网络技术有限公司 一种数据流处理方法、装置及电子设备

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101529914A (zh) * 2006-10-24 2009-09-09 汤姆逊许可证公司 用于多视角视频编码的图像管理
CN104053012A (zh) * 2014-05-28 2014-09-17 北京大学深圳研究生院 一种基于字典库的视频编解码方法及装置
CN104768011A (zh) * 2015-03-31 2015-07-08 浙江大学 图像编解码方法和相关装置
WO2015140391A1 (en) * 2014-03-17 2015-09-24 Nokia Technologies Oy Method and apparatus for video coding and decoding

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20000059799A (ko) * 1999-03-09 2000-10-05 구자홍 웨이브릿 부호화를 이용한 움직임 보상 부호화 장치 및 방법
US9386064B2 (en) * 2006-06-09 2016-07-05 Qualcomm Incorporated Enhanced block-request streaming using URL templates and construction rules
US9615119B2 (en) * 2010-04-02 2017-04-04 Samsung Electronics Co., Ltd. Method and apparatus for providing timeshift service in digital broadcasting system and system thereof
CN104902279B (zh) * 2015-05-25 2018-11-13 浙江大学 一种视频处理方法及装置

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110876083A (zh) * 2018-08-29 2020-03-10 浙江大学 指定参考图像的方法及装置及处理参考图像请求的方法及装置
CN110876083B (zh) * 2018-08-29 2021-09-21 浙江大学 指定参考图像的方法及装置及处理参考图像请求的方法及装置

Also Published As

Publication number Publication date
CN107634928A (zh) 2018-01-26
CN107634928B (zh) 2020-10-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17830194

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17830194

Country of ref document: EP

Kind code of ref document: A1