WO2018014546A1 - Method and device for processing video data

Info

Publication number: WO2018014546A1
Application number: PCT/CN2017/073662
Authority: WO
Inventors: 虞露; 于化龙; 赵寅; 杨海涛
Original assignee: 华为技术有限公司
Priority date: 2016-07-18
Filing date: 2017-02-15
Publication date: 2018-01-25
Also published as: CN107635142A; CN107635142B

Abstract

Embodiments of the present invention disclose a method and device for processing video data. The method comprises: a server acquiring segment information of each sequence layer segment of all sequence layer segments in a bitstream; determining N sequence layer segments and a first target knowledge layer segment according to the segment information of the each sequence layer segment; acquiring segment information of the first target knowledge layer segment; adding, according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments, expansion time period information of the first target knowledge layer segment to a media presentation description (MPD) of the bitstream; and transmitting the MPD of the bitstream to a client. By adopting the embodiments of the present invention, advantages of preventing repeated transmission of video data, saving data transmission bandwidth and enhancing adaptability of video data processing are achieved.

Description

Method and device for processing video data

Technical field

The present invention relates to the field of communications technologies, and in particular, to a method and an apparatus for processing video data.

Background

In traditional video coding, random access points are inserted into the encoded video so that it supports random access. The random access points divide the video into a plurality of video segments that each support random access, referred to as random access segments. In the conventional technique, an image in a random access segment can serve as a reference picture (reference frame) only for other images in the same random access segment; inter prediction across random access points is not allowed, which greatly limits the efficiency of video encoding and decoding.

To exploit the correlation between images in different random access segments, when encoding (or decoding) an image, the encoder (or decoder) can select from a database an image whose texture content is similar to that of the current image and use it as a reference image. Such a reference image is referred to as a knowledge base image, the database storing the set of such reference images is referred to as a knowledge base, and a method of encoding and decoding in which at least one image in the video refers to at least one knowledge base image is called knowledge base-based video coding (library-based video coding, LBVC). Encoding a video sequence using LBVC produces a knowledge layer code stream, which contains the coded knowledge base images, and a sequence layer code stream, which contains the frames of the video sequence coded with reference to the knowledge base images. The two code streams are similar to the base layer code stream and the enhancement layer code stream generated by scalable video coding (SVC), respectively; that is, the sequence layer code stream depends on the knowledge layer code stream. However, the dependency relationship in the LBVC dual code stream organization differs from that in the SVC hierarchical code stream organization. In the LBVC dual code stream, the knowledge layer code stream is segmented into multiple knowledge layer segments, and the sequence layer code stream is segmented into multiple sequence layer segments. Multiple discontinuous sequence layer segments in the sequence layer code stream may refer to the same knowledge layer segment. The client must load a knowledge layer segment before decoding the sequence layer segments that depend on it, so when decoding several such sequence layer segments the client needs to load the knowledge layer segment only once, without loading it repeatedly.

The existing system layer transmission scheme, based on dynamic adaptive streaming over HTTP (DASH), transmits the video data generated by LBVC by treating the knowledge layer code stream and the sequence layer code stream as a base layer code stream and an enhancement layer code stream, respectively. This scheme cannot distinguish a knowledge layer segment on which multiple discontinuous sequence layer segments depend from a knowledge layer segment on which only one sequence layer segment depends, and therefore cannot inform the client which knowledge layer segments are depended on by multiple sequence layer segments. As a result, such knowledge layer segments are inevitably loaded and transmitted multiple times, which wastes transmission bandwidth and limits applicability.

Summary of the invention

The application provides a method and a device for processing video data, which can avoid repeated transmission of video data, save bandwidth of data transmission, and improve applicability of video data processing.

The first aspect provides a method for processing video data, which may include:

The server acquires segment information of each sequence layer segment among all sequence layer segments in the code stream, where the segment information describes a dependency relationship between a sequence layer segment and a knowledge layer segment in the code stream;

Determining N sequence layer segments and a first target knowledge layer segment according to the segment information of each sequence layer segment, wherein the N sequence layer segments depend on the first target knowledge layer segment, the N sequence layer segments include at least two discontinuous sequence layer segments, and the first target knowledge layer segment is one of at least one knowledge layer segment included in the code stream;

Obtaining segment information of the first target knowledge layer segment;

Adding extended period information of the first target knowledge layer segment to the media presentation description (MPD) of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments, wherein the N sequence layer segments are encoded within a period indicated by the extended period information;

Sending the MPD of the code stream to the client.

In the present application, the server may determine a knowledge layer segment on which at least two discontinuous sequence layer segments depend as a target knowledge layer segment, and add extended period information to the MPD of the code stream to mark an extended period of the target knowledge layer segment. The client uses this information to distinguish target knowledge layer segments from non-target knowledge layer segments, which avoids repeated loading and transmission of the target knowledge layer segment, saves data transmission bandwidth, and enhances the applicability of the video data processing.
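As a rough illustration of marking a target knowledge layer segment in an MPD, the sketch below builds a toy manifest and attaches extended period information to one knowledge layer segment. The simplified element layout, the attribute name `extendedPeriod`, and the "start-end" value format are all hypothetical; they are not taken from the source or from the DASH schema. For brevity the sketch folds all periods into one attribute rather than adding one identifier per segment description.

```python
import xml.etree.ElementTree as ET


def build_mpd():
    """Build a toy manifest with a knowledge layer and a sequence layer.

    The layout is a simplification chosen for this sketch only.
    """
    root = ET.Element("MPD")
    knowledge = ET.SubElement(root, "Representation", id="knowledge")
    ET.SubElement(knowledge, "SegmentURL", media="k1.mp4")
    ET.SubElement(knowledge, "SegmentURL", media="k2.mp4")
    sequence = ET.SubElement(root, "Representation", id="sequence")
    ET.SubElement(sequence, "SegmentURL", media="s0.mp4")
    return root


def mark_target_segment(root, layer_id, media, periods):
    """Attach a hypothetical extendedPeriod attribute to one segment entry."""
    value = " ".join(f"{start:g}-{end:g}" for start, end in periods)
    for layer in root.iter("Representation"):
        if layer.get("id") != layer_id:
            continue
        for segment in layer.iter("SegmentURL"):
            if segment.get("media") == media:
                segment.set("extendedPeriod", value)
    return root


# Mark k1.mp4 as needed during two separate periods (in seconds).
mpd = mark_target_segment(build_mpd(), "knowledge", "k1.mp4", [(0, 4), (8, 10)])
marked = {s.get("media"): s.get("extendedPeriod") for s in mpd.iter("SegmentURL")}
```

Segments without the attribute are treated as non-target segments in this sketch.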

With reference to the first aspect, in a first possible implementation, the determining N sequence layer segments and a first target knowledge layer segment according to the segment information of each sequence layer segment includes:

Determining, according to the identifier of the knowledge layer segment included in the segment information of each sequence layer segment, a knowledge layer segment on which each sequence layer segment depends;

Determining a first target knowledge layer segment and determining N sequence layer segments that depend on the first target knowledge layer segment.

The application can determine the knowledge layer segment on which each sequence layer segment depends according to the identifier carried in the segment information of the sequence layer segment, and can further determine the first target knowledge layer segment and the N sequence layer segments that depend on it, thereby improving the accuracy of determining the dependency between knowledge layer segments and sequence layer segments and enhancing the processing efficiency of the video data.
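The determination described above can be sketched as follows. This is a minimal illustration, assuming each sequence layer segment carries the identifier of at most one knowledge layer segment and that sequence layer segments are numbered consecutively in playback order; the function and variable names are hypothetical.

```python
from collections import defaultdict


def find_target_knowledge_segments(seq_segments):
    """Return {knowledge_id: [sequence indices]} for target knowledge segments.

    seq_segments is a list of (sequence_index, knowledge_id or None) pairs.
    A knowledge segment is a target if at least two of its dependent
    sequence segments are not adjacent in the stream.
    """
    dependents = defaultdict(list)
    for index, knowledge_id in seq_segments:
        if knowledge_id is not None:
            dependents[knowledge_id].append(index)

    targets = {}
    for knowledge_id, indices in dependents.items():
        indices.sort()
        # A gap larger than 1 between neighbours means discontinuous dependents.
        if any(b - a > 1 for a, b in zip(indices, indices[1:])):
            targets[knowledge_id] = indices
    return targets


segments = [(0, "K1"), (1, "K1"), (2, "K2"), (3, "K2"), (4, "K1")]
targets = find_target_knowledge_segments(segments)
```

Here `K1` is referenced by segments 0, 1 and 4, which are discontinuous, so it is a target; `K2` is referenced only by the adjacent segments 2 and 3, so it is not.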

With reference to the first possible implementation of the first aspect, in a second possible implementation, the N sequence layer segments include at least two groups of sequence layer segments, and the at least two groups include at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The first sequence layer segment group includes N1 sequence layer segments, the second sequence layer segment group includes N2 sequence layer segments, the N1 sequence layer segments and the N2 sequence layer segments are discontinuous, and N1 + N2 <= N;

If N1 > 1, the N1 sequence layer segments are consecutive sequence layer segments; if N2 > 1, the N2 sequence layer segments are consecutive sequence layer segments;

The MPD of the code stream includes at least two description layers, a first description layer of the at least two description layers describes a first target knowledge layer segment, and a second description layer describes a sequence layer segment;

The adding the extended period information of the first target knowledge layer segment to the MPD of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments includes:

Adding first extended period information to the first segment description, corresponding to the first time period, included in the first description layer, and adding second extended period information to the second segment description, corresponding to the second time period, included in the first description layer.

The present application can determine the temporally consecutive and temporally discontinuous sequence layer segment groups included in the N sequence layer segments that depend on the first target knowledge layer segment, and determine the sequence layer segment groups in different time periods. Further, in combination with the time-ordered way in which the description layers of the MPD describe the knowledge layer segments, extended period information corresponding to each time period is added to the segment description corresponding to that time period, which improves the applicability of the extended period information and of the video data processing.
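The grouping into consecutive runs, one per time period, can be sketched as follows. The sketch assumes sequence layer segments of equal duration, indexed from 0; both assumptions and all names are illustrative and not stated in the source.

```python
def group_consecutive(indices):
    """Split sequence segment indices into runs of consecutive indices."""
    groups = []
    for i in sorted(indices):
        if groups and i == groups[-1][-1] + 1:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


def extended_periods(indices, segment_duration):
    """Map each consecutive group to a (start, end) time period in seconds."""
    return [
        (group[0] * segment_duration, (group[-1] + 1) * segment_duration)
        for group in group_consecutive(indices)
    ]


groups = group_consecutive([0, 1, 4])
periods = extended_periods([0, 1, 4], 2.0)
```

With segments 0, 1 and 4 depending on the same knowledge layer segment, the first group (N1 = 2) covers the first time period and the second group (N2 = 1) covers the second.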

With reference to the second possible implementation manner of the first aspect, in a third possible implementation manner, the first extended period information and the second extended period information are both first extended identifiers;

The adding extended period information of the first target knowledge layer segment in the MPD of the code stream includes:

Adding a first extended identifier to the segment information of the first target knowledge layer segment included in the first segment description, and adding the first extended identifier to the segment information of the first target knowledge layer segment included in the second segment description.

The application can mark the target knowledge layer segment by adding the first extended identifier to the segment information of the knowledge layer segment, improve the accuracy of the mark of the target knowledge layer segment, and improve the recognition efficiency of the knowledge layer segment.

With reference to the second possible implementation of the first aspect, in a fourth possible implementation, the first extended period information and the second extended period information are both second extended identifiers;

Adding, in the description layer attribute information of the first description layer, a second extension identifier corresponding to the first period and a second extension identifier corresponding to the second period.

The present application adds a second extended identifier to the description layer attribute information of the description layer that describes the knowledge layer segment in order to mark the target knowledge layer segment, which makes marking the target knowledge layer segment more convenient and enhances the applicability of adding the mark.

With reference to the second possible implementation of the first aspect, in a fifth possible implementation, the method further includes:

If the first sequence layer segment group further depends on the second target knowledge layer segment, the MPD of the code stream further includes a third description layer, and the third description layer describes the second target knowledge layer segment.

When the N sequence layer segments include a sequence layer segment that depends on the second target knowledge layer segment, the present application may describe the second target knowledge layer segment by using the third description layer, which enhances the accuracy of describing the target knowledge layer segments and improves the applicability of the video data processing.

With reference to the fifth possible implementation manner of the first aspect, in a sixth possible implementation, the method further includes:

Adding, in the third segment description corresponding to the first time period included in the third description layer, third extended period information, where the third extended period information is a first extended identifier; or

The third extended period information is added to the description layer attribute information of the third description layer, where the third extended period information is a second extended identifier.

The application may add a mark of the second target knowledge layer segment in the segment description included in the third description layer, or in the description layer attribute information of the third description layer, thereby diversifying the ways of marking the target knowledge layer segment, making the marking more convenient, and enhancing the applicability of adding the mark.

With reference to the third possible implementation of the first aspect, or the fourth possible implementation of the first aspect, or the sixth possible implementation of the first aspect, in a seventh possible implementation, the first extended identifier or the second extended identifier is a first character string;

The first character string is used to describe that the first target knowledge layer segment or the second target knowledge layer segment has an extended time period of fixed time length;

The length of time of the extended period is a value corresponding to the first string.

The application can mark the target knowledge layer segment with the extended time period of the fixed time length by using the first character string, and the operation is simple, and the processing efficiency of the video data is improved.

With reference to the third possible implementation of the first aspect, or the fourth possible implementation of the first aspect, or the sixth possible implementation of the first aspect, in an eighth possible implementation, the first extended identifier or the second extended identifier is a second character string;

The second character string is used to describe an extended time period of a variable time length possessed by the first target knowledge layer segment or the second target knowledge layer segment;

The length of time of the extended period is determined by fragment information of the target knowledge layer segment included in the MPD.

The application can use the second character string to mark a target knowledge layer segment that has an extended time period of variable time length, where the time length of the extended time period is determined by the segment information included in the MPD. This diversifies the forms of marking the target knowledge layer segment and enhances the applicability of the video data processing.

The second aspect provides a method for processing video data, which may include:

The client parses the media presentation description (MPD) of the code stream sent by the server and determines the extended period information carried in the MPD, where the extended period information is used to determine a dependent time period of the target knowledge layer segment included in the code stream, the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, and N sequence layer segments in the code stream depend on the target knowledge layer segment;

Determining a target knowledge layer segment according to the extended time period information, and determining a dependent time period of the target knowledge layer segment, the N sequence layer segments being encoded within a dependent time period of the target knowledge layer segment;

Obtaining a network storage address of the target knowledge layer segment from the MPD of the code stream, and recording a dependent time period and a network storage address of the target knowledge layer segment;

Determining, when the video on demand request is received, whether the on-demand time carried in the video on demand request is included in the dependent time period of the target knowledge layer segment;

If the dependent time period of the target knowledge layer segment includes the on-demand time, checking a storage state of the target knowledge layer segment in a storage space of the client, and determining, according to the storage state, how to acquire the target knowledge layer segment.

The client of the application can obtain the extended period information included in the MPD by parsing the MPD of the code stream sent by the server, determine the dependent time period of the target knowledge layer segment, and store the dependent time period together with a storage state flag indicating whether the target knowledge layer segment is stored in the client. Further, when receiving a video-on-demand request from the user, the client may search for the dependent time period that includes the on-demand time carried in the request, and then determine the target knowledge layer segment corresponding to that dependent time period and its storage state. The client can determine whether to request the target knowledge layer segment from the server according to its storage state, which avoids loading and storing the same knowledge layer segment multiple times, saves data transmission bandwidth, and improves the processing efficiency of the code stream data.

With reference to the second aspect, in a first possible implementation, the N sequence layer segments include at least two groups of sequence layer segments, and the at least two groups include at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The extended period information includes first extended period information corresponding to the first period and second extended period information corresponding to the second period;

The first extended period information is used to determine a first extended period of the dependent time period of the target knowledge layer segment, and the second extended period information is used to determine a second extended period of the dependent time period of the target knowledge layer segment.

In the present application, the client may parse the MPD of the code stream to acquire the first extended period information and the second extended period information, and may further determine the first extended period corresponding to the first extended period information and the second extended period corresponding to the second extended period information, which improves the efficiency of determining the extended periods of the target knowledge layer segment.

With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner, the first extended period information and the second extended period information are first extended identifiers;

The client parsing the media presentation description (MPD) of the code stream sent by the server and determining the extended period information carried in the MPD includes:

The client parses the MPD, and acquires a first extended identifier that is included in the fragment information that is described by the description layer in the MPD;

Determining, according to the extended period information, the target knowledge layer segment includes:

Determining a segment corresponding to the segment information carrying the first extended identifier as a target knowledge layer segment;

The segment information includes first segment information corresponding to the first time period and second segment information corresponding to the second time period, where the first segment information carries the first extended period information, and the second segment information carries the second extended period information.

In the present application, the client may determine the target knowledge layer segment according to the first extended identifier, which may improve the recognition efficiency of the target knowledge layer segment, thereby improving the processing efficiency of the video data.

With reference to the first possible implementation manner of the second aspect, in a third possible implementation manner, the first extended period information and the second extended period information are second extended identifiers;

The client parses the MPD, and acquires a second extended identifier included in the description layer attribute information of the description layer included in the MPD;

Determining a segment described by the description layer carrying the second extended identifier as a target knowledge layer segment;

The first extended period information and the second extended period information are included in the description layer attribute information, and the first segment information and the second segment information each carry a second extended identifier.

In the present application, the client may determine the target knowledge layer segment according to the second extended identifier, which may improve the recognition efficiency of the target knowledge layer segment, thereby improving the processing efficiency of the video data.

With reference to the second possible implementation of the second aspect or the third possible implementation of the second aspect, in a fourth possible implementation, the determining a dependent time period of the target knowledge layer segment includes:

Determining, according to the first extended period information, a first extended period of the target knowledge layer segment, and determining a second extended period of the target knowledge layer segment according to the second extended period information;

Using the union of the first extended period and the second extended period as the dependent time period of the target knowledge layer segment.

In the present application, the client may determine the extended periods according to the extended period information carried in the MPD of the code stream, determine the dependent time period of the target knowledge layer segment from the determined extended periods, and mark the target knowledge layer segment with its dependent time period. This makes the target knowledge layer segment easier to identify and easier to manage.
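Taking the union of the extended periods and testing whether an on-demand time falls inside it can be sketched as follows. The representation of a period as a half-open `(start, end)` pair in seconds is an assumption made for this illustration, as are the function names.

```python
def dependent_period(*extended_periods):
    """Union of (start, end) extended periods as sorted, disjoint intervals."""
    merged = []
    for start, end in sorted(extended_periods):
        if merged and start <= merged[-1][1]:
            # Overlapping or touching intervals are merged into one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def contains(period, on_demand_time):
    """True if the on-demand time lies inside any interval of the period."""
    return any(start <= on_demand_time < end for start, end in period)


period = dependent_period((8.0, 10.0), (0.0, 4.0))
inside = contains(period, 9.0)
outside = contains(period, 5.0)
```

An on-demand time of 9.0 s falls in the second extended period, so the target knowledge layer segment is needed; 5.0 s falls in neither period.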

With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation, the recording the dependent time period and the network storage address of the target knowledge layer segment includes:

Generating a knowledge layer segment list according to the network storage address of the target knowledge layer segment, and recording a dependent time period of the target knowledge layer segment in the knowledge layer segment list;

The method further includes:

Adding, in the knowledge layer segment list, a storage state flag of the target knowledge layer segment, to indicate whether the target knowledge layer segment is already in the storage space of the client;

The viewing the storage state of the target knowledge layer segment in the storage space of the client includes:

Viewing a storage status flag of the target knowledge layer segment in the knowledge layer segment list according to a network storage address of the target knowledge layer segment;

If the storage status flag is true, determining that the storage state of the target knowledge layer segment in the storage space is not empty, otherwise it is empty;

The determining, according to the storage state, how to acquire the target knowledge layer segment includes:

And if the storage state is not empty, acquiring the target knowledge layer segment from the storage space, and otherwise sending a request for acquiring the target knowledge layer segment to the server.

In the present application, the client may store the dependent time period of the target knowledge layer segment together with a storage state flag indicating whether the target knowledge layer segment is stored in the client. When receiving an on-demand request, the client may then find, according to the on-demand time carried in the request, the target knowledge layer segment whose dependent time period includes the on-demand time, and determine from the storage state of that segment whether to request it from the server or to acquire it from the storage space. This avoids loading and storing the same knowledge layer segment multiple times, saves data transmission bandwidth, and improves the processing efficiency of the code stream data.
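A minimal sketch of such a knowledge layer segment list is shown below. The class names, the in-memory dictionary used as the storage space, and the `fetch` callback standing in for the request to the server are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeEntry:
    url: str                 # network storage address of the segment
    dependent_period: list   # list of half-open (start, end) intervals
    stored: bool = False     # storage state flag


class KnowledgeSegmentList:
    def __init__(self, fetch):
        self.entries = {}    # url -> KnowledgeEntry
        self.storage = {}    # url -> segment bytes (client storage space)
        self.fetch = fetch   # stand-in for requesting a segment from the server

    def record(self, url, dependent_period):
        self.entries[url] = KnowledgeEntry(url, dependent_period)

    def get_for_time(self, on_demand_time):
        """Return the knowledge segment needed at on_demand_time, or None."""
        for entry in self.entries.values():
            if any(s <= on_demand_time < e for s, e in entry.dependent_period):
                if not entry.stored:
                    # Flag is false: request the segment and update the flag.
                    self.storage[entry.url] = self.fetch(entry.url)
                    entry.stored = True
                return self.storage[entry.url]
        return None


requests = []


def fake_fetch(url):
    """Record the request and return placeholder bytes."""
    requests.append(url)
    return b"data:" + url.encode()


klist = KnowledgeSegmentList(fake_fetch)
klist.record("http://example.com/k1.mp4", [(0.0, 4.0), (8.0, 10.0)])
first = klist.get_for_time(1.0)
second = klist.get_for_time(9.0)
```

Both on-demand times fall within the dependent time period of the same segment, so only the first call issues a request; the second is served from the storage space.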

With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation, after the sending a request for acquiring the target knowledge layer segment to the server, the method further includes:

Receiving the target knowledge layer segment sent by the server;

If the remaining space size of the storage space is not less than the data size of the target knowledge layer segment, storing the target knowledge layer segment into the storage space, and recording the storage state flag of the target knowledge layer segment as true;

If the remaining space size of the storage space is smaller than the data size of the target knowledge layer segment, deleting a specified target knowledge layer segment stored in the storage space, storing the target knowledge layer segment into the storage space, and recording the storage state flag of the target knowledge layer segment as true;

The time interval between the dependent time period of the specified target knowledge layer segment and the on-demand time is greater than a preset time threshold.

In the present application, the client may update the storage state flag recorded in the knowledge layer segment list after receiving or deleting a target knowledge layer segment, thereby improving the accuracy with which the storage state of the target knowledge layer segment, and the target knowledge layer segment itself, is managed.
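The storage and deletion rule can be sketched as follows. This is an illustration under stated assumptions: the storage space is a dictionary of bytes, membership in that dictionary plays the role of the storage state flag, and the choices to evict the farthest segment first and to give up when no segment qualifies for deletion are additions of this sketch, not statements from the source.

```python
def distance_to_period(period, t):
    """Gap between time t and the nearest interval of the period (0 if inside)."""
    return min(
        0.0 if start <= t < end else min(abs(t - start), abs(t - end))
        for start, end in period
    )


def store_segment(storage, periods, capacity, url, data, on_demand_time, threshold):
    """Store data under url, deleting far-away segments if space is short.

    storage: url -> bytes; periods: url -> list of (start, end) intervals.
    Returns True if the segment was stored.
    """
    used = sum(len(v) for v in storage.values())
    if capacity - used < len(data):
        # Only segments whose dependent period is farther than the threshold
        # from the on-demand time may be deleted (farthest first, by assumption).
        candidates = sorted(
            (u for u in storage
             if distance_to_period(periods[u], on_demand_time) > threshold),
            key=lambda u: distance_to_period(periods[u], on_demand_time),
            reverse=True,
        )
        for u in candidates:
            if capacity - used >= len(data):
                break
            used -= len(storage.pop(u))
    if capacity - used < len(data):
        return False
    storage[url] = data
    return True


storage = {"k1": b"x" * 6}
periods = {"k1": [(0.0, 4.0)], "k2": [(100.0, 110.0)]}
ok = store_segment(storage, periods, 10, "k2", b"y" * 6, 100.0, 30.0)
```

With a capacity of 10 bytes and 6 already used, `k2` does not fit; `k1` is 96 s away from the on-demand time, beyond the 30 s threshold, so it is deleted to make room.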

A third aspect provides a processing device for video data, which may include:

An acquiring unit, configured to acquire segment information of each sequence layer segment among all sequence layer segments in the code stream, where the segment information describes a dependency relationship between a sequence layer segment and a knowledge layer segment in the code stream;

A determining unit, configured to determine, according to the segment information of each sequence layer segment acquired by the acquiring unit, N sequence layer segments and a first target knowledge layer segment, wherein the N sequence layer segments depend on the first target knowledge layer segment, the N sequence layer segments include at least two discontinuous sequence layer segments, and the first target knowledge layer segment is one of at least one knowledge layer segment included in the code stream;

The acquiring unit is further configured to acquire segment information of the first target knowledge layer segment determined by the determining unit;

An adding unit, configured to add extended period information of the first target knowledge layer segment to the media presentation description (MPD) of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments acquired by the acquiring unit, the N sequence layer segments being encoded within a period indicated by the extended period information;

A sending unit, configured to send the MPD of the code stream obtained by the adding unit to the client.

With reference to the third aspect, in a first possible implementation manner, the determining unit is specifically configured to:

Determining, according to the identifier of the knowledge layer segment included in the segment information of each sequence layer segment acquired by the acquiring unit, a knowledge layer segment on which each sequence layer segment depends;

With reference to the first possible implementation of the third aspect, in a second possible implementation, the N sequence layer segments include at least two groups of sequence layer segments, and the at least two groups include at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The adding unit is specifically used to:

With reference to the second possible implementation manner of the third aspect, in a third possible implementation manner, the first extended period information and the second extended period information are both first extended identifiers;

The adding unit is specifically used to:

With reference to the second possible implementation manner of the third aspect, in a fourth possible implementation manner, the first extended period information and the second extended period information are both second extended identifiers;

The adding unit is specifically used to:

With reference to the second possible implementation manner of the third aspect, in a fifth possible implementation, if the first sequence layer segment group further depends on the second target knowledge layer segment, the MPD of the code stream further includes a third description layer, the third description layer describing the second target knowledge layer segment.

In conjunction with the fifth possible implementation of the third aspect, in a sixth possible implementation, the adding unit is further configured to:

A fourth aspect provides a processing device for video data, which may include:

a parsing unit, configured to parse a media presentation description (MPD) of the code stream sent by the server, and determine extended period information carried in the MPD, where the extended period information is used to determine the dependent time period of a target knowledge layer segment included in the code stream, the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, and the target knowledge layer segment is depended on by N sequence layer segments in the code stream;

a determining unit, configured to determine the target knowledge layer segment according to the extended period information acquired by the parsing unit, and determine a dependent time period of the target knowledge layer segment, where the N sequence layer segments are encoded within the dependent time period of the target knowledge layer segment;

a recording unit, configured to acquire, from the MPD of the code stream parsed by the parsing unit, a network storage address of the target knowledge layer segment, and record the dependent time period of the target knowledge layer segment determined by the determining unit together with the network storage address;

a determining unit, configured to determine, when a video-on-demand request is obtained, whether an on-demand time carried in the video-on-demand request falls within the dependent time period of the target knowledge layer segment recorded by the recording unit;

an obtaining unit, configured to: when the determination result of the determining unit is yes, check a storage state of the target knowledge layer segment in a storage space of the client, and determine, according to the storage state, a manner of obtaining the target knowledge layer segment.

With reference to the fourth aspect, in a first possible implementation manner, the N sequence layer segments include at least two groups of sequence layer segments, and the at least two groups include at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The first extended period information is used to determine a first extended period of the dependent time period of the target knowledge layer segment, and the second extended period information is used to determine a second extended period of the dependent time period of the target knowledge layer segment.

With reference to the first possible implementation manner of the fourth aspect, in a second possible implementation manner, the first extended period information and the second extended period information are first extended identifiers;

The parsing unit is specifically configured to:

Parsing the MPD, and acquiring a first extended identifier included in the segment information described by a description layer in the MPD;

The determining unit is specifically configured to:

Determining a segment corresponding to the segment information that carries the first extended identifier acquired by the parsing unit as a target knowledge layer segment;

With reference to the first possible implementation manner of the fourth aspect, in a third possible implementation, the first extended period information and the second extended period information are second extended identifiers;

The parsing unit is specifically configured to:

Parsing the MPD, and acquiring a second extended identifier included in the description layer attribute information of the description layer included in the MPD;

The determining unit is specifically configured to:

Determining a segment described by the description layer that carries the second extended identifier acquired by the parsing unit as a target knowledge layer segment;

With reference to the second possible implementation manner of the fourth aspect or the third possible implementation manner of the fourth aspect, in a fourth possible implementation manner, the determining unit is specifically configured to:

Taking the union of the first extended period and the second extended period as the dependent time period of the target knowledge layer segment.

With reference to the fourth possible implementation of the fourth aspect, in a fifth possible implementation, the recording unit is specifically configured to:

The recording unit is further configured to:

The obtaining unit is specifically configured to:

Determining that the storage state of the target knowledge layer segment in the storage space is not empty if the storage status flag is true, and is empty otherwise;

With reference to the fifth possible implementation manner of the fourth aspect, in a sixth possible implementation, the acquiring unit is further configured to:

Receiving the target knowledge layer segment sent by the server;

If the remaining space size of the storage space is not less than the data size of the target knowledge layer segment, storing the target knowledge layer segment into the storage space, and marking, by the recording unit, the storage status flag of the target knowledge layer segment as true;

If the remaining space size of the storage space is smaller than the data size of the target knowledge layer segment, deleting a specified knowledge layer segment stored in the storage space, storing the target knowledge layer segment into the storage space, and marking, by the recording unit, the storage status flag of the target knowledge layer segment as true;
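The storage handling described above can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the class name KnowledgeSegmentStore, the byte-based capacity, and the choice of the earliest-stored segment as the "specified" segment to delete are all hypothetical.

```python
from collections import OrderedDict


class KnowledgeSegmentStore:
    """Hypothetical client-side store for knowledge layer segments."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.segments = OrderedDict()   # segment id -> bytes, in storage order
        self.stored_flag = {}           # segment id -> storage status flag

    def remaining(self):
        """Remaining space of the storage space, in bytes."""
        return self.capacity_bytes - sum(len(d) for d in self.segments.values())

    def store(self, segment_id, data):
        if len(data) > self.capacity_bytes:
            raise ValueError("segment larger than the whole storage space")
        # Assumption: the "specified" segment to delete is the earliest stored one,
        # and deletion repeats until the new segment fits.
        while self.remaining() < len(data):
            evicted_id, _ = self.segments.popitem(last=False)
            self.stored_flag[evicted_id] = False
        self.segments[segment_id] = data
        self.stored_flag[segment_id] = True


store = KnowledgeSegmentStore(capacity_bytes=10)
store.store("K1", b"123456")    # fits: 4 bytes remain
store.store("K2", b"1234567")   # does not fit: K1 is deleted first
```

Resetting the flag of a deleted segment to false is also an assumption of the sketch; it keeps the flag consistent with the rule that a true flag means the segment is present in the storage space.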

In the present application, the client may obtain the extended time period information included in the MPD by parsing the MPD of the code stream, determine the dependent time period of the target knowledge layer segment, and store the dependent time period of the target knowledge layer segment together with a storage status flag indicating the storage state of the target knowledge layer segment in the client. Further, when receiving a video-on-demand request from the user, the client may search, according to the on-demand time carried in the request, for the extended time period that includes the on-demand time, and thereby determine the target knowledge layer segment corresponding to that extended time period and its storage status flag. According to the storage state of the target knowledge layer segment, the client may determine whether to request the target knowledge layer segment from the server or to obtain it from the local storage space, thereby avoiding repeated loading and storage of the same knowledge layer segment, saving data transmission bandwidth, and improving the processing efficiency of the code stream data.
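The client-side decision described above can be sketched as follows. The record layout, the names KnowledgeRecord and plan_fetch, the example URLs, and the use of half-open periods in seconds are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeRecord:
    """Hypothetical record kept by the client for one target knowledge layer segment."""
    url: str                 # network storage address taken from the MPD
    dependent_periods: list  # (start, end) extended periods, in seconds
    stored: bool = False     # storage status flag

    def covers(self, t):
        # The dependent time period is treated as the union of the extended periods.
        return any(start <= t < end for start, end in self.dependent_periods)


def plan_fetch(records, on_demand_time):
    """Return (url, source) pairs for the knowledge segments needed at on_demand_time."""
    plan = []
    for record in records:
        if record.covers(on_demand_time):
            source = "local" if record.stored else "server"
            plan.append((record.url, source))
    return plan


records = [
    KnowledgeRecord("http://example.com/k1.mp4", [(0, 10), (40, 50)], stored=True),
    KnowledgeRecord("http://example.com/k2.mp4", [(10, 40)]),
]
```

With these records, an on-demand time of 45 s falls in the second extended period of the first segment, which is already stored and is therefore read locally, while an on-demand time of 20 s requires a request to the server.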

A fifth aspect provides a server, which can include: a memory and a processor, the memory being connected to the processor;

The memory is for storing a set of program codes;

The processor is configured to invoke a program code stored in the memory to execute a processing method of the video data as provided in the first aspect above.

A sixth aspect provides a client, which can include: a memory and a processor, the memory being connected to the processor;

The memory is for storing a set of program codes;

The processor is configured to invoke a program code stored in the memory to execute a processing method of video data as provided in the second aspect above.

Description of the drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and those of ordinary skill in the art may obtain other drawings from these drawings without inventive effort.

FIG. 1 is a schematic diagram of an example of a transmission framework of the system transmission scheme DASH standard;

FIG. 2 is a schematic structural diagram of an MPD of the system transmission scheme DASH standard;

FIG. 3 is a schematic diagram of mutually independent random access segments;

FIG. 4 is a schematic diagram of a knowledge base providing an encoding reference in knowledge-base-based video encoding;

FIG. 5 is a schematic diagram showing the relationship between a base layer code stream and an enhancement layer code stream of SVC;

FIG. 6 is a schematic diagram of an example of an MPD generated for an SVC code stream according to the DASH standard;

FIG. 7 is a schematic diagram of a video data processing system according to an embodiment of the present invention;

FIG. 8 is a schematic flowchart of a method for processing video data according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of segments of video content generated by LBVC;

FIG. 10 is a schematic diagram of an MPD according to an embodiment of the present invention;

FIG. 11 is a schematic diagram of adding an extended identifier based on syntax elements of the DASH standard;

FIG. 12 is another schematic diagram of adding an extended identifier based on syntax elements of the DASH standard;

FIG. 13 is another schematic diagram of adding an extended identifier based on syntax elements of the DASH standard;

FIG. 14 is a schematic diagram of extracting a knowledge base image from a video sequence by using an LBVC method;

FIG. 15 is a schematic diagram of knowledge base images being divided into knowledge layer segments;

FIG. 16 is another schematic diagram of knowledge base images being divided into knowledge layer segments;

FIG. 17 is a schematic diagram of a knowledge layer segment list;

FIG. 18 is another schematic diagram of a knowledge layer segment list;

FIG. 19 is another schematic diagram of a knowledge layer segment list;

FIG. 20 is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present invention;

FIG. 21 is another schematic structural diagram of a video data processing apparatus according to an embodiment of the present invention;

FIG. 22 is a schematic structural diagram of a server according to an embodiment of the present invention;

FIG. 23 is a schematic structural diagram of a client according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the described embodiments are only a part of the embodiments of the present invention, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.

The current client-side system layer video streaming media transmission scheme can adopt the DASH standard framework, as shown in FIG. 1. FIG. 1 is a schematic diagram of a transmission framework of the system transmission scheme DASH standard. The data transmission process of the system layer video streaming media transmission scheme includes two processes: a process in which a server (such as an HTTP server) generates media data for video content, and a process in which a client (such as an HTTP streaming media client) requests and obtains the media data from the server. The media presentation on the server includes multiple description layers, and each description layer describes multiple segments. The HTTP streaming request control module of the client obtains the media presentation description (MPD) sent by the server, analyzes the MPD to determine the segment to be requested, requests the corresponding segment from the server through the HTTP request receiving end, and passes the segment to the media player for decoding and playback.

1) In the process in which the server generates media data for video content, the media data generated by the server includes video code streams of different versions of the same video content and the MPD of the code streams. For example, for the video content of the same episode, the server generates a low-resolution, low-bit-rate, low-frame-rate code stream (such as 360p resolution, 300 kbps bit rate, 15 fps frame rate), a medium-resolution, medium-bit-rate, high-frame-rate code stream (such as 720p resolution, 1200 kbps bit rate, 25 fps frame rate), and a high-resolution, high-bit-rate, high-frame-rate code stream (such as 1080p resolution, 3000 kbps bit rate, 25 fps frame rate).

In addition, the server can also generate an MPD of the code stream for the video content of the episode. As shown in FIG. 2, FIG. 2 is a schematic structural diagram of an MPD of the system transmission scheme DASH standard. The MPD of the above code stream includes a plurality of description layers (English: Representation). For example, the period start=100s part in the media presentation (English: Media Presentation) of FIG. 2 may include multiple description layers such as Representation 1, Representation 2, and so on. Each description layer describes one or more segments of the above code stream. The description layers included in the MPD of the above code stream may be independent of each other or may depend on each other. Description layers being independent of each other means that the coding and decoding of a description layer does not refer to other description layers (for example, a description layer describing knowledge layer segments, where the coding and decoding of a knowledge layer segment does not refer to other segments); description layers depending on each other means that the coding and decoding of a description layer needs to refer to other description layers (for example, a description layer describing sequence layer segments, where the coding and decoding of a sequence layer segment needs to refer to a knowledge layer segment). Each description layer describes information of several segments (English: Segment) in time order, such as an initialization segment (English: Initialization Segment), Media Segment1, Media Segment2, ..., Media Segment20, all of which are connected end to end in time. Each segment contains the video code stream within one time period, and the description of the segment in the description layer includes segment information such as a playback start time, a playback duration, and a network storage address (for example, a network storage address in the form of a Uniform Resource Locator (URL)).

Further, a segment may be subdivided into a plurality of sub-segments (English: Subsegment), each of which includes a part of the segment. The information of a sub-segment includes a playback start time, a playback duration, and the byte range (English: Byte Range) of the sub-segment within the code stream of the segment to which it belongs. The information of the sub-segments is described by a segment index (English: Segment Index), and each segment index describes information of all sub-segments in one segment. The segment index may be merged with the segment and stored at the beginning of the segment, or may be stored separately in an index segment (English: Index Segment). For a more detailed description of sub-segments, refer to the DASH standard of the system transmission scheme; no limitation is imposed herein.

2) In the process in which the client requests and obtains media data from the server, when the user selects a video to play, the client obtains the MPD of the video code stream according to the user's on-demand operation, and then generates a segment list according to the video segment information described in the MPD of the code stream. The segment list describes the playback period of each segment and the network storage address of the segment. The client obtains the network storage address of one or more segments from the segment list according to factors such as the playback time requested by the user, and sends the server a request to download the video segment data corresponding to the network storage address; the server sends the video segment content to the client according to the received request. After the client obtains the video segment content sent by the server, it can perform operations such as decoding and playback through the media player.
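The lookup of a segment in the segment list by playback time can be sketched as follows. The tuple layout of the list, the URLs, and the function name find_segment are hypothetical; a real DASH client derives this information from the MPD.

```python
from bisect import bisect_right

# Hypothetical segment list: (playback start in seconds, duration, URL), sorted by start.
segment_list = [
    (0.0, 4.0, "http://example.com/video/seg1.m4s"),
    (4.0, 4.0, "http://example.com/video/seg2.m4s"),
    (8.0, 4.0, "http://example.com/video/seg3.m4s"),
]


def find_segment(segments, play_time):
    """Return the URL of the segment whose playback period covers play_time, or None."""
    starts = [start for start, _, _ in segments]
    i = bisect_right(starts, play_time) - 1   # last segment starting at or before play_time
    if i < 0:
        return None
    start, duration, url = segments[i]
    return url if play_time < start + duration else None
```

Because the segments of one description layer are connected end to end in time, a binary search over the start times is sufficient to locate the segment to request.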

The system layer video streaming media transmission scheme adopting the DASH standard realizes the transmission of video data by having the client analyze the MPD, request video data from the server as needed, and receive the data sent by the server. This scheme is mainly applied to video code streams generated by conventional video coding (for example, coding standards such as H.264 and HEVC (High Efficiency Video Coding)). As shown in FIG. 3, FIG. 3 is a schematic diagram of a plurality of mutually independent random access segments. The dot represents a random access point, the square represents the random access segment after the random access point, and the dotted arrow marked with an x indicates that the random access segment pointed to by the arrow cannot refer, when being encoded, to information of the random access segment from which the dotted arrow starts. That is, in conventional video codec technology, an image in a random access segment can only be used as the reference image/reference frame of other images in the same random access segment; inter prediction across random access points is not allowed, which greatly limits the efficiency of video encoding/decoding.

LBVC extracts common image information in multiple random access segments (including mutual information (English: mutual information) between random access segments, that is, information that images in different random access segments reference from each other during encoding and decoding). This common image information is encoded only once, and the images in each random access segment are allowed to be encoded (and decoded) with reference to it, so that the encoder (or decoder) can further remove the mutual information between random access segments. This removes redundant information from the video sequence, improves the coding efficiency of the entire video sequence, reduces storage space, and saves transmission bandwidth. As shown in FIG. 4, FIG. 4 is a schematic diagram in which one knowledge base provides a coding reference for multiple random access segments in knowledge-base-based video coding. The dot represents a random access point, the square represents the random access segment after the random access point, and the arrows indicate that a plurality of random access segments refer, when being encoded, to information provided by the knowledge base (English: Library).

An image in a sequence layer code stream generated by encoding a video sequence with a knowledge-base-based video encoding method (called a sequence image) has a corresponding time. When the sequence image is operated on at that time, the time is called the operated time of the sequence image. The operations include being encoded, decoded, played, or used. In a specific implementation, the sequence image is used for playback in most cases, so the operated time of the sequence image is described below by taking the playback time as an example. Correspondingly, the set of playing times of the sequence layer code stream is referred to as the playing period of the sequence layer code stream, and the operated time period of a sequence layer segment is described by taking the playing period as an example. However, since a knowledge base image can serve as an encoding reference (or decoding reference; encoding is used as the example below) for an image at any playback time in the video sequence, the knowledge base image itself does not have playback time information of the kind a sequence image has. In system layer transmission, in order to obtain the dependency between the sequence layer code stream and the knowledge layer code stream (that is, the referencing and referenced relationship) through time information, the system allocates a dependent time period (English: Depended Duration, DD) to each knowledge base image. The dependent time period of a knowledge base image covers at least the playing times of all the sequence images that depend on the knowledge base image; that is, the dependent time period of a knowledge base image includes the playing time of each sequence image that uses the knowledge base image as a coding reference.
Therefore, when requesting the sequence layer code stream of a certain playing period, the client also needs to request the knowledge layer code stream whose dependent time period covers that playing period, to ensure that the media player decodes the video data correctly.
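The relationship between playing periods and dependent time periods can be sketched as follows. The dependency table, the function names, and the representation of a dependent time period as a list of half-open playing periods are assumptions for illustration; the text only requires that the dependent time period cover at least the playing times of the dependent sequence images.

```python
# Hypothetical dependency data:
# sequence layer segment -> (playing period in seconds, knowledge base images it references).
sequence_segments = {
    "S1": ((0, 10), {"P1"}),
    "S2": ((10, 20), set()),
    "S3": ((20, 30), {"P1", "P2"}),
}


def dependent_periods(segments):
    """Map each knowledge base image to the playing periods of the segments that reference it."""
    periods = {}
    for _, (period, refs) in sorted(segments.items()):
        for image in refs:
            periods.setdefault(image, []).append(period)
    return periods


def knowledge_needed(segments, play_time):
    """Knowledge base images whose dependent time period covers play_time."""
    dd = dependent_periods(segments)
    return {image for image, spans in dd.items()
            if any(start <= play_time < end for start, end in spans)}
```

In this example the dependent time period of P1 consists of two discontinuous parts, 0 to 10 s and 20 to 30 s, so a request for the playing time 25 s needs both P1 and P2, while a request for 15 s needs no knowledge base image.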

The dual-stream organization of LBVC has certain similarities with the hierarchical code stream organization of SVC. However, since the dependencies between the layered code streams, that is, the reference relationships, are different in the two modes, the existing DASH system layer transmission scheme needs to be improved for the characteristics of the dual-stream organization of LBVC, so as to determine a data transmission mode that matches the code stream organization of LBVC and realizes the advantages of LBVC.

SVC coding produces a scalable video code stream comprising a base layer code stream and at least one enhancement layer code stream. FIG. 5 is a schematic diagram showing the relationship between the base layer code stream and the enhancement layer code stream of SVC. Each square represents an image, and the arrows between the layers indicate that an image of the enhancement layer, when encoded using inter-layer prediction (English: Inter-Layer Prediction), can only refer to the image at the same time in the base layer. In system layer transmission of the SVC code stream, the DASH standard uses different description layers in the MPD to describe the information of the base layer code stream and the enhancement layer code stream, and indicates that the description layer of the enhancement layer code stream depends on the description layer of the base layer code stream. As shown in FIG. 6, FIG. 6 is a schematic diagram of an example of an MPD generated for an SVC code stream according to the DASH standard. The DASH standard describes the dependency between the two layers of code streams by means such as dep_id in the description layer of the MPD. The id of Representation1 is rep1 and the id of Representation2 is rep2. The information described by Representation2 contains dep_id=rep1, indicating that Representation2 depends on Representation1.
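How a client might read such a dependency from an MPD can be sketched as follows. The MPD fragment is hypothetical and heavily trimmed. The attribute is written dependencyId, the spelling used in the DASH MPD schema, on the assumption that the dep_id mentioned in the text is shorthand for it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed MPD fragment; real MPDs carry many more attributes.
MPD_XML = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period start="PT0S">
    <AdaptationSet>
      <Representation id="rep1" bandwidth="300000"/>
      <Representation id="rep2" bandwidth="1200000" dependencyId="rep1"/>
    </AdaptationSet>
  </Period>
</MPD>
"""

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}


def dependencies(mpd_xml):
    """Map each Representation id to the list of Representation ids it depends on."""
    root = ET.fromstring(mpd_xml.strip())
    result = {}
    for rep in root.iterfind(".//dash:Representation", NS):
        # A missing attribute means the description layer is independent.
        result[rep.get("id")] = rep.get("dependencyId", "").split()
    return result
```

For the fragment above, rep1 has no dependencies and rep2 depends on rep1, which mirrors the relationship between Representation1 and Representation2 described in the text.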

Specifically, on the server side, the enhancement layer code stream and the base layer code stream are sliced into enhancement layer segments and base layer segments, and each segment contains the data of one period in the code stream. Since the enhancement layer code stream can only refer to the base layer code stream at the same moment, the period covered by an enhancement layer segment is aligned with the period covered by the base layer segment it depends on. That is, the period of the enhancement layer code stream corresponding to the enhancement layer segment and the period of the base layer code stream corresponding to the base layer segment it depends on have the same start time and the same end time. When requesting video data, a client that requests an enhancement layer segment of a certain period also needs to request the one or more base layer segments aligned with the period of the enhancement layer segment, thereby ensuring that the enhancement layer segment and the base layer segment it depends on are both present, and combines the two parts of the code stream into a code stream that meets the SVC decoding requirement for decoding at the client.

Since the dependency between the base layer code stream and the enhancement layer code stream in the SVC code stream is different from the dependency between the knowledge layer code stream and the sequence layer code stream in the LBVC code stream, the method by which the DASH standard describes the SVC code stream cannot simply be used to describe the LBVC code stream; otherwise the advantages of LBVC in reducing storage space and saving transmission bandwidth cannot be realized. The specific reasons are as follows:

1) The dependencies within the SVC code stream and within the LBVC code stream are different:

The SVC code stream includes a separate base layer code stream and at least one enhancement layer code stream. Assuming that there is only one enhancement layer code stream, an image in the enhancement layer code stream can only rely on the image at the same time in the base layer code stream when using inter-layer predictive coding.

The LBVC code stream includes at least one knowledge layer code stream (at least one of which is independent; in one possible embodiment, all knowledge layer code streams are independent) and at least one sequence layer code stream. Assuming that there is only one knowledge layer code stream and one sequence layer code stream, an image in the sequence layer code stream can rely on at least one knowledge base image in the knowledge layer code stream when being encoded; that is, when an image in the sequence layer code stream relies on knowledge base images during encoding, at least one knowledge base image in the knowledge layer code stream can be used as a reference. Meanwhile, a knowledge base image in the knowledge layer code stream of the LBVC code stream is depended on by at least two images in the sequence layer code stream, and there may be other sequence layer segments between the sequence layer segments corresponding to the at least two images; that is, the sequence layer segments that depend on the knowledge base image may be discontinuous in time. It should be noted that, in the LBVC code stream, the sequence layer code stream may also be independent of the knowledge layer code stream. If the sequence layer code stream depends on the knowledge layer code stream, the following implementation manners may be used. The embodiments of the present invention do not limit the scenario in which the sequence layer code stream does not depend on the knowledge layer code stream.

2) In system layer transmission, the code stream is sliced and encapsulated into segments. Because the dependencies within the SVC code stream and within the LBVC code stream are different, the dependencies between the segments obtained by slicing and encapsulating the SVC code stream and between the segments obtained by slicing and encapsulating the LBVC code stream are also different:

In the SVC code stream, the enhancement layer code stream is sliced and encapsulated into enhancement layer segments, and the base layer code stream is sliced and encapsulated into base layer segments. An enhancement layer segment can only depend on the base layer segment with the same period; that is, an enhancement layer segment of any period obtained by slicing the enhancement layer code stream can only depend on the base layer segment of that period obtained by slicing the base layer code stream.

In the LBVC code stream, the sequence layer code stream is sliced and encapsulated into sequence layer segments, and the knowledge layer code stream is sliced and encapsulated into knowledge layer segments. At least one sequence layer segment depends on at least one knowledge layer segment (that is, the LBVC code stream contains sequence layer segments that depend on knowledge layer segments and may also contain sequence layer segments that do not; the sequence layer segments in the LBVC code stream that depend on knowledge layer segments may be handled according to the implementation manners provided by the embodiments of the present invention, which is not limited herein), and at least one knowledge layer segment is depended on by at least two sequence layer segments. There may be other sequence layer segments between the at least two sequence layer segments; that is, the at least two sequence layer segments may be two discontinuous sequence layer segments.

Therefore, when transmitting an LBVC code stream, there is a case in which one knowledge layer segment is depended on by at least two discontinuous sequence layer segments. In the ideal case, such a knowledge layer segment needs to be downloaded only once by the client, thereby saving bit rate. However, if the LBVC code stream is transmitted according to the SVC-DASH system layer transmission scheme (that is, the scheme for transmitting the SVC code stream according to the DASH standard), the MPD of the existing DASH standard can only describe the information of each segment one by one in time order. In order to decode the sequence layer code stream correctly, and following the rule that a knowledge layer segment is referenced through the operated time period (the playing period is used as the example) of each sequence layer segment that depends on it, the MPD needs to describe the information of the same knowledge layer segment repeatedly for the discontinuous sequence layer segments. When the MPD repeatedly describes the information of the same knowledge layer segment for the discontinuous sequence layer segments, the client repeatedly requests the same knowledge layer segment while requesting each of the sequence layer segments, so that the corresponding knowledge layer code stream data is downloaded repeatedly, which seriously increases the transmission bit overhead.

For example, assume that a certain knowledge layer segment K is depended on by two sequence layer segments S1 and S2 whose playback periods are T1 and T2 respectively. When the DASH standard describes the information of the sequence layer code stream and the knowledge layer code stream, the knowledge layer segment K is described separately for the sequence layer segments S1 and S2, with the playback periods T1 and T2 respectively. The client requests and acquires the sequence layer segment S1 and the knowledge layer segment K in the T1 period and passes them to the player for decoding. In the T2 period, the client requests and acquires the sequence layer segment S2, and requests and acquires the knowledge layer segment K again. The knowledge layer segment K is thus downloaded twice, wasting transmission bandwidth. Therefore, although the DASH standard can transmit the code stream of knowledge-base-based video coding normally, the repeated download of the knowledge layer segment wastes transmission bandwidth, and the coding efficiency of knowledge-base-based video coding is not fully exploited.

The above case only illustrates the problem that arises when one sequence layer segment depends on only one knowledge layer segment. However, the LBVC code stream allows one sequence layer segment to depend on multiple knowledge layer segments at the same time, and the DASH standard requires that the playback periods of the segments described in one description layer not overlap each other, so using SVC-DASH to describe the LBVC code stream brings new problems.

For example, an LBVC-coded sequence layer image may rely on multiple knowledge base images, that is, one sequence layer segment may rely on multiple knowledge layer segments. It is assumed that the sequence layer segment S1 depends on the knowledge base images P1, P2, and P5, the sequence layer segment S2 depends on the knowledge base images P1 and P3, and the sequence layer segment S3 depends on the knowledge base images P2, P4, and P5. Since the DASH standard needs to preserve the time order of the segments when segmenting the code stream, segments corresponding to different knowledge base images will have the same playback time. Specifically, if the code stream generated by encoding the five knowledge base images P1 to P5 in the above example is sliced so that the data of each image forms one segment, five segments K1 to K5 are obtained. Since K1, K2, and K5 are all depended on by S1, their playback times are the same. In a DASH standard MPD, K1, K2, and K5 therefore cannot be described in time order within one description layer, and can only be described in three separate description layers. If the MPD uses only one description layer, the DASH standard can only splice the code streams encoded from the knowledge base images P1, P2, P3, P4, and P5 into three knowledge layer segments: K1 (containing the code streams of the knowledge base images P1, P2, and P5), K2 (containing the code streams of the knowledge base images P1 and P3), and K3 (containing the code streams of the knowledge base images P2, P4, and P5). However, this causes the code stream of the same knowledge base image to be stored repeatedly in several different knowledge layer segments, wasting storage space.
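The storage redundancy in this example can be checked with a few lines of arithmetic. The variable names are hypothetical, and the count is in stored image code streams rather than bytes, since the text gives no sizes.

```python
# Knowledge base images that each sequence layer segment depends on (from the example above).
depends_on = {
    "S1": {"P1", "P2", "P5"},
    "S2": {"P1", "P3"},
    "S3": {"P2", "P4", "P5"},
}

# One description layer: one spliced knowledge layer segment per sequence layer segment,
# so every image is stored once for each sequence layer segment that depends on it.
bundled_copies = sum(len(images) for images in depends_on.values())

# Ideal case: each knowledge base image is stored exactly once.
unique_images = len(set().union(*depends_on.values()))

redundant_copies = bundled_copies - unique_images
```

With a single description layer, eight image code streams are stored for five distinct knowledge base images, so three are redundant copies (P1, P2, and P5 are each stored twice).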

Further, regarding the client's request mechanism based on the MPD, an existing DASH-compliant client (hereinafter referred to as the DASH client) generates a segment list according to the MPD, records the network storage address of each segment and its corresponding playback period, then selects, according to the playback time chosen by the user, the segment whose playback period covers that playback time (that is, the segment whose playback period includes the playback time), and sends a request for the segment to the server. For system layer transmission of SVC-DASH, the client requests both the base layer segment and the enhancement layer segment to ensure that the SVC code stream can be decoded normally. However, the existing DASH client simply caches the segments of one playback period (including a base layer segment and an enhancement layer segment) and decodes and plays them during that playback period. After the playback period, the segments are cleared or no longer used. For the LBVC code stream, because a knowledge layer segment is depended on by sequence layer segments of multiple discontinuous playback periods, if the existing DASH client is used to request the LBVC code stream from the server, the DASH client separately requests, in each playback period, a different sequence layer segment together with the same knowledge layer segment that it depends on. The knowledge layer segment is thus downloaded repeatedly, which wastes transmission bandwidth.

For example, assume that the knowledge layer segment K is depended on by the sequence layer segments S1, S3, and S6 of a plurality of discontinuous playback periods. In the first playback period, the DASH client requests the sequence layer segment S1 and the knowledge layer segment K. After the player correctly decodes the sequence layer segment S1 during that playback period, the DASH client no longer manages the knowledge layer segment K, or directly clears the knowledge layer segment K from the cache. In the third playback period, the DASH client requests the sequence layer segment S3 and requests the knowledge layer segment K again. Similarly, after the player correctly decodes the sequence layer segment S3 during that playback period, the DASH client no longer manages the knowledge layer segment K, or directly clears the knowledge layer segment K from the cache. In the sixth playback period, the DASH client requests the knowledge layer segment K yet again. Ideally, the knowledge layer segment K should be used three times by the client but downloaded only once; the request mechanism of the existing DASH client, however, causes the same knowledge layer segment K to be requested and downloaded three times, wasting transmission bandwidth.
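The difference between the existing request mechanism and the ideal situation in the example above can be simulated in a few lines. This is only a sketch under simplifying assumptions: the function `count_downloads` and its `retain_knowledge_segments` flag are hypothetical names, and the cache model is far simpler than a real DASH client.

```python
def count_downloads(schedule, retain_knowledge_segments):
    """Count how often each segment is fetched over a playback schedule.

    schedule: list of (sequence_segment, [knowledge_segments]) in playback order.
    retain_knowledge_segments: if False, the cache is cleared after every
    playback period, mimicking the existing DASH client described above.
    """
    downloads = {}
    cache = set()
    for seq, knowledge in schedule:
        for seg in [seq, *knowledge]:
            if seg not in cache:
                downloads[seg] = downloads.get(seg, 0) + 1
                cache.add(seg)
        if retain_knowledge_segments:
            cache.discard(seq)  # only the sequence layer segment is dropped
        else:
            cache.clear()       # everything is dropped after the period
    return downloads


# K is depended on by S1, S3, and S6, as in the example.
schedule = [("S1", ["K"]), ("S3", ["K"]), ("S6", ["K"])]
legacy = count_downloads(schedule, retain_knowledge_segments=False)
ideal = count_downloads(schedule, retain_knowledge_segments=True)
print(legacy["K"], ideal["K"])
```

With the cache cleared after each playback period, K is fetched once per dependent sequence layer segment; when K is retained, it is fetched once in total.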

In summary, simply using the existing SVC-DASH method to describe the dependencies between the knowledge layer code stream and the sequence layer code stream and to transmit the data cannot give full play to the advantages of LBVC: the same knowledge layer code stream segment is transmitted and stored multiple times, which wastes storage space and reduces data transmission efficiency. To overcome the above technical defects and bring the handling of the knowledge layer segments and sequence layer segments in the LBVC code stream closer to the ideal situation described above, the embodiments of the present invention provide a method and a device for processing video data that improve on the video data processing method of the existing DASH-standard system layer transmission scheme. The method and apparatus for processing video data provided by the embodiments of the present invention will be specifically described below with reference to FIG. 7 to FIG.

FIG. 7 is a schematic diagram of a video data processing system according to an embodiment of the present invention. The processing system provided by the embodiment of the present invention includes a server and a client. The server prepares the media content related to the video data. Specifically, the server may generate the media content by using its media content generating unit, generate the MPD of the media content by using its media content description unit, and then store the media content and the MPD in a specified storage space through its content storage unit. The server may also transmit the media content and the MPD to the client via the HTTP response service unit in response to the client's request. The client can request and obtain the media content related to the video data from the server and process the received media content. Specifically, the client may request the media content and the MPD from the server through the HTTP request client unit, and parse the MPD transmitted by the server through the media presentation description parsing unit to determine the sequence layer segments, knowledge layer segments, and other data that need to be requested. Further, the client may, through the media request control unit, trigger the HTTP request client unit to request sequence layer segments and other related data from the server, or trigger the knowledge base storage management unit to acquire knowledge layer segments and other data from the knowledge base. The acquired sequence layer segment and the knowledge layer segments it depends on may then be transmitted to the media playing unit for decoding, playback, and the like.

The processing system composed of the above server and client can generate, store, transmit, and decode complete media content. The server and the client can carry out the various implementations described in the embodiments of the present invention through the modules built into them, and any part of the operations performed by the server and the client can separately serve as an embodiment of the processing system in a different working state.

FIG. 8 is a schematic flowchart diagram of a method for processing video data according to an embodiment of the present invention. The method provided by the embodiment of the present invention includes the following steps:

S101. The server acquires segment information of each sequence layer segment in all sequence layer segments in the code stream.

In a specific implementation, the segment information is used to describe the dependency relationship between the sequence layer segments and the knowledge layer segments in the code stream. The server may use the LBVC method to encode the video content and generate a sequence layer code stream and a knowledge layer code stream corresponding to independent or interdependent video content; the sequence layer code stream may then be sliced into sequence layer segments, and the knowledge layer code stream sliced into knowledge layer segments. Specifically, the server may extract the knowledge base images from the video content, as shown in FIG. 14. FIG. 14 is a schematic diagram of extracting knowledge base images from a video sequence by using the LBVC method. At least one sequence image (SP) depends on a knowledge base image (LB). For example, the sequence image SP1 depends on the knowledge base images LB1, LB2, LB3, and LB5. The server can encode the sequence images and the knowledge base images separately to obtain the sequence layer code stream and the knowledge layer code stream. Further, the code streams can be sliced into segments, wherein at least one sequence layer segment depends on at least one knowledge layer segment, and one knowledge layer segment includes at least one knowledge base image. In a possible implementation manner, as shown in FIG. 15, which is a schematic diagram of the knowledge base images being divided into knowledge layer segments, each knowledge base image LB is divided into one knowledge layer segment K, and each sequence layer image (SP) is divided into one sequence layer segment (S). The sequence layer segment S depends on the knowledge layer segment K; for example, the sequence layer segment S1 depends on the knowledge layer segments K1, K3, and K5. Each sequence layer segment has a playback period, and each knowledge layer segment has a dependent time period.
In another possible implementation manner, as shown in FIG. 16, which is another schematic diagram of the knowledge base images being divided into knowledge layer segments, knowledge base images with the same dependent time period may be merged into one knowledge layer segment. For example, the sequence layer segment S1 depends on the knowledge layer segments K1 and K2: the dependent time periods of LB1 and LB2 are the same, so they are placed in one knowledge layer segment K1, and the dependent time periods of LB3 and LB5 are likewise the same, so they can be placed in the same knowledge layer segment K2. For the specific manner in which the server segments the sequence layer code stream and the knowledge layer code stream, refer to the SVC-DASH standard; details are not described herein.

In some feasible implementation manners, the dependency between a sequence layer segment and a knowledge layer segment in the code stream may be represented by an identifier such as dep_id. Specifically, an identifier such as dep_id may be carried in the segment information of each sequence layer segment to indicate which knowledge layer segment the sequence layer segment carrying the identifier depends on. For example, if the code stream contains the knowledge layer segment 1 (denoted as rep1) and the knowledge layer segment 2 (denoted as rep2), and the segment information of the sequence layer segment 1 carries dep_id=rep1, it can be determined that the sequence layer segment 1 depends on the knowledge layer segment 1.

S102. Determine, according to the segment information of each sequence layer segment, N sequence layer segments and a first target knowledge layer segment.

In some feasible implementation manners, the server may determine, according to the identifier of the knowledge layer segment carried in the segment information of each sequence layer segment (for example, dep_id=rep1), the knowledge layer segment that each sequence layer segment depends on, and thereby determine the target knowledge layer segment. The target knowledge layer segment is a knowledge layer segment that is depended on by at least two sequence layer segments. It should be noted that, in the LBVC code stream, there may be one or more knowledge layer segments that are depended on by at least two discontinuous sequence layer segments; the embodiment of the present invention takes one of them as the target knowledge layer segment as an example for description. Specifically, the server may select any one of the one or more knowledge layer segments that are depended on by at least two sequence layer segments as the target knowledge layer segment. For the subsequent operations on the other such knowledge layer segments, refer to the operations on the target knowledge layer segment; they are not described again below. After the server determines the target knowledge layer segment, the sequence layer segments that depend on the target knowledge layer segment may be determined as the N sequence layer segments of the target knowledge layer segment, where N is an integer greater than or equal to 2.
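The determination in step S102 amounts to inverting the dep_id relation and keeping the knowledge layer segments with at least two dependants. The sketch below assumes that the segment information has already been parsed into a dictionary in which dep_id is a list; that layout, the function name `find_target_knowledge_segments`, and the sample identifiers seq1 to seq4 and rep3 are hypothetical. Only dep_id, rep1, and rep2 appear in the description.

```python
from collections import defaultdict


def find_target_knowledge_segments(segment_info):
    """Invert the dep_id lists and keep knowledge layer segments with N >= 2 dependants."""
    dependants = defaultdict(list)
    for seq_id, info in segment_info.items():
        for rep in info["dep_id"]:
            dependants[rep].append(seq_id)
    # A target knowledge layer segment is depended on by at least two
    # sequence layer segments; the mapped list holds its N segments.
    return {rep: seqs for rep, seqs in dependants.items() if len(seqs) >= 2}


# Hypothetical parsed segment information of four sequence layer segments.
segment_info = {
    "seq1": {"dep_id": ["rep1"]},
    "seq2": {"dep_id": ["rep2"]},
    "seq3": {"dep_id": ["rep1", "rep2"]},
    "seq4": {"dep_id": ["rep3"]},
}
targets = find_target_knowledge_segments(segment_info)
print(targets)
```

Here rep1 and rep2 each have N = 2 dependants and qualify as target knowledge layer segments, while rep3 has only one dependant and does not.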

In some feasible implementation manners, if the N sequence layer segments depend on only one knowledge layer segment, that knowledge layer segment may be determined as the first target knowledge layer segment. If one or more of the N sequence layer segments also depend on another knowledge layer segment, the other knowledge layer segment may be marked as the second target knowledge layer segment. This can be determined according to the actual application scenario and is not limited herein.

In the LBVC code stream, because a certain knowledge layer segment (such as the above target knowledge layer segment) is depended on by at least two discontinuous sequence layer segments, and the existing DASH standard cannot tell the client whether the target knowledge layer segment is still used outside its playback period, the client may stop managing the target knowledge layer segment or delete it prematurely once that playback period has passed, and then has to request the target knowledge layer segment again outside the playback period.

In some possible implementations, to enable the client to identify the target knowledge layer segment and store it for its next use, so that a knowledge layer segment can be used multiple times without being erroneously deleted or discarded, the embodiment of the present invention introduces another kind of time period information for the knowledge layer segment, different from its playback period: the extended duration (ED). The extended time period information of the knowledge layer segment can be described in the MPD corresponding to the LBVC code stream. The extended time period information indicates that the target knowledge layer segment is used not only in its corresponding playback period but possibly also in other time periods (for example, to provide a codec reference, for analysis, or for playback), and therefore needs to be stored and retained for an additional amount of time. When the client recognizes from the MPD that a segment has extended time period information, the client knows that the segment is a target knowledge layer segment. Alternatively, when the client recognizes from the MPD that the description layer attribute information of a description layer carries extended time period information, the segments described by that description layer may be determined as target knowledge layer segments. Further, the client may store the target knowledge layer segment for use in other time periods, thereby avoiding multiple transmissions of the same knowledge layer segment and the resulting waste of transmission bandwidth.

S103. Acquire segment information of the first target knowledge layer segment.

S104. Add extended period information of the first target knowledge layer segment to the MPD of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments.

In a specific implementation, when the server generates the MPD of the LBVC code stream, it may first acquire the segment information of the knowledge layer segments and the segment information of the sequence layer segments of the LBVC code stream. The LBVC code stream may include one or more knowledge layer segments and two or more sequence layer segments. A sequence layer segment in the LBVC code stream may be encoded with reference to at least one knowledge layer segment; that is, a sequence layer segment encoded with reference to knowledge layer segments depends on at least one knowledge layer segment. It should be noted that, in a specific application, a sequence layer segment in the LBVC code stream may also be an independent segment, that is, it may not depend on any knowledge layer segment. The embodiments of the present invention describe the method for processing video data in the scenario in which sequence layer segments in the LBVC code stream depend on knowledge layer segments; the scenario in which a sequence layer segment does not depend on any knowledge layer segment is not limited herein.

Referring to FIG. 9, FIG. 9 is a schematic diagram of the segments of video content generated by LBVC. The segments generated by LBVC include the sequence layer segments S1 to S8 and the knowledge layer segments K1 to K4. The server can obtain the segment information of S1 to S8 and the segment information of K1 to K4. S1 to S8 are eight temporally consecutive sequence layer segments; S1 and S3, for example, are temporally discontinuous sequence layer segments (separated by S2), and the continuous or discontinuous relationship between any other sequence layer segments can be determined in the same way. S1 depends on K1 and K2, S2 depends on K1 and K3, S3 depends on K1 and K4, S4 depends on K2 and K4, S5 depends on K3, S6 depends on K1 and K3, S7 depends on K3, and S8 depends on K3. K1 is depended on by S1, S2, S3, and S6, where S1, S2, and S3 are consecutive segments and S6 is discontinuous with S1, S2, and S3. K2 is depended on by S1 and S4, where S1 and S4 are discontinuous segments. K3 is depended on by S2, S5, S6, and S7, where S5, S6, and S7 are consecutive segments and S2 is discontinuous with S5, S6, and S7. K4 is depended on by S3 and S4. As shown in FIG. 9, each of the sequence layer segments S1 to S8 depends on at least one knowledge layer segment, and each sequence layer segment has a playback period (presentation duration, PD); S1 to S8 correspond to PD1 to PD8. At least one of the knowledge layer segments K1 to K4 is depended on by at least two discontinuous sequence layer segments, for example K1, K2, or K3, so K1, K2, and K3 can be determined as target knowledge layer segments. Taking K1 as an example, S1, S2, S3, and S6 are the N sequence layer segments that depend on K1, and N is 4 in this case. The N sequence layer segments form two sequence layer segment groups, a first sequence layer segment group and a second sequence layer segment group. The first sequence layer segment group contains three consecutive sequence layer segments, namely S1, S2, and S3. The second sequence layer segment group contains one sequence layer segment, namely S6. The first sequence layer segment group corresponds to the first time period, which is the union of the playback periods of S1, S2, and S3; the second sequence layer segment group corresponds to the second time period, which is the playback period of S6.
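The grouping of the N sequence layer segments into sequence layer segment groups can be illustrated as splitting the dependants of a knowledge layer segment into runs of consecutive segments. The sketch below transcribes the per-segment dependencies stated for FIG. 9; the function name `segment_groups` is hypothetical, and it assumes that the numeric suffix of a segment name gives its temporal order.

```python
# Per-segment dependencies as stated for FIG. 9.
deps = {
    "S1": ["K1", "K2"], "S2": ["K1", "K3"], "S3": ["K1", "K4"],
    "S4": ["K2", "K4"], "S5": ["K3"], "S6": ["K1", "K3"],
    "S7": ["K3"], "S8": ["K3"],
}


def segment_groups(knowledge_id, deps):
    """Split the dependants of one knowledge layer segment into consecutive runs."""
    # Assumption: "S3" is the third segment in playback order.
    indices = sorted(int(s[1:]) for s, ks in deps.items() if knowledge_id in ks)
    groups = []
    for i in indices:
        if groups and i == groups[-1][-1] + 1:
            groups[-1].append(i)   # continues the current group
        else:
            groups.append([i])     # a gap starts a new group
    return [[f"S{i}" for i in g] for g in groups]


k1_groups = segment_groups("K1", deps)
k2_groups = segment_groups("K2", deps)
print(k1_groups)
print(k2_groups)
```

For K1 this yields the first group S1, S2, S3 and the second group S6, so N is 4 and there are two time periods. For K2 it yields two single-segment groups, S1 and S4.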

As shown in FIG. 9, each knowledge layer segment has a dependent time period (that is, a DD), and the DD of a target knowledge layer segment that is depended on by at least two discontinuous sequence layer segments is composed of at least two EDs, each of which covers the PDs of one or more sequence layer segments. For example, K1 is depended on by S1, S2, S3, and S6, and the dependent time period DD1 of K1 is composed of DD1-1 and DD1-2, that is, DD1 is the set of DD1-1 and DD1-2. DD1-1 covers the PDs of S1, S2, and S3 (that is, DD1-1 covers PD1, PD2, and PD3), and DD1-2 covers the PD of S6 (that is, PD6). Here, an ED covering the PDs of one or more sequence layer segments means that those PDs fall within the ED, that is, the ED includes the PDs of the one or more sequence layer segments. For example, if the PD of S1 is 00:00 to 00:59, the PD of S2 is 01:00 to 01:59, and the PD of S3 is 02:00 to 02:59, then DD1-1 is at least 00:00 to 02:59. The specific values can be determined according to the actual application scenario and are not limited herein.
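The covering relation between an ED and the PDs of a sequence layer segment group can be checked numerically. The following sketch uses the times of the example and assumes that they are written as minutes:seconds; the helper names `to_seconds`, `minimal_ed`, and `covers` are hypothetical.

```python
def to_seconds(mmss):
    """Convert a 'MM:SS' string to seconds (assumed time format)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def minimal_ed(pds):
    """Smallest extended period that covers every playback period in pds."""
    start = min(to_seconds(s) for s, _ in pds)
    end = max(to_seconds(e) for _, e in pds)
    return (f"{start // 60:02d}:{start % 60:02d}", f"{end // 60:02d}:{end % 60:02d}")


def covers(ed, pds):
    """True if every playback period falls within the extended period."""
    ed_start, ed_end = (to_seconds(t) for t in ed)
    return all(ed_start <= to_seconds(s) and to_seconds(e) <= ed_end for s, e in pds)


# PDs of S1, S2, and S3 from the example.
pds = [("00:00", "00:59"), ("01:00", "01:59"), ("02:00", "02:59")]
dd1_1 = minimal_ed(pds)
print(dd1_1, covers(dd1_1, pds))
```

Any ED that starts no later than 00:00 and ends no earlier than 02:59 covers PD1 to PD3; a shorter one, such as 00:00 to 01:59, does not.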

In some possible implementation manners, after the server acquires the knowledge layer segments and the sequence layer segments of the LBVC code stream, it may acquire the segment information of each knowledge layer segment and of each sequence layer segment. The segment information of a knowledge layer segment may include information such as the network storage address, the playback period, and the dependent time period of the knowledge layer segment, which may be determined according to actual application requirements and is not limited herein. The segment information of a sequence layer segment includes information such as the network storage address of the sequence layer segment, its playback period, and the identifier of the knowledge layer segment on which the sequence layer segment depends, which may be determined according to the actual application scenario and is not limited herein.

Further, in some feasible implementation manners, after the server acquires the playback period of each sequence layer segment and the dependent time period of each knowledge layer segment, it may determine how each knowledge layer segment is depended on by the sequence layer segments and thereby determine the target knowledge layer segment. For the determination of the target knowledge layer segment, refer to the foregoing description; details are not described herein again. After the server determines the target knowledge layer segment and its corresponding N sequence layer segments, the extended time period information of the target knowledge layer segment may be added to the MPD of the code stream. The server may determine the number of extended time periods of the target knowledge layer segment (that is, how many sequence layer segment groups depend on the target knowledge layer segment) and the length of each extended time period (that is, the combined length of the playback periods of the sequence layer segments included in each sequence layer segment group), and add the extended time period information to the target knowledge layer segment accordingly. After the client parses the MPD of the code stream, it may determine the dependent time period of the target knowledge layer segment according to the extended time period information of the target knowledge layer segment. In a specific implementation, the dependent time period of the target knowledge layer segment includes at least one extended time period. It should be noted that the method and the device for processing code stream data provided by the embodiments of the present invention are also applicable to the scenario in which the dependent time period of the target knowledge layer segment is a single extended time period, which is not limited herein. The following description takes as an example the case in which the dependent time period of the target knowledge layer segment includes at least two extended time periods, specifically two.

In some feasible implementation manners, the MPD of the code stream carries the extended time period information of the target knowledge layer segment, which may specifically be a first extended identifier or a second extended identifier. In a specific implementation, the MPD of the code stream includes at least two description layers. The encoding and decoding of at least one of the description layers included in the MPD does not refer to other description layers (hereinafter referred to as an independent description layer); an independent description layer is used to describe knowledge layer segments. The encoding and decoding of at least one other description layer included in the MPD refers to other description layers; such a description layer is used to describe the sequence layer segments that depend on the knowledge layer segments. In a specific implementation, if each of the N sequence layer segments depends only on one target knowledge layer segment (set as the first target knowledge layer segment, for example, K1 above), then one independent description layer (set as the first description layer) describes the first target knowledge layer segment, and one description layer (set as the second description layer) describes the N sequence layer segments, wherein the encoding and decoding of the second description layer refers to the first description layer. If N1 of the N sequence layer segments that depend on the first target knowledge layer segment (for example, S1 above) also depend on another target knowledge layer segment (set as the second target knowledge layer segment, for example, K2), then another independent description layer (set as the third description layer) is used to describe the second target knowledge layer segment.

Referring to FIG. 10, FIG. 10 is a schematic diagram of an MPD according to an embodiment of the present invention. The server describes the information of the knowledge layer segments in two independent description layers, and describes the information of the sequence layer segments in a description layer that depends on those two description layers. Specifically, the server may generate an MPD including three description layers according to the segment information of the knowledge layer segments and the segment information of the sequence layer segments. Description layer 1 (for example, the second description layer) is used to describe the sequence layer segment information, and description layer 2 (for example, the first description layer) and description layer 3 (for example, the third description layer) are two independent description layers used to describe the knowledge layer segment information. That is, the segments described in description layer 2 and description layer 3 may all be target knowledge layer segments (which target knowledge layer segments they contain may also be determined according to the actual application scenario, which is not limited herein). When the segment information of the sequence layer segments is described in description layer 1, each sequence layer segment corresponds to one PD. When the segment information of the knowledge layer segments is described in description layer 2 and description layer 3, the DD of each target knowledge layer segment may be described by one or more EDs of that target knowledge layer segment.

In addition, the LBVC code stream allows one sequence layer segment to depend on multiple knowledge layer segments simultaneously (set as the first target knowledge layer segment and the second target knowledge layer segment), that is, the dependent time periods of the first target knowledge layer segment and the second target knowledge layer segment may overlap each other. However, when the existing SVC-DASH standard describes segments in one description layer, it requires that the time periods of the segments do not overlap each other, so the knowledge layer segments cannot be described using only one description layer of the SVC-DASH standard. As shown in FIG. 10, to correctly describe the information of target knowledge layer segments whose dependent time periods overlap, the embodiment of the present invention uses multiple independent description layers to describe the information of the knowledge layer segments corresponding to the knowledge layer code stream, splits the discontinuous dependent time period of each target knowledge layer segment into a plurality of extended time periods, and describes the target knowledge layer segments in the plurality of description layers according to the extended time periods. For example, the DD of K1 can be described by ED1 in description layer 2 and ED7 in description layer 3. At the same time, it is ensured that the extended time periods of the target knowledge layer segments within each description layer do not overlap each other. For example, the server may describe K1, K2, and K3 in time series in one description layer (such as description layer 2), and may describe K2, K3, K4, and K1 in time series in another description layer (such as description layer 3), such that the extended time periods of the target knowledge layer segments described by each description layer do not overlap each other. In this way, the server can correctly describe the dependent time periods of the target knowledge layer segments in the MPD, and does not store the same target knowledge layer segment repeatedly when one sequence layer segment depends on multiple target knowledge layer segments simultaneously, thereby avoiding a waste of storage space.
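The constraint that extended time periods within one description layer must not overlap can be satisfied in more than one way. The sketch below uses a simple first-fit assignment, which is an illustrative technique chosen here and not the assignment shown in FIG. 10. The interval values are hypothetical, expressed as half-open ranges of playback period indices, and the function name `assign_layers` is likewise an assumption.

```python
def assign_layers(extended_periods):
    """First-fit: put each extended period in the first layer where it does not overlap.

    extended_periods: list of (segment, start, end) with half-open [start, end).
    Returns a list of layers, each a list of (segment, start, end).
    """
    layers = []
    for seg, start, end in sorted(extended_periods, key=lambda p: (p[1], p[2])):
        for layer in layers:
            # Two half-open intervals are disjoint if one ends before the other starts.
            if all(end <= s or start >= e for _, s, e in layer):
                layer.append((seg, start, end))
                break
        else:
            layers.append([(seg, start, end)])  # open a new description layer
    return layers


# Hypothetical extended periods of four target knowledge layer segments.
periods = [
    ("K1", 0, 3), ("K1", 5, 6),
    ("K2", 0, 1), ("K2", 3, 4),
    ("K3", 1, 2), ("K3", 4, 7),
    ("K4", 2, 4),
]
layers = assign_layers(periods)
for number, layer in enumerate(layers, start=1):
    print(number, layer)
```

With these values two independent description layers are enough, and the two extended time periods of one segment may land in the same layer or in different layers depending on what else is scheduled.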

In some feasible implementation manners, the extended time period information may specifically be an extended identifier used to mark the target knowledge layer segment (set as the first extended identifier) or an extended identifier used to mark the description layer (set as the second extended identifier). The first extended identifier or the second extended identifier may be used to determine detailed information such as the start time and the length of the extended time period. The extended time period indicates to the client that the target knowledge layer segment having the extended time period is used not only in the playback period of the currently processed sequence layer segment that depends on it, but also in the playback periods of other sequence layer segments that depend on it.

In some feasible implementation manners, the first extended identifier used to mark the target knowledge layer segment may be added to the segment information of the target knowledge layer segment as an attribute of the knowledge layer segment. Specifically, the first extended identifier may be added to a first segment description included in the first description layer (and the third description layer) of the MPD, where the first segment description is the description of the segment information of the target knowledge layer segment corresponding to the first time period. The first extended identifier may also be added to a second segment description included in the first description layer (and the third description layer) of the MPD, where the second segment description is the description of the segment information of the target knowledge layer segment corresponding to the second time period. The target knowledge layer segment carrying the first extended identifier has one or more extended time periods, and the first time period and the second time period each correspond to one extended time period. For details, refer to the foregoing description; they are not described herein again. When the client parses the segment information of a segment and obtains the first extended identifier, the segment may be determined as a target knowledge layer segment.

In some feasible implementation manners, the second extended identifier used to mark the description layer may be added to the description layer attribute information of the description layer as an attribute of the description layer. Specifically, a second extended identifier corresponding to the first time period and a second extended identifier corresponding to the second time period may be added to the description layer attribute information of the first description layer (and the third description layer). One, several, or all of the segments described in a description layer carrying the second extended identifier are target knowledge layer segments. Each target knowledge layer segment described by the description layer carrying the second extended identifier has one or more extended time periods, where the first time period and the second time period each correspond to one extended time period. For details, refer to the foregoing description; they are not described herein again. When the client parses the MPD of the code stream, it may obtain the description layer attribute information of each description layer in the MPD. If the description layer attribute information of a description layer includes the second extended identifier, the client may determine that the segments described by that description layer include a target knowledge layer segment. After the client determines that the segments of the description layer include a target knowledge layer segment, it may identify the target knowledge layer segment according to the specific segment information, or determine all the segments described by the description layer as target knowledge layer segments. This may be determined according to the actual application scenario and is not described in detail here.
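The two ways in which the client can recognize target knowledge layer segments can be sketched as follows. The dictionary layout of the parsed MPD, the key names `first_extended_identifier` and `second_extended_identifier`, and the numeric values are all hypothetical; the sketch also takes the simpler of the two options described above, treating every segment of a marked description layer as a target knowledge layer segment.

```python
def target_knowledge_segments(mpd):
    """Return the ids of segments the client should retain beyond their playback period."""
    targets = set()
    for layer in mpd["description_layers"]:
        # Second extended identifier: carried in the description layer attributes.
        layer_marked = "second_extended_identifier" in layer.get("attributes", {})
        for seg in layer["segments"]:
            # First extended identifier: carried in the segment information itself.
            if layer_marked or "first_extended_identifier" in seg:
                targets.add(seg["id"])
    return targets


# Hypothetical parsed MPD with three description layers.
mpd = {
    "description_layers": [
        {"id": 1, "attributes": {}, "segments": [{"id": "S1"}, {"id": "S2"}]},
        {"id": 2, "attributes": {},
         "segments": [{"id": "K1", "first_extended_identifier": 180}, {"id": "K3"}]},
        {"id": 3, "attributes": {"second_extended_identifier": 60},
         "segments": [{"id": "K2"}, {"id": "K4"}]},
    ]
}
retained = target_knowledge_segments(mpd)
print(sorted(retained))
```

In this sketch K1 is recognized through its own segment information, K2 and K4 through the attribute of description layer 3, and K3 is not retained because neither identifier applies to it.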

In a specific implementation, the server may add the extended period information of the knowledge layer segment according to any one of the foregoing implementation manners according to the requirements of the actual application scenario, and is not limited herein.

In some feasible implementation manners, the implementation manner that the server uses the extension identifier to mark the extension period information of the knowledge layer segment in the MPD may include any one of the following:

1) Method 1: The first extended identifier and the second extended identifier may be a first character string (for example, ExtDuration). In Method 1, the first extended identifier and the second extended identifier may be collectively referred to as the extended identifier. The server may add the first character string as an extended identifier on the basis of the syntax elements of the existing DASH standard, to describe that the target knowledge layer segment (including a target knowledge layer segment carrying the extended identifier, or a target knowledge layer segment described by a description layer carrying the extended identifier) has an extended period of fixed length. In a specific implementation, if all the knowledge layer segments described in the description layer (including the knowledge layer segments carrying the extended identifier) have consecutive extended periods of the same length, that is, the length of the extended period of the knowledge layer segments described in the description layer is fixed, the syntax element ExtDuration can be used to describe the extended period of the knowledge layer segment. FIG. 11 is a schematic diagram of adding an extended identifier based on syntax elements of the DASH standard. ExtDuration describes an extended period of fixed length, and ExtSegmentTimeLine describes an extended period of variable length. In a specific implementation, one of the two syntax elements may be selected according to the actual application scenario. The upper part of FIG. 11 is an Extensible Markup Language (XML) syntax table, and the lower part is an application example.

When the server describes the LBVC code stream using the description layer of the MPD, the syntax element ExtDuration is used to describe an extended period of fixed length. Specifically, the extended period of the target knowledge layer segment, including information such as the start time and the length of the extended period, may be acquired according to the syntax element ExtDuration and the segment information of the target knowledge layer segment. The length of the extended period of the target knowledge layer segment equals the value of ExtDuration. For example, ExtDuration=10s in FIG. 11 indicates that the extended period of the target knowledge layer segment has a length of 10 s. In addition, the syntax element Duration can also be used to describe a playing period of fixed length, which is not limited herein. When the server uses the syntax element ExtDuration to describe an extended period of fixed length, the start time of the extended period may be calculated from the total time length of all the knowledge layer segments before the target knowledge layer segment, and the number of knowledge layer segments before the target knowledge layer segment can be determined from the segment information of the target knowledge layer segment.
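The fixed-length calculation described above can be sketched as follows. This is only an illustrative sketch: the names `ExtendedPeriod` and `fixed_length_extended_period` are hypothetical, and it is assumed that the segment information yields a zero-based index of the target knowledge layer segment within its description layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtendedPeriod:
    start: float   # seconds from the start of the description layer timeline
    length: float  # seconds


def fixed_length_extended_period(ext_duration: float, segment_index: int) -> ExtendedPeriod:
    """Derive the extended period of the segment at `segment_index` (0-based).

    Assumption: every preceding knowledge layer segment has the same extended
    period length `ext_duration`, so their total length gives the start time.
    """
    if ext_duration <= 0 or segment_index < 0:
        raise ValueError("ext_duration must be positive and segment_index non-negative")
    return ExtendedPeriod(start=ext_duration * segment_index, length=ext_duration)
```

With ExtDuration=10s, the fourth segment (index 3) would start at 30 s and last 10 s under these assumptions.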

2) Method 2: The first extended identifier and the second extended identifier may be a second character string (for example, ExtSegmentTimeLine). In Method 2, the first extended identifier and the second extended identifier may be collectively referred to as the extended identifier. The server may add the second character string as an extended identifier on the basis of the syntax elements of the existing DASH standard, to describe that the target knowledge layer segment (including a target knowledge layer segment carrying the extended identifier, or a target knowledge layer segment described by a description layer carrying the extended identifier) has an extended period of variable length. In a specific implementation, if the knowledge layer segments described in the description layer (including the knowledge layer segments carrying the extended identifier) have discontinuous extended periods, or the extended periods of different target knowledge layer segments have different lengths, the syntax element ExtSegmentTimeLine may be used to describe the extended period of the target knowledge layer segment. As shown in FIG. 11, when the server describes the LBVC code stream using the description layer of the MPD, the syntax element ExtSegmentTimeLine is used to describe an extended period of variable length. Specifically, the extended period of the target knowledge layer segment, including information such as the start time and the length of the extended period, may be acquired according to the syntax element ExtSegmentTimeLine. The syntax element ExtSegmentTimeLine indicates that the information describing the start time and the length of the extended period is stored in the MPD. In a specific implementation, the representation of the start time and the length of the extended period may be determined by the codec standard used by the server, or according to the actual application scenario, and is not limited herein. In addition, the server can also use the syntax element SegmentTimeLine to describe a playing period of variable length, which is not limited herein.
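As a rough illustration of the variable-length case, the following sketch reads start times and lengths from an XML fragment. The child element `S` and its attributes `t` (start) and `d` (length) are assumptions borrowed from the shape of DASH's SegmentTimeline; the source does not specify how ExtSegmentTimeLine stores this information, so the fragment and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical fragment: element "S" with attributes "t" and "d" is an
# assumption modelled on DASH's SegmentTimeline, not taken from the source.
SAMPLE = """
<ExtSegmentTimeLine>
  <S t="0" d="10"/>
  <S t="30" d="20"/>
</ExtSegmentTimeLine>
"""


def parse_ext_segment_timeline(xml_text: str) -> List[Tuple[float, float]]:
    """Return one (start, length) pair per extended period, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [(float(s.get("t")), float(s.get("d"))) for s in root.findall("S")]
```

Here the two extended periods are discontinuous and of different lengths, which is the situation in which the fixed-length ExtDuration would not apply.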

3) Method 3: The server may add a syntax element Extended as an extended identifier (that is, a second extended identifier) for marking the description layer, on the basis of the syntax elements of the existing DASH standard. The extended identifier is used to identify that at least one of the segments described in the description layer is a target knowledge layer segment. FIG. 12 is another schematic diagram of adding an extended identifier based on syntax elements of the DASH standard. The upper part is the XML syntax table, and the lower part is an application example. In a specific implementation, Extended being true indicates that the description layer carrying the extended identifier describes an extended period of at least one target knowledge layer segment; Extended being not true indicates that the description layer carrying the extended identifier does not describe an extended period of a knowledge layer segment. That is, the server may use the syntax element Extended=True to identify that the description layer describes information of at least one target knowledge layer segment, and use the syntax element Extended=False, or the default value of the syntax element, to identify that the description layer describes only information of sequence layer segments and no information of knowledge layer segments. The extended period of the target knowledge layer segment described by the description layer carrying Extended=True is obtained from the syntax element Extended=True and the segment information of the target knowledge layer segment. Specifically, the extended period of the target knowledge layer segment is the same as the playing period of the segment in the description layer carrying Extended=True.

4) Method 4: The server may add a syntax element ExtSegment as an extended identifier (that is, a second extended identifier) for marking the description layer, on the basis of the syntax elements of the existing DASH standard. The extended identifier is used to identify that at least one of the segments described in the description layer is a target knowledge layer segment. FIG. 13 is another schematic diagram of adding an extended identifier based on syntax elements of the DASH standard. The upper part is an XML syntax table, and the lower part is an application example. In a specific implementation, ExtSegment being true indicates that the description layer carrying the extended identifier describes an extended period of at least one target knowledge layer segment; ExtSegment being not true indicates that the description layer carrying the extended identifier does not describe an extended period of a knowledge layer segment. That is, the server uses the syntax element ExtSegment=True to identify that a segment (or a group of segments) described by the description layer is a target knowledge layer segment, and uses the syntax element ExtSegment=False, or the default value of the syntax element, to identify that a segment (or a group of segments) described by the description layer is a sequence layer segment. The extended period of the target knowledge layer segment described by the description layer carrying ExtSegment=True is obtained from the syntax element ExtSegment=True and the segment information of the target knowledge layer segment. Specifically, the extended period of the target knowledge layer segment is the same as the playing period of the segment carrying ExtSegment=True.

Further, in some feasible implementation manners, when the server generates the MPD of the LBVC code stream, the extended identifier may be added by using any one of the foregoing four implementation manners, and the number of independent description layers used to describe the knowledge layer segments may be determined according to the characteristics of the extended periods.

In a specific implementation, the dependent time period of the target knowledge layer segment is composed of at least one extended period, each extended period is one continuous time period, and the extended periods included in the dependent time period fall into the following two cases:

Case 1: The dependent period of the target knowledge layer segment contains only one extended period, which corresponds to at least two playing periods in the sequence layer, such as the knowledge layer segment K4 in FIG. 9 described above. At this time, the dependent period of the knowledge layer segment K4 can be described by an extended period in the MPD.

Case 2: The dependent time period of the target knowledge layer segment contains at least two extended periods, such as the knowledge layer segments K1, K2 and K3 in FIG. 9 above. In this case, the dependent time period of the target knowledge layer segment may be split into multiple extended periods, and the multiple continuous time periods of the target knowledge layer segment are described separately in the MPD. For example, the dependent period DD1 (including DD1-1 and DD1-2) of the knowledge layer segment K1 in FIG. 10 above is described by the extended period ED1 and the extended period ED7.

Further, in some feasible implementation manners, the dependent period of the target knowledge layer segment may include at least two extended periods. For example, for the knowledge layer segment K3 in FIG. 9 above, the dependent period DD3 of K3 includes the extended period ED3 and the extended period ED5. The playing period of the sequence layer segment corresponding to ED5 is PD2, and the playing periods of the sequence layer segments corresponding to ED3 are PD5, PD6, PD7 and PD8, that is, DD3=PD2+PD5+PD6+PD7+PD8. The server can modify the dependent period of K3 to DD3=PD2+PD3+PD4+PD5+PD6+PD7+PD8, so that K3 can be described in the MPD as one continuous segment covering PD2 to PD8, which simplifies the description of the MPD and reduces the difficulty of implementing the extension of the DASH standard.
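The simplification of DD3 can be illustrated with a short sketch. It assumes that playing periods are identified by consecutive integer indices (PD2 is index 2, and so on); the function name is hypothetical.

```python
from typing import Iterable, List


def contiguous_span(play_period_indices: Iterable[int]) -> List[int]:
    """Replace a set of playing period indices by the continuous range covering them.

    Gaps between the smallest and largest index are filled in, so the dependent
    period can be described as a single continuous extended period.
    """
    indices = list(play_period_indices)
    if not indices:
        raise ValueError("at least one playing period is required")
    return list(range(min(indices), max(indices) + 1))
```

For the K3 example, `contiguous_span({2, 5, 6, 7, 8})` yields the indices 2 through 8, matching DD3=PD2+PD3+PD4+PD5+PD6+PD7+PD8. The trade-off is that K3 is then kept available during PD3 and PD4 even though no sequence layer segment depends on it there.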

In some feasible implementation manners, if the extended periods of at least two target knowledge layer segments overlap each other (that is, one sequence layer segment depends on multiple target knowledge layer segments at the same time), as for the knowledge layer segments K1, K2, K3 and K4 in FIG. 9, the information of knowledge layer segments whose extended periods do not overlap each other is described in the same description layer as far as possible, so that the number of description layers finally used is as small as possible.

In a possible implementation manner, if the extended periods of the target knowledge layer segments do not overlap each other, that is, each sequence layer segment depends on only one target knowledge layer segment (for example, if only the knowledge layer segments K3 and K4 of the above figure exist), a single description layer is sufficient to describe the information of the target knowledge layer segments.

In another possible implementation, the information of the knowledge layer segments may be described using M description layers, where M is the maximum number of knowledge layer segments that any one sequence layer segment depends on. For the sequence layer segment i, the extended period of each of the Mi knowledge layer segments it depends on is set to the playing period of that sequence layer segment, and the Mi knowledge layer segments are distributed over any Mi of the M description layers, for example the 1st to the Mi-th description layers, or the 2nd to the (Mi+1)-th description layers (if Mi+1<=M), or the 1st and the 3rd to the (Mi+1)-th description layers (if Mi+1<=M). This can be determined according to the actual application scenario and is not limited herein.
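One way to keep the number of description layers small is the standard greedy interval-partitioning technique, sketched below. This is not presented as the method of the embodiment; it is a common algorithm that happens to use exactly as many layers as the maximum number of simultaneously overlapping extended periods. Extended periods are assumed to be half-open intervals [start, end), and the function name is hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Period = Tuple[float, float, str]  # (start, end, segment id), half-open [start, end)


def assign_description_layers(periods: List[Period]) -> Dict[int, List[Period]]:
    """Greedily place extended periods so that no two in one layer overlap."""
    layers: Dict[int, List[Period]] = {}
    free_at: List[Tuple[float, int]] = []  # min-heap of (end time, layer index)
    for start, end, seg in sorted(periods):
        if free_at and free_at[0][0] <= start:
            # Reuse the layer whose last period ended earliest.
            _, layer = heapq.heappop(free_at)
        else:
            # Every existing layer is still busy: open a new one.
            layer = len(layers)
            layers[layer] = []
        layers[layer].append((start, end, seg))
        heapq.heappush(free_at, (end, layer))
    return layers
```

When no extended periods overlap, the sketch produces a single layer, which corresponds to the single-description-layer case described above.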

S105. Send the MPD of the code stream to the client.

In some feasible implementation manners, after the server generates the MPD of the LBVC code stream, the knowledge layer segments (including the target knowledge layer segment), the sequence layer segments, and the MPD may be stored at the specified network addresses. Further, the server can wait to receive a request sent by the client. When the server receives the request sent by the client, the MPD of the above code stream can be sent to the client. When receiving an HTTP request sent by the client, the server may also send the corresponding sequence layer segment or knowledge layer segment to the client through HTTP according to the requested network storage address. For details, refer to the implementation manners provided in the DASH standard, and details are not described herein again.

S106. The client parses the MPD of the code stream sent by the server, and determines extended time period information carried in the MPD.

In a specific implementation, because the server of the embodiment of the present invention extends the syntax elements of the DASH standard and adds the extended period information to the MPD of the code stream, an existing DASH-compliant client (hereinafter referred to as the client) cannot parse and obtain the extended period. Therefore, the embodiment of the present invention adds to the client a mechanism for identifying the extended syntax elements, so that the client can parse and acquire the extended period of the target knowledge layer segment, and can use the playing period and the extended period to distinguish sequence layer segments from target knowledge layer segments.

In the embodiment of the present invention, a knowledge layer segment request analysis mechanism is added to the client. Whether multiple knowledge layer segments described in the MPD are the same target knowledge layer segment is determined according to whether their network storage address information is the same, so that the storage space can be checked. Specifically, the storage state of the target knowledge layer segment may be checked in the storage device used for storing the knowledge base, to determine whether the target knowledge layer segment needs to be requested and downloaded from the server. This avoids repeatedly downloading the same segment and saves transmission bandwidth. Further, the embodiment of the present invention adds a knowledge storage management mechanism to the client to manage the storage of the target knowledge layer segment, ensuring that the target knowledge layer segment is stored in the client during the time period in which it is depended on. At the same time, a storage list of the knowledge layer segments is constructed to record the storage state of the target knowledge layer segments, so that the client can check the storage state of the target knowledge layer segments.

In some feasible implementation manners, because the characteristics of the sequence layer segments of the LBVC code stream are similar to those of the enhancement layer segments in SVC-DASH, the client may request the sequence layer segments from the server according to the existing DASH standard, and details are not described herein again. For the target knowledge layer segment, after obtaining the type of playback media selected by the user, the client may first request the MPD of the LBVC code stream from the server and parse the MPD sent by the server. By parsing the MPD, the client may obtain the extended period information included in the MPD, where the extended period information may include a first extended identifier for marking the target knowledge layer segment, or a second extended identifier for marking the description layer. The target knowledge layer segment carrying the first extended identifier is included in one or more description layers, and the description layer carrying the second extended identifier describes one or more target knowledge layer segments.

S107. Determine a target knowledge layer segment according to the extended time period information, and determine a dependent time period of the target knowledge layer segment.

In some feasible implementation manners, after the client obtains the foregoing first extended identifier or second extended identifier, the segment carrying the first extended identifier may be determined as a target knowledge layer segment, or one or more segments described by the description layer carrying the second extended identifier may be determined as target knowledge layer segments, and the information of the extended period of the target knowledge layer segment is acquired. The information of the extended period includes the start time of the extended period, the length of the extended period, and the like. For details, refer to the implementation manner described for the server, and details are not described herein again.

In some feasible implementation manners, after the client determines the target knowledge layer segment according to the MPD sent by the server, the one or more extended periods of the target knowledge layer segment may be further determined. The target knowledge layer segment is one or more segments in the MPD that carry the first extended identifier, or one or more segments described by a description layer that carries the second extended identifier. In a specific implementation, the client parses, according to the MPD sent by the server, the information of the segment marked by the extended identifier (the first extended identifier or the second extended identifier), and obtains the extended period of the segment according to the extended identifier, in the manner corresponding to the server's description of the extended identifier. The client may obtain the information of the extended period of the target knowledge layer segment in any one of the following four ways:

1) Method 1: When the extended identifier (the first extended identifier or the second extended identifier) adopted by the server is the first character string (for example, ExtDuration), the client may parse the MPD sent by the server and identify ExtDuration in it. Further, the value of ExtDuration may be determined as the length of the extended period of the target knowledge layer segment, and the start time of the extended period of the target knowledge layer segment may be determined as the end time of the total time length of all the knowledge layer segments before the target knowledge layer segment. For the specific implementation of determining the start time and the length of the extended period according to the first character string, refer to the implementation manner described for the server, and details are not described herein again.

2) Method 2: When the extended identifier (the first extended identifier or the second extended identifier) adopted by the server is the second character string (for example, ExtSegmentTimeLine), the client may parse the MPD sent by the server and identify ExtSegmentTimeLine in it. Further, information such as the start time and the length of the extended period of the target knowledge layer segment may be identified and acquired according to the adopted decoding standard. For the specific implementation of determining the start time and the length of the extended period according to the second character string, refer to the implementation manner described for the server, and details are not described herein again.

3) Method 3: When the extended identifier used by the server is the syntax element Extended, the client may parse the MPD sent by the server and identify the syntax element Extended. When Extended is true (that is, Extended=True), the playing period of the segment carrying the extended identifier, or the playing period of the target knowledge layer segment described by the description layer carrying the extended identifier, may be calculated, and the calculated playing period is determined as the extended period of the target knowledge layer segment.

4) Method 4: When the extended identifier used by the server is the syntax element ExtSegment, the client may parse the MPD sent by the server and identify the syntax element ExtSegment. When ExtSegment is true (that is, ExtSegment=True), the playing period of the segment carrying the extended identifier, or the playing period of the at least one knowledge layer segment described by the description layer carrying the extended identifier, may be calculated, and the calculated playing period is determined as the extended period of the target knowledge layer segment.
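The flag-based cases (Methods 3 and 4) can be sketched as a small client-side check. The sketch assumes the client has already collected the attributes of a description layer or segment into a mapping and has computed its playing period as a (start, length) pair; the function name and this representation are hypothetical.

```python
from typing import Mapping, Optional, Tuple

Period = Tuple[float, float]  # (start, length) in seconds


def extended_period_from_flag(attrs: Mapping[str, str], play_period: Period) -> Optional[Period]:
    """Return the extended period if Extended or ExtSegment is true, else None.

    Under Methods 3 and 4 the extended period equals the playing period.
    A missing attribute is treated as the default, i.e. not true.
    """
    for name in ("Extended", "ExtSegment"):
        if str(attrs.get(name, "False")).strip().lower() == "true":
            return play_period
    return None
```

A description layer without either attribute is treated as describing only sequence layer segments, so no extended period is returned for it.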

In a specific implementation, the client may determine the target knowledge layer segment and the dependent time period of the target knowledge layer segment according to the extended time period information. For the determination of the foregoing target knowledge layer segment and its dependent time period, refer to the corresponding implementation manner of the foregoing server, and details are not described herein again.

S108. Acquire a network storage address of the target knowledge layer segment from the MPD of the code stream, and record a dependent time period and a network storage address of the target knowledge layer segment.

In some feasible implementation manners, the client may further determine, according to the foregoing MPD, the network storage address corresponding to each extended period, and determine the target knowledge layer segment to which each extended period belongs according to the network storage address corresponding to that extended period. Further, the client may determine the storage state of the target knowledge layer segment in the storage device of the client. In a specific implementation, the network storage address of each extended period may be determined according to the storage address described by the server in the MPD, and details are not described herein again. The storage state of each knowledge layer segment may be determined according to the data stored in the storage device of the client, and details are not described herein again.

In some feasible implementation manners, the client may determine a dependent time period of each target knowledge layer segment according to a network storage address of each extended time period. The extended time period in which the network storage addresses are the same may be determined as the extended time period of the same target knowledge layer segment, and the set of the extended time period of the same target knowledge layer segment may be determined as the dependent time period of the target knowledge layer segment.
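The grouping rule above, under which extended periods with the same network storage address belong to the same target knowledge layer segment, can be sketched as follows. The tuple layout and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Period = Tuple[float, float]  # (start, length) in seconds


def dependent_periods_by_address(
    extended_periods: Iterable[Tuple[str, float, float]]
) -> Dict[str, List[Period]]:
    """Group (address, start, length) entries by network storage address.

    The list of extended periods collected for one address is the dependent
    time period of the knowledge layer segment stored at that address.
    """
    grouped: Dict[str, List[Period]] = defaultdict(list)
    for address, start, length in extended_periods:
        grouped[address].append((start, length))
    return {address: sorted(periods) for address, periods in grouped.items()}
```

Entries parsed from different description layers can be passed in together, since only the address decides which segment an extended period belongs to.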

In some feasible implementation manners, after the client determines the dependent time period of the target knowledge layer segment and the extended periods included in the dependent time period, one or more knowledge layer segment lists may be constructed, and information such as the network storage address, the storage state, and the dependent time period of each target knowledge layer segment is recorded in the knowledge layer segment list. The storage state of the target knowledge layer segment may be represented by a storage status flag. Specifically, if the MPD uses multiple description layers to describe the information of the target knowledge layer segments, the information of the target knowledge layer segments in all of those description layers is recorded in the knowledge layer segment list. FIG. 17 is a schematic diagram of a knowledge layer segment list. The client may record, in two knowledge layer segment lists (knowledge layer segment list-1 and knowledge layer segment list-2), the network storage addresses of all target knowledge layer segments, the start time of each extended period, and the length (that is, the duration) of each extended period. Each list in FIG. 17 records the information of the knowledge layer segments described in a description layer of FIG. 10, and knowledge layer segments with the same network storage address are the same knowledge layer segment. FIG. 18 is another schematic diagram of a knowledge layer segment list. The client may also record, in one knowledge layer segment list, the network storage addresses of all knowledge layer segments, the start time of each extended period, and the length (that is, the duration) of each extended period. That knowledge layer segment list records the information of all the knowledge layer segments described in FIG. 10, and knowledge layer segments with the same network storage address are the same knowledge layer segment. K1, K2, and K3 may be taken as the target knowledge layer segments provided by the embodiment of the present invention, and K4 is a knowledge layer segment on which two consecutive sequence layer segments depend.

Further, the client may construct another knowledge layer segment list (which may be named as a knowledge layer segment storage list), and record the network storage address, the storage state, and the dependent time period of the knowledge layer segment through the knowledge layer segment storage list. Wherein, the above-mentioned dependent time period is a set of extended time periods having the same network storage address. FIG. 19 is another schematic diagram of a knowledge layer segment list. The information of each knowledge layer segment in the knowledge layer segment storage list is determined by the information recorded in FIG. 17 above, and the initial storage state of each knowledge layer segment can be set to be False. The storage state of each knowledge layer segment may be changed in real time according to the state of the knowledge layer segment acquired in the actual application process to better determine whether a certain knowledge layer segment needs to be downloaded again in a certain time period.
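A knowledge layer segment storage list of this kind might be represented as below. This is only a sketch: the record type `StorageRecord`, its field names, and the use of a dictionary keyed by network storage address are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Period = Tuple[float, float]  # (start, length) in seconds


@dataclass
class StorageRecord:
    address: str                    # network storage address of the segment
    dependent_period: List[Period]  # extended periods sharing this address
    stored: bool = False            # initial storage state is False


def build_storage_list(dependent_periods: Dict[str, List[Period]]) -> Dict[str, StorageRecord]:
    """Create one record per knowledge layer segment, initially not stored."""
    return {
        address: StorageRecord(address, list(periods))
        for address, periods in dependent_periods.items()
    }
```

The `stored` flag is the only field expected to change during playback; it would be set to True after a download and back to False after a deletion.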

S109. When a video-on-demand request is obtained, determine whether the on-demand time carried in the video-on-demand request is included in the dependent time period of the target knowledge layer segment. If the determination result is yes, step S110 is performed.

In some feasible implementation manners, the client may obtain a video operation request that is triggered when the user plays the video on demand, obtain the on-demand time carried in the video operation request, and then determine, by table lookup, whether the dependent time period of the target knowledge layer segment includes the on-demand time. Specifically, the client may determine by table lookup whether an extended period covering the on-demand time exists among the extended periods included in the dependent time periods of all the knowledge layer segments, and if so, may determine from that extended period the knowledge layer segment whose dependent time period covers the on-demand time. That is, the client may search, based on the on-demand time, all the knowledge layer segments included in the knowledge layer segment storage list for the knowledge layer segment (referred to as the second knowledge layer segment) whose dependent time period covers the on-demand time. A dependent time period covers the on-demand time when the start time of one of the extended periods included in the dependent time period is before, or equal to, the on-demand time, and the end time of that extended period is after, or equal to, the on-demand time. Whether the boundary instants are included may be determined according to the actual application scenario, and is not limited herein. It should be noted that the on-demand time may not depend on any knowledge layer segment, that is, the sequence layer segment corresponding to the on-demand time does not depend on a knowledge layer segment. In this case, none of the extended periods included in the dependent time periods of all the knowledge layer segments covers the on-demand time, and no knowledge layer segment needs to be requested.
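The table lookup can be sketched as a scan over the dependent time periods. The sketch treats both boundaries as inclusive, which is one of the options the text leaves open; the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple

Period = Tuple[float, float]  # (start, length) in seconds


def segments_covering(dependent_periods: Dict[str, List[Period]], on_demand_time: float) -> List[str]:
    """Return the addresses of all segments whose dependent period covers the time.

    Assumption: an extended period covers the time when
    start <= on_demand_time <= start + length (inclusive boundaries).
    """
    return [
        address
        for address, periods in dependent_periods.items()
        if any(start <= on_demand_time <= start + length for start, length in periods)
    ]
```

An empty result corresponds to the case in which the sequence layer segment at the on-demand time depends on no knowledge layer segment, so nothing needs to be requested.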

S110. View a storage state of the target knowledge layer segment in a storage space of the client, and determine an acquisition manner of the target knowledge layer segment according to the storage state.

In some feasible implementation manners, after the client determines the second knowledge layer segment by table lookup, the client may also determine the storage state of each knowledge layer segment in the client according to the storage flag of each knowledge layer segment recorded in the knowledge layer segment storage list. Further, the client may determine the manner of acquiring the second knowledge layer segment according to its storage state. Specifically, if the storage state of the second knowledge layer segment recorded in the knowledge layer segment storage list is empty (that is, the storage state flag of the second knowledge layer segment is False), the client may send to the server a request for acquiring the second knowledge layer segment. After receiving the request sent by the client, the server may send the data of the second knowledge layer segment to the client. The client may receive the second knowledge layer segment sent by the server, and may change the storage flag of the second knowledge layer segment in the knowledge layer segment storage list to not empty (that is, to True). If the storage state of the second knowledge layer segment recorded in the knowledge layer segment storage list is not empty (that is, the storage state flag of the second knowledge layer segment is True), the client may obtain the second knowledge layer segment directly from its storage space without sending an acquisition request to the server, thereby avoiding repeated downloading of the same knowledge layer segment and saving transmission bandwidth.
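The choice between downloading and reusing a stored segment can be sketched as follows. The `download` callable stands in for the HTTP request to the server and is supplied by the caller; it and the other names are hypothetical.

```python
from typing import Callable, Dict


def obtain_segment(
    address: str,
    stored_flags: Dict[str, bool],
    storage: Dict[str, bytes],
    download: Callable[[str], bytes],
) -> bytes:
    """Return the segment data, downloading only if the storage flag is False."""
    if stored_flags.get(address, False):
        # Flag is True: read from local storage, no request is sent.
        return storage[address]
    data = download(address)
    storage[address] = data
    stored_flags[address] = True  # mark as stored for later requests
    return data
```

Calling the function twice for the same address triggers only one download, which is the bandwidth saving the paragraph above describes.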

In some feasible implementations, after the client requests and acquires the second knowledge layer segment from the server, the storage state of the knowledge layer segments in the storage space may be adjusted according to the size of the storage device that stores knowledge layer segments. Specifically, if the remaining space of that storage device in the client is greater than or equal to the data size of the second knowledge layer segment, the client stores the second knowledge layer segment in the storage device and changes its storage state to not empty. If the remaining space is smaller than the data size of the second knowledge layer segment, the client may delete one or more other knowledge layer segments stored in the storage device (that is, delete specified target knowledge layer segments) until the remaining space is greater than or equal to the data size of the second knowledge layer segment, and then store the second knowledge layer segment in the storage device. Further, in the knowledge layer segment storage list, the client may change the storage mark of each deleted knowledge layer segment to empty and change the storage mark of the second knowledge layer segment to not empty, so that the client can determine how to acquire each knowledge layer segment in subsequent operations.

In a specific implementation, the time interval between the dependent time period of the specified target knowledge layer segment and the on-demand time is greater than a preset time threshold. Specifically, when deleting a knowledge layer segment from the storage space, the client may select, according to the dependent time periods of the knowledge layer segments, a knowledge layer segment whose dependent time period lies entirely before the current on-demand time, that is, a knowledge layer segment that will no longer be used, and delete it. Alternatively, the client may select the knowledge layer segment whose next extended period is farthest from the current on-demand time, that is, the knowledge layer segment that has the longest wait before its next use, and delete it.
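One possible reading of this deletion rule is sketched below. The names `StoredSegment`, `next_use_distance`, and `make_room` are hypothetical, periods are assumed to be `(start, end)` pairs in seconds of media time, and playback is assumed to move forward, so a segment whose periods have all ended is treated as never needed again.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Period = Tuple[float, float]  # (start, end) in seconds of media time


@dataclass
class StoredSegment:
    size: int               # data size in bytes
    periods: List[Period]   # extended periods forming the dependent time period
    stored: bool = True     # storage mark


def next_use_distance(seg: StoredSegment, now: float) -> float:
    """Time until the segment is needed again; infinity if it never is."""
    waits = [max(start - now, 0.0) for start, end in seg.periods if end > now]
    return min(waits) if waits else float("inf")


def make_room(segments: Dict[str, StoredSegment], capacity: int,
              needed: int, now: float, threshold: float) -> List[str]:
    """Mark segments as deleted until `needed` bytes fit; return their ids.

    Only segments whose next use is more than `threshold` away are candidates,
    and the one with the longest wait is deleted first. The caller should still
    check the free space afterwards, since candidates may run out.
    """
    used = sum(s.size for s in segments.values() if s.stored)
    candidates = sorted(
        (sid for sid, s in segments.items()
         if s.stored and next_use_distance(s, now) > threshold),
        key=lambda sid: next_use_distance(segments[sid], now),
        reverse=True,
    )
    evicted: List[str] = []
    for sid in candidates:
        if capacity - used >= needed:
            break
        segments[sid].stored = False  # storage mark changed to empty
        used -= segments[sid].size
        evicted.append(sid)
    return evicted
```

A segment whose extended period covers the current on-demand time has a distance of zero and is therefore never selected for deletion under this sketch.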

In the embodiment of the present invention, the server may add extended period information to the MPD of the code stream to mark information such as the extended periods of the target knowledge layer segment. By parsing the MPD of the code stream, the client may obtain the extended period information included in the MPD, determine the dependent time period of the target knowledge layer segment, and record both the dependent time period and a storage state flag indicating the storage state of the target knowledge layer segment in the client. Further, when receiving a video-on-demand request from the user, the client may search, according to the on-demand time carried in the request, for the extended period that includes the on-demand time, and then determine the target knowledge layer segment corresponding to that extended period and its storage state flag. According to the storage state of the target knowledge layer segment, the client may determine whether to request the target knowledge layer segment from the server or to obtain it from the local storage space, thereby avoiding repeated loading and storage of the same knowledge layer segment, saving data transmission bandwidth, and improving the processing efficiency of the code stream data.

FIG. 20 is a schematic structural diagram of a video data processing apparatus according to an embodiment of the present invention. The processing device provided by the embodiment of the present invention includes:

The acquiring unit 201 is configured to acquire segment information of each sequence layer segment among all sequence layer segments in the code stream, where the segment information is used to describe a dependency relationship between the sequence layer segment and the knowledge layer segment in the code stream.

The determining unit 202 is configured to determine N sequence layer segments and a first target knowledge layer segment according to the segment information of each sequence layer segment acquired by the acquiring unit 201, where the N sequence layer segments depend on the first target knowledge layer segment, the N sequence layer segments include at least two discontinuous sequence layer segments, and the first target knowledge layer segment is one of at least one knowledge layer segment included in the code stream.

The acquiring unit 201 is further configured to acquire segment information of the first target knowledge layer segment determined by the determining unit 202.

The adding unit 203 is configured to add extended period information of the first target knowledge layer segment to the media presentation description (MPD) of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments acquired by the acquiring unit 201, the N sequence layer segments being encoded within a period indicated by the extended period information.

The sending unit 204 is configured to send the MPD of the code stream processed by the adding unit 203 to the client.

In some possible implementations, the determining unit 202 is specifically configured to:

In some possible implementations, the N sequence layer segments include at least two groups of sequence layer segments, and the at least two groups include at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The MPD of the code stream includes at least two description layers, where a first description layer of the at least two description layers describes the first target knowledge layer segment, and a second description layer describes the sequence layer segments;

The adding unit 203 is specifically configured to:

In some possible implementations, the first extended period information and the second extended period information are both first extended identifiers;

The adding unit 203 is specifically configured to:

In some possible implementations, the first extended period information and the second extended period information are both second extended identifiers;

The adding unit 203 is specifically configured to:

In some possible implementations, if the first sequence layer segment group further depends on a second target knowledge layer segment, the MPD of the code stream further includes a third description layer, where the third description layer describes the second target knowledge layer segment.

In some possible implementations, the adding unit 203 is further configured to:

In a specific implementation, the processing device may be the server provided by the embodiment of the present invention or a function module in the server, and may use its built-in units to perform the server-side implementations of each step of the foregoing video data processing method. Details are not described here again.

In the embodiment of the present invention, the server may determine a knowledge layer segment on which at least two discontinuous sequence layer segments depend as the target knowledge layer segment, and add extended period information to the MPD of the code stream to mark information such as the extended periods of the target knowledge layer segment. This allows the client to distinguish target knowledge layer segments from non-target knowledge layer segments, thereby avoiding repeated loading and transmission of the target knowledge layer segment, saving data transmission bandwidth, and enhancing the applicability of video data processing.
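The server-side selection summarized above can be illustrated with a short sketch. It assumes, purely for illustration, that each sequence layer segment lists the identifiers of the knowledge layer segments it depends on and that all sequence layer segments have the same duration; the function name `find_target_segments` and the `(start, end)` period representation are hypothetical, and the actual MPD attribute syntax is not shown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Period = Tuple[float, float]  # (start, end) in seconds of media time


def find_target_segments(seq_deps: List[List[str]],
                         seg_duration: float) -> Dict[str, List[Period]]:
    """Map each target knowledge layer segment id to its extended periods.

    seq_deps[i] holds the ids of the knowledge layer segments that sequence
    layer segment i depends on (an empty list means no dependency).
    """
    runs: Dict[str, List[List[int]]] = defaultdict(list)
    for index, deps in enumerate(seq_deps):
        for kid in deps:
            if runs[kid] and runs[kid][-1][1] >= index - 1:
                runs[kid][-1][1] = index          # extend the consecutive group
            else:
                runs[kid].append([index, index])  # start a new group
    # A knowledge layer segment used by at least two discontinuous groups is a
    # target segment; each group becomes one extended period.
    return {
        kid: [(first * seg_duration, (last + 1) * seg_duration)
              for first, last in groups]
        for kid, groups in runs.items()
        if len(groups) >= 2
    }
```

In this sketch a knowledge layer segment that is referenced by only one consecutive group of sequence layer segments is left out, since it would be a non-target knowledge layer segment.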

FIG. 21 is another schematic structural diagram of a video data processing apparatus according to an embodiment of the present invention. The processing device provided by the embodiment of the present invention includes:

The parsing unit 211 is configured to parse a media presentation description (MPD) of the code stream sent by the server, and determine extended period information carried in the MPD, where the extended period information is used to determine a target knowledge layer segment included in the code stream, the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, and N sequence layer segments in the code stream depend on the target knowledge layer segment.

The determining unit 212 is configured to determine a target knowledge layer segment according to the extended period information acquired by the parsing unit 211, and determine a dependent time period of the target knowledge layer segment, where the N sequence layer segments are encoded within the dependent time period of the target knowledge layer segment.

a recording unit 213, configured to acquire a network storage address of the target knowledge layer segment from the MPD of the code stream parsed by the parsing unit 211, and record a dependent time period of the target knowledge layer segment determined by the determining unit 212 And the network storage address.

The judging unit 214 is configured to determine, when a video-on-demand request is obtained, whether the on-demand time carried in the video-on-demand request falls within the dependent time period of the target knowledge layer segment recorded by the recording unit 213.

The obtaining unit 215 is configured to: when the determination result of the judging unit 214 is yes, view a storage state of the target knowledge layer segment in a storage space of the client, and determine, according to the storage state, how to acquire the target knowledge layer segment.

In some possible implementations, the first extended period information and the second extended period information are first extended identifiers;

The parsing unit is specifically configured to:

The determining unit is specifically configured to:

In some possible implementations, the first extended period information and the second extended period information are second extended identifiers;

The parsing unit is specifically configured to:

The determining unit is specifically configured to:

In some possible implementations, the determining unit is specifically configured to:

Determining a first extended period of the target knowledge layer segment according to the first extended period information, and determining a second extended period of the target knowledge layer segment according to the second extended period information;

In some possible implementations, the recording unit is specifically configured to:

The recording unit is further configured to:

The obtaining unit is specifically configured to:

In some possible implementations, the obtaining unit is further configured to:

Receiving the target knowledge layer segment sent by the server;

In a specific implementation, the foregoing processing device may be the client provided by the embodiment of the present invention or a function module in the client, and may use its built-in units to perform the client-side implementations of each step of the foregoing video data processing method. Details are not described here again.

In the embodiment of the present invention, by parsing the MPD of the code stream, the client may obtain the extended period information included in the MPD, determine the dependent time period of the target knowledge layer segment, and record both the dependent time period and a storage state flag indicating the storage state of the target knowledge layer segment in the client. Further, when receiving a video-on-demand request from the user, the client may search, according to the on-demand time carried in the request, for the extended period that includes the on-demand time, and then determine the target knowledge layer segment corresponding to that extended period and its storage state flag. According to the storage state of the target knowledge layer segment, the client may determine whether to request the target knowledge layer segment from the server or to obtain it from the local storage space, thereby avoiding repeated loading and storage of the same knowledge layer segment, saving data transmission bandwidth, and improving the processing efficiency of the code stream data.
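The lookup by on-demand time can be pictured with the following sketch. It assumes the client has already recorded, for each target knowledge layer segment, the extended periods whose union forms the dependent time period; the table layout, the half-open `[start, end)` interval convention, and the function name `segments_for_time` are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Period = Tuple[float, float]  # (start, end) in seconds of media time


def segments_for_time(table: Dict[str, List[Period]], on_demand_time: float) -> List[str]:
    """Return the network storage addresses of the knowledge layer segments
    whose dependent time period covers the on-demand time.

    `table` maps a network storage address to the extended periods recorded
    for that segment; each period is treated as half-open: [start, end).
    """
    return [
        url
        for url, periods in table.items()
        if any(start <= on_demand_time < end for start, end in periods)
    ]
```

An empty result corresponds to the case in which the on-demand time does not depend on any knowledge layer segment, so nothing needs to be requested.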

FIG. 22 is a schematic structural diagram of a server according to an embodiment of the present invention. The server provided by the embodiment of the present invention may include: a memory 221 and a processor 222, where the memory 221 is connected to the processor 222;

The memory 221 is for storing a set of program codes.

The processor 222 is configured to call the program code stored in the memory 221 to perform the following operations:

Obtaining segment information of each sequence layer segment in all sequence layer segments in the code stream, the segment information being used to describe a dependency relationship between the sequence layer segment and the knowledge layer segment in the code stream;

Obtaining segment information of the first target knowledge layer segment;

Sending the MPD of the code stream to the client.

In some possible implementations, the processor 222 is specifically configured to:

The processor 222 is specifically configured to:

In some possible implementations, the processor 222 is further configured to:

In a specific implementation, the server may perform an implementation manner of the server in each step of the foregoing processing method of the video data, and details are not described herein again.

FIG. 23 is a schematic structural diagram of a client according to an embodiment of the present invention. The client provided by the embodiment of the present invention may include: a memory 231 and a processor 232, where the memory 231 is connected to the processor 232;

The memory 231 is used to store a set of program codes.

The processor 232 is configured to call the program code stored in the memory 231 to perform the following operations:

Parsing the media presentation description (MPD) of the code stream sent by the server, and determining the extended period information carried in the MPD, where the extended period information is used to determine a dependent time period of the target knowledge layer segment included in the code stream, the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, and N sequence layer segments in the code stream depend on the target knowledge layer segment;

In some possible implementations, the N sequence layer segments include at least two groups of sequence layer segments, and the at least two groups include at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The processor 232 is specifically configured to:

In some possible implementations, the processor 232 is specifically configured to:

In some possible implementations, the processor 232 is further configured to:

Receiving the target knowledge layer segment sent by the server;

In a specific implementation, the client may perform the implementation manner of the client in each step of the processing method of the foregoing video data, and details are not described herein again.

The terms "first", "second", "third", "fourth", and the like in the description, the claims, and the drawings of the present invention are used to distinguish different objects, and are not intended to describe a particular order. Furthermore, the terms "comprises" and "comprising" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, system, product, or device.

One of ordinary skill in the art can understand that all or part of the processes of the foregoing method embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

The above is only a preferred embodiment of the present invention and does not limit the scope of the present invention; equivalent changes made according to the claims of the present invention therefore still fall within the scope of the present invention.

Claims

A method for processing video data, comprising:

Obtaining, by a server, segment information of each sequence layer segment among all sequence layer segments in a code stream, where the segment information is used to describe a dependency relationship between the sequence layer segment and the knowledge layer segment in the code stream;

Determining N sequence layer segments and a first target knowledge layer segment according to the segment information of each sequence layer segment, wherein the N sequence layer segments depend on the first target knowledge layer segment, the N sequence layer segments include at least two discontinuous sequence layer segments, and the first target knowledge layer segment is one of at least one knowledge layer segment included in the code stream;

Obtaining segment information of the first target knowledge layer segment;

Adding extended period information of the first target knowledge layer segment to the media presentation description (MPD) of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments, the N sequence layer segments being encoded within a period indicated by the extended period information;

Sending the MPD of the code stream to the client.
The method according to claim 1, wherein the determining the N sequence layer segments and the first target knowledge layer segments according to the segment information of each sequence layer segment comprises:

Determining, according to the identifier of the knowledge layer segment included in the segment information of each sequence layer segment, a knowledge layer segment on which each sequence layer segment depends;

Determining a first target knowledge layer segment and determining N sequence layer segments that depend on the first target knowledge layer segment.
The method according to claim 2, wherein the N sequence layer segments comprise at least two groups of sequence layer segments, the at least two groups comprising at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The first sequence layer segment group includes N1 sequence layer segments, the second sequence layer segment group includes N2 sequence layer segments, the N1 sequence layer segments and the N2 sequence layer segments are discontinuous, and N1 + N2 <= N;

If the N1>1, the N1 sequence layer segments are consecutive sequence layer segments; if the N2>1, the N2 sequence layer segments are consecutive sequence layer segments;

The MPD of the code stream includes at least two description layers, a first description layer of the at least two description layers describes a first target knowledge layer segment, and a second description layer describes a sequence layer segment;

The adding the extended period information of the first target knowledge layer segment to the MPD of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments includes:

Adding first extended period information to the first segment description corresponding to the first time period included in the first description layer, and adding second extended period information to the second segment description corresponding to the second time period included in the first description layer.
The method according to claim 3, wherein the first extended period information and the second extended period information are both first extended identifiers;

The adding extended period information of the first target knowledge layer segment in the MPD of the code stream includes:

Adding a first extended identifier to the segment information of the first target knowledge layer segment included in the first segment description, and adding a first extended identifier to the segment information of the first target knowledge layer segment included in the second segment description.
The method according to claim 3, wherein the first extended period information and the second extended period information are both second extended identifiers;

The adding extended period information of the first target knowledge layer segment in the MPD of the code stream includes:

Adding, in the description layer attribute information of the first description layer, a second extension identifier corresponding to the first period and a second extension identifier corresponding to the second period.
The method of claim 3, wherein the method further comprises:

If the first sequence layer segment group further depends on the second target knowledge layer segment, the MPD of the code stream further includes a third description layer, and the third description layer describes the second target knowledge layer segment.
The method of claim 6 wherein the method further comprises:

Adding, in the third segment description corresponding to the first time period included in the third description layer, third extended period information, where the third extended period information is a first extended identifier; or

The third extended period information is added to the description layer attribute information of the third description layer, where the third extended period information is a second extended identifier.
A method for processing video data, comprising:

Parsing, by a client, the media presentation description (MPD) of the code stream sent by a server, and determining the extended period information carried in the MPD, wherein the extended period information is used to determine a dependent time period of a target knowledge layer segment included in the code stream, the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, and N sequence layer segments in the code stream depend on the target knowledge layer segment;

Determining a target knowledge layer segment according to the extended time period information, and determining a dependent time period of the target knowledge layer segment, the N sequence layer segments being encoded within a dependent time period of the target knowledge layer segment;

Obtaining a network storage address of the target knowledge layer segment from the MPD of the code stream, and recording a dependent time period and a network storage address of the target knowledge layer segment;

Determining, when the video on demand request is received, whether the on-demand time carried in the video on demand request is included in the dependent time period of the target knowledge layer segment;

If the dependent time period of the target knowledge layer segment includes the on-demand time, viewing a storage state of the target knowledge layer segment in a storage space of the client, and determining, according to the storage state, how to acquire the target knowledge layer segment.
The method according to claim 8, wherein the N sequence layer segments comprise at least two groups of sequence layer segments, the at least two groups comprising at least a first sequence layer segment group corresponding to a first time period and a second sequence layer segment group corresponding to a second time period;

The extended period information includes first extended period information corresponding to the first period and second extended period information corresponding to the second period;

The first extended period information is used to determine a first extended period of the dependent time period of the target knowledge layer segment, and the second extended period information is used to determine a second extended period of the dependent time period of the target knowledge layer segment.
The method according to claim 9, wherein the first extended period information and the second extended period information are first extended identifiers;

The parsing, by the client, of the MPD of the code stream sent by the server and the determining of the extended period information carried in the MPD comprise:

Parsing, by the client, the MPD, and acquiring a first extended identifier included in the segment information described by a description layer in the MPD;

The determining of the target knowledge layer segment according to the extended period information comprises:

Determining the segment corresponding to the segment information carrying the first extended identifier as the target knowledge layer segment;

The segment information includes first segment information corresponding to the first time period and second segment information corresponding to the second time period, where the first segment information carries the first extended period information, and the second segment information carries the second extended period information.
The method according to claim 9, wherein the first extended period information and the second extended period information are second extended identifiers;

The parsing, by the client, of the MPD of the code stream sent by the server and the determining of the extended period information carried in the MPD comprise:

Parsing, by the client, the MPD, and acquiring a second extended identifier included in the description layer attribute information of a description layer included in the MPD;

The determining of the target knowledge layer segment according to the extended period information comprises:

Determining the segment described by the description layer carrying the second extended identifier as the target knowledge layer segment;

The first extended period information and the second extended period information are included in the description layer attribute information, and the first segment information and the second segment information respectively carry a second extended identifier.
The method according to claim 10 or 11, wherein the determining the dependent time period of the target knowledge layer segment comprises:

Determining, according to the first extended period information, a first extended period of the target knowledge layer segment, and determining a second extended period of the target knowledge layer segment according to the second extended period information;

A union of the first extended period and the second extended period is taken as a dependent period of the target knowledge layer segment.
The method of claim 12, wherein the recording of the dependent time period and the network storage address of the target knowledge layer segment comprises:

Generating a knowledge layer segment list according to the network storage address of the target knowledge layer segment, and recording a dependent time period of the target knowledge layer segment in the knowledge layer segment list;

The method further includes:

Adding, in the knowledge layer segment list, a storage state flag of the target knowledge layer segment, to indicate whether the target knowledge layer segment is already in the storage space of the client;

The viewing the storage state of the target knowledge layer segment in the storage space of the client includes:

Viewing a storage status flag of the target knowledge layer segment in the knowledge layer segment list according to a network storage address of the target knowledge layer segment;

If the storage status flag is true, determining that the storage state of the target knowledge layer segment in the storage space is not empty, otherwise it is empty;

The determining of the acquisition manner of the target knowledge layer segment according to the storage state comprises:

If the storage state is not empty, acquiring the target knowledge layer segment from the storage space; otherwise, sending a request for acquiring the target knowledge layer segment to the server.
The method according to claim 13, wherein after the sending the request for acquiring the target knowledge layer segment to the server, the method further comprises:

Receiving the target knowledge layer segment sent by the server;

If the remaining space size of the storage space is not less than the data size of the target knowledge layer segment, storing the target knowledge layer segment into the storage space, and recording the storage state flag of the target knowledge layer segment as true;

If the remaining space size of the storage space is smaller than the data size of the target knowledge layer segment, deleting the specified target knowledge layer segment stored in the storage space, storing the target knowledge layer segment into the storage space, and recording the storage state flag of the target knowledge layer segment as true;

The time interval between the dependent time period of the specified target knowledge layer segment and the on-demand time is greater than a preset time threshold.
A device for processing video data, comprising:

an acquiring unit, configured to acquire segment information of each sequence layer segment among all sequence layer segments in the code stream, where the segment information is used to describe a dependency relationship between the sequence layer segment and the knowledge layer segment in the code stream;

a determining unit, configured to determine, according to the segment information of each sequence layer segment acquired by the acquiring unit, N sequence layer segments and a first target knowledge layer segment, wherein the N sequence layer segments are dependent on the first a target knowledge layer segment, the N sequence layer segments include at least two discontinuous sequence layer segments, and the first target knowledge layer segment is one of at least one knowledge layer segment included in the code stream;

The acquiring unit is further configured to acquire segment information of the first target knowledge layer segment determined by the determining unit;

an adding unit, configured to add extended period information of the first target knowledge layer segment to the media presentation description (MPD) of the code stream according to the segment information of the first target knowledge layer segment and the segment information of the N sequence layer segments acquired by the acquiring unit, the N sequence layer segments being encoded within a period indicated by the extended period information;

a sending unit, configured to send the MPD of the code stream obtained by the adding unit to the client.
The processing device according to claim 15, wherein the determining unit is specifically configured to:

Determining, according to the identifier of the knowledge layer segment included in the segment information of each sequence layer segment acquired by the acquiring unit, a knowledge layer segment on which each sequence layer segment depends;

Determining a first target knowledge layer segment and determining N sequence layer segments that depend on the first target knowledge layer segment.
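
As an illustration of the determination described above, the following minimal Python sketch groups sequence layer segments by the knowledge layer segment identifiers carried in their segment information, then selects a knowledge layer segment whose dependents include at least two discontinuous sequence layer segments. The field names `index` and `knowledge_ids` and the function name `find_target` are hypothetical and do not come from the claims.

```python
from collections import defaultdict


def find_target(segments):
    """Return (knowledge_id, dependent_indices) for the first knowledge layer
    segment that is depended on by at least two discontinuous sequence layer
    segments, or None if no such knowledge layer segment exists.

    `segments` is a list of dicts with hypothetical keys:
      index         - position of the sequence layer segment in the code stream
      knowledge_ids - identifiers of the knowledge layer segments it depends on
    """
    dependents = defaultdict(list)
    for seg in segments:
        for kid in seg["knowledge_ids"]:
            dependents[kid].append(seg["index"])

    for kid, indices in dependents.items():
        indices.sort()
        # Discontinuous: at least one pair of neighbouring dependents
        # is not adjacent in the code stream.
        if any(b - a > 1 for a, b in zip(indices, indices[1:])):
            return kid, indices
    return None


segments = [
    {"index": 0, "knowledge_ids": ["K1"]},
    {"index": 1, "knowledge_ids": ["K1"]},
    {"index": 2, "knowledge_ids": ["K2"]},
    {"index": 3, "knowledge_ids": ["K1"]},
]
result = find_target(segments)
```

In this example, segments 0, 1 and 3 depend on `K1`, and segment 3 is not adjacent to segment 1, so `K1` plays the role of the first target knowledge layer segment with N = 3.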
The processing apparatus according to claim 16, wherein the N sequence layer segments comprise at least two groups of sequence layer segments, the at least two groups including at least a first sequence layer segment group corresponding to a first period and a second sequence layer segment group corresponding to a second period;

The first sequence layer segment group includes N1 sequence layer segments, the second sequence layer segment group includes N2 sequence layer segments, the N1 sequence layer segments and the N2 sequence layer segments are discontinuous, and N1 + N2 <= N;

If N1 > 1, the N1 sequence layer segments are consecutive sequence layer segments; if N2 > 1, the N2 sequence layer segments are consecutive sequence layer segments;

The MPD of the code stream includes at least two description layers, a first description layer of the at least two description layers describes a first target knowledge layer segment, and a second description layer describes a sequence layer segment;

The adding unit is specifically used to:

Adding first extended period information to the first segment description corresponding to the first period included in the first description layer, and adding second extended period information to the second segment description corresponding to the second period included in the first description layer.
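
The grouping into consecutive runs can be sketched as follows. This is a minimal illustration that assumes every sequence layer segment has the same fixed duration, which the claims do not require; `group_consecutive`, `extended_periods` and `segment_duration` are hypothetical names.

```python
def group_consecutive(indices):
    """Split segment indices into groups of consecutive indices."""
    groups = []
    for i in sorted(indices):
        if groups and i == groups[-1][-1] + 1:
            groups[-1].append(i)  # continues the current consecutive run
        else:
            groups.append([i])    # starts a new, discontinuous group
    return groups


def extended_periods(indices, segment_duration):
    """Map each consecutive group to a (start, end) period in seconds.

    Assumes a fixed segment duration (an assumption of this sketch).
    """
    return [
        (g[0] * segment_duration, (g[-1] + 1) * segment_duration)
        for g in group_consecutive(indices)
    ]


groups = group_consecutive([0, 1, 3])
periods = extended_periods([0, 1, 3], 2.0)
```

Here the dependents 0, 1 and 3 form a first group with N1 = 2 and a second group with N2 = 1, and each group yields one period to which extended period information would be attached.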
The processing apparatus according to claim 17, wherein the first extended period information and the second extended period information are both first extended identifiers;

The adding unit is specifically used to:

Adding a first extended identifier to the segment information of the first target knowledge layer segment included in the first segment description, and adding the first extended identifier to the segment information of the first target knowledge layer segment included in the second segment description.
The processing apparatus according to claim 17, wherein the first extended period information and the second extended period information are both second extended identifiers;

The adding unit is specifically used to:

Adding, in the description layer attribute information of the first description layer, a second extension identifier corresponding to the first period and a second extension identifier corresponding to the second period.
The processing apparatus according to claim 17, wherein if the first sequence layer segment group further depends on a second target knowledge layer segment, the MPD of the code stream further includes a third description layer, and the third description layer describes the second target knowledge layer segment.
The processing apparatus according to claim 20, wherein said adding unit is further configured to:

Adding third extended period information to the third segment description corresponding to the first period included in the third description layer, wherein the third extended period information is a first extended identifier; or

Adding the third extended period information to the description layer attribute information of the third description layer, wherein the third extended period information is a second extended identifier.
A device for processing video data, comprising:

a parsing unit, configured to parse a media presentation description (MPD) of the code stream sent by the server, and determine extended period information carried in the MPD, wherein the extended period information is used to determine the period during which a target knowledge layer segment included in the code stream is depended on, the target knowledge layer segment is one of at least one knowledge layer segment included in the code stream, and the target knowledge layer segment is depended on by N sequence layer segments in the code stream;

a determining unit, configured to determine the target knowledge layer segment according to the extended period information acquired by the parsing unit, and determine a dependent period of the target knowledge layer segment, wherein the N sequence layer segments are encoded within the dependent period of the target knowledge layer segment;

a recording unit, configured to acquire, from the MPD of the code stream parsed by the parsing unit, a network storage address of the target knowledge layer segment, and record the dependent period of the target knowledge layer segment determined by the determining unit and the network storage address;

a judging unit, configured to judge, when a video-on-demand request is obtained, whether the on-demand time carried in the video-on-demand request falls within the dependent period of the target knowledge layer segment recorded by the recording unit;

and an obtaining unit, configured to: when the judgment result of the judging unit is yes, check the storage state of the target knowledge layer segment in a storage space of the client, and determine, according to the storage state, a manner of obtaining the target knowledge layer segment.
The processing apparatus according to claim 22, wherein the N sequence layer segments comprise at least two groups of sequence layer segments, the at least two groups including at least a first sequence layer segment group corresponding to a first period and a second sequence layer segment group corresponding to a second period;

The extended period information includes first extended period information corresponding to the first period and second extended period information corresponding to the second period;

The first extended period information is used to determine a first extended period of the dependent period of the target knowledge layer segment, and the second extended period information is used to determine a second extended period of the dependent period of the target knowledge layer segment.
The processing apparatus according to claim 23, wherein the first extended period information and the second extended period information are first extended identifiers;

The parsing unit is specifically configured to:

Parsing the MPD, and acquiring a first extended identifier included in the segment information of a segment description of a description layer in the MPD;

The determining unit is specifically configured to:

Determining the segment corresponding to the segment information that carries the first extended identifier acquired by the parsing unit as the target knowledge layer segment;

The segment information includes first segment information corresponding to the first period and second segment information corresponding to the second period, wherein the first segment information carries the first extended period information, and the second segment information carries the second extended period information.
The processing apparatus according to claim 23, wherein the first extended period information and the second extended period information are second extended identifiers;

The parsing unit is specifically configured to:

Parsing the MPD, and acquiring a second extended identifier included in the description layer attribute information of the description layer included in the MPD;

The determining unit is specifically configured to:

Determining a segment of the description layer that carries the second extended identifier acquired by the parsing unit as the target knowledge layer segment;

The first extended period information and the second extended period information are included in the description layer attribute information, and the first segment information and the second segment information each carry a second extended identifier.
The processing device according to claim 24 or 25, wherein the determining unit is specifically configured to:

Determining, according to the first extended period information, a first extended period of the target knowledge layer segment, and determining a second extended period of the target knowledge layer segment according to the second extended period information;

A union of the first extended period and the second extended period is taken as a dependent period of the target knowledge layer segment.
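
The union of extended periods, and the later check of whether an on-demand time falls within the resulting dependent period, can be sketched as below. This is a minimal illustration that assumes periods are half-open `(start, end)` pairs in seconds; the claims do not fix a representation, and `union_periods` and `in_dependent_period` are hypothetical names.

```python
def union_periods(periods):
    """Merge (start, end) periods into a sorted list of disjoint periods."""
    merged = []
    for start, end in sorted(periods):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous period: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def in_dependent_period(t, dependent_period):
    """True if time t lies inside any period of the dependent period."""
    return any(start <= t < end for start, end in dependent_period)


# First and second extended periods of one target knowledge layer segment.
dependent = union_periods([(6, 8), (0, 4)])
```

With a first extended period of 0 to 4 seconds and a second of 6 to 8 seconds, the dependent period stays as two disjoint intervals, so an on-demand time of 5 seconds falls outside it.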
The processing apparatus according to claim 26, wherein said recording unit is specifically configured to:

Generating a knowledge layer segment list according to the network storage address of the target knowledge layer segment, and recording a dependent time period of the target knowledge layer segment in the knowledge layer segment list;

The recording unit is further configured to:

Adding, in the knowledge layer segment list, a storage state flag of the target knowledge layer segment, to indicate whether the target knowledge layer segment is already in the storage space of the client;

The obtaining unit is specifically configured to:

Viewing a storage status flag of the target knowledge layer segment in the knowledge layer segment list according to a network storage address of the target knowledge layer segment;

If the storage status flag is true, determining that the storage state of the target knowledge layer segment in the storage space is not empty, otherwise it is empty;

And if the storage state is not empty, acquiring the target knowledge layer segment from the storage space, and otherwise sending a request for acquiring the target knowledge layer segment to the server.
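
A minimal sketch of the knowledge layer segment list and the lookup described above is given here. The class name `KnowledgeEntry`, its fields, the dictionary used as the storage space, and the `fetch_from_server` callback are all hypothetical stand-ins for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeEntry:
    """One row of the knowledge layer segment list (hypothetical layout)."""
    url: str                 # network storage address of the segment
    dependent_period: list   # list of (start, end) periods
    stored: bool = False     # storage status flag


def obtain(entry, storage, fetch_from_server):
    """Return the segment data from local storage if the flag is true,
    otherwise request it from the server."""
    if entry.stored:
        return storage[entry.url]
    return fetch_from_server(entry.url)


storage = {"http://example.com/k1": b"local"}
cached = KnowledgeEntry("http://example.com/k1", [(0, 4)], stored=True)
missing = KnowledgeEntry("http://example.com/k2", [(6, 8)])
requested = []


def fake_fetch(url):
    # Stand-in for the request sent to the server.
    requested.append(url)
    return b"remote"


local_data = obtain(cached, storage, fake_fetch)
remote_data = obtain(missing, storage, fake_fetch)
```

Only the segment whose flag is false triggers a request, so a knowledge layer segment already held by the client is not transmitted again.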
The processing device according to claim 27, wherein the obtaining unit is further configured to:

Receiving the target knowledge layer segment sent by the server;

If the remaining space of the storage space is not less than the data size of the target knowledge layer segment, storing the target knowledge layer segment into the storage space, and marking, through the recording unit, the storage status flag of the target knowledge layer segment as true;

If the remaining space of the storage space is smaller than the data size of the target knowledge layer segment, deleting a specified target knowledge layer segment stored in the storage space, storing the target knowledge layer segment into the storage space, and marking, through the recording unit, the storage status flag of the target knowledge layer segment as true;

wherein the time interval between the dependent period of the specified target knowledge layer segment and the on-demand time is greater than a preset time threshold.
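
The storage and deletion rule above can be sketched as follows. This is a minimal illustration under several assumptions not stated in the claims: the storage space is a dictionary keyed by network storage address, the time interval is measured as the gap to the nearest dependent period, the flag of a deleted segment is reset to false, and storing fails if no stored segment qualifies for deletion. All names are hypothetical.

```python
def distance(t, periods):
    """Gap between time t and the nearest period (0 if t is inside one)."""
    return min(max(start - t, t - end, 0) for start, end in periods)


def store_segment(url, data, storage, flags, periods, capacity, t, threshold):
    """Store `data` under `url`, deleting stored segments whose dependent
    period is further than `threshold` from the on-demand time `t` when
    the remaining space is too small. Returns True if the data was stored."""
    used = sum(len(v) for v in storage.values())
    if capacity - used < len(data):
        for old in list(storage):
            if distance(t, periods[old]) > threshold:
                used -= len(storage.pop(old))
                flags[old] = False  # assumption: the flag of a deleted segment is reset
                if capacity - used >= len(data):
                    break
    if capacity - used < len(data):
        return False  # assumption: nothing qualified for deletion
    storage[url] = data
    flags[url] = True
    return True


storage = {"K1": b"aaaa"}
flags = {"K1": True}
periods = {"K1": [(0, 4)], "K2": [(100, 120)]}
ok = store_segment("K2", b"bbbb", storage, flags, periods,
                   capacity=6, t=100, threshold=30)
```

In this example the stored segment `K1` is last depended on at 4 seconds, which is 96 seconds before the on-demand time of 100 seconds and beyond the 30-second threshold, so it is deleted to make room for `K2`.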