CN106341690B - Method, device and system for realizing layered video coding - Google Patents


Info

Publication number
CN106341690B
Authority
CN
China
Prior art keywords
layer
encoder
determining
encoding
resolution
Prior art date
Legal status
Active
Application number
CN201510391111.4A
Other languages
Chinese (zh)
Other versions
CN106341690A (en)
Inventor
黄敦笔
徐月钢
胡飞阳
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510391111.4A
Publication of CN106341690A
Application granted
Publication of CN106341690B

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application provides a method, a device and a system for realizing layered video coding, wherein the method comprises the following steps: determining sampling parameters and coding parameters according to coding input parameters, the coding input parameters including the resolution of a spatial domain, at least two output frame rates and the corresponding code rates; configuring the samplers and single-layer encoders in an encoder group according to the sampling parameters and the coding parameters, and binding samplers to single-layer encoders having the same resolution, the encoder group comprising at least one sampler and at least two single-layer encoders; and controlling a sampler in the encoder group to sample an initial video stream to obtain a sample video stream, outputting the sample video stream to a single-layer encoder, and controlling the single-layer encoder to encode the sample video stream to obtain a layer bit stream. By configuring and controlling the encoder group, the single-layer encoders work cooperatively to realize layered video coding, and the scheme is well suited to heterogeneous networks.

Description

Method, device and system for realizing layered video coding
Technical Field
The present application relates to the field of video coding, and in particular, to a method, an apparatus, and a system for layered video coding based on a single-layer encoder.
Background
Scalable Video Coding (SVC) is a technique that can divide a video stream into compressed video streams of different frame rates, resolutions, and video qualities. SVC partitions a video stream into a base layer and multiple enhancement layers according to different video quality requirements. The base layer provides the most basic video quality, frame rate and resolution to the user. The base layer can be decoded independently, while the enhancement layers rely on the base layer for decoding. The more enhancement layers the decoding end receives, the higher the quality of the decoded video. Therefore, the SVC coding mode can support multiple devices and networks simultaneously accessing SVC video streams, and has strong flexibility and adaptability to heterogeneous access networks and heterogeneous terminals.
By encoding the original video sequence only once, SVC can reconstruct video sequences of various resolutions, code rates, or quality levels through transmitting, extracting, and decoding the corresponding portions of the compressed code stream. An SVC system includes an encoder, an extractor, and a decoder. The encoder obtains an encoded data stream with a scalable structure through a single encoding pass over the original video sequence; the extractor extracts the required part from the encoded data stream according to the actual requirement of the user or the network bandwidth condition, to form a data stream that meets the actual requirement of the user; the decoder decodes the scalable encoded data stream delivered by the extractor to obtain an output video sequence that meets the requirement.
The SVC technology can be well implemented on a device or system that has SVC encoder components, but it is currently difficult to implement layered video coding on a device or system having only single-layer encoders, and the adaptability of such devices or systems to heterogeneous network environments is limited.
Disclosure of Invention
The application provides a method, a device and a system for realizing layered video coding, which use single-layer encoders working cooperatively to realize layered video coding, and improve the flexibility and adaptability of devices or systems having only single-layer encoder components in heterogeneous access networks and on heterogeneous terminals.
In a first aspect of the present application, there is provided a method for implementing layered video coding, the method comprising:
determining sampling parameters according to the coding input parameters; the encoding input parameters include: resolution of a spatial domain, at least two output frame rates and corresponding code rates; the sampling parameters include: the number of samplers to be configured, the corresponding resolution and the output frame rate of the samplers to be configured;
determining a coding parameter according to the coding input parameter; the encoding parameters include: the number of single-layer encoders to be configured, the corresponding resolution, the output frame rate and the corresponding code rate of the single-layer encoders to be configured;
configuring a sampler and a single-layer encoder in an encoder group according to the sampling parameters and the encoding parameters, and binding samplers to single-layer encoders having the same resolution; the encoder group comprises at least one sampler and at least two single-layer encoders;
and controlling a sampler in the encoder group to sample an initial video stream to obtain a sample video stream, outputting the sample video stream to a single-layer encoder, and controlling the single-layer encoder to encode the sample video stream to obtain a layer bit stream.
In a second aspect of the present application, there is provided an apparatus for implementing layered video coding, the apparatus comprising:
a sampling parameter determination unit for determining a sampling parameter according to the encoding input parameter; the encoding input parameters include: resolution of a spatial domain, at least two output frame rates and corresponding code rates; the sampling parameters include: the number of samplers to be configured, the corresponding resolution and the output frame rate of the samplers to be configured;
a coding parameter determination unit for determining a coding parameter from the coding input parameter; the encoding parameters include: the number of single-layer encoders to be configured, the corresponding resolution, the output frame rate and the corresponding code rate of the single-layer encoders to be configured;
the configuration unit is used for configuring the samplers and the single-layer encoders in the encoder group according to the sampling parameters and the encoding parameters, and binding the samplers and the single-layer encoders with the same resolution; the encoder group comprises at least one sampler and at least two single-layer encoders;
and the control unit is used for controlling the samplers in the encoder group to sample an initial video stream to obtain a sample video stream, outputting the sample video stream to the single-layer encoder, and controlling the single-layer encoder to encode the sample video stream to obtain a layer bit stream.
In a third aspect of the present application, there is provided a system for implementing layered video coding, the system comprising:
an encoder group comprising at least one sampler and at least two single-layer encoders; and the device for realizing layered video coding is used for configuring and controlling the encoder group to realize layered video coding.
Compared with the prior art, the technical scheme provided by the application has the following beneficial effects: sampling parameters and coding parameters are determined according to coding input parameters; the samplers and single-layer encoders in an encoder group are configured according to the sampling parameters and the coding parameters, and samplers are bound to single-layer encoders having the same resolution, the encoder group comprising at least one sampler and at least two single-layer encoders; the samplers in the encoder group are controlled to sample an initial video stream to obtain sample video streams, the sample video streams are output to the single-layer encoders, and the single-layer encoders are controlled to encode the sample video streams to obtain layer bit streams. In this way, sampling parameters and encoding parameters matching the actual encoding requirements can be generated, the encoder group is configured with these parameters, and layered video coding is achieved by configuring and controlling the encoder group. Because each single-layer encoder completes its encoding independently and there is no dependency between layers, the scheme has better robustness to network errors and is better suited to heterogeneous networks and heterogeneous terminals.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive labor.
Fig. 1 is a flow chart of embodiment 1 of a method of implementing layered video coding of the present application;
FIG. 2 is a schematic diagram of an internal structure of an encoder group provided in the present application;
FIG. 3 is another schematic diagram of the internal structure of the encoder group provided in the present application;
fig. 4 is a flow chart of embodiment 2 of a method of implementing layered video coding of the present application;
fig. 5 is a flow chart of embodiment 3 of a method of implementing layered video coding of the present application;
FIG. 6 is a block diagram of an apparatus for implementing layered video coding according to the present application;
fig. 7 is a schematic diagram of a system for implementing layered video coding according to the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, distributed computing environments that include any of the above systems or devices, and the like.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Referring to fig. 1, fig. 1 is a flowchart of embodiment 1 of a method for implementing layered video coding according to the present application, and as shown in fig. 1, the method includes:
s11: determining sampling parameters according to the coding input parameters; the encoding input parameters include: resolution of a spatial domain, at least two output frame rates and corresponding code rates; the sampling parameters include: the number of samplers to be configured, the resolution corresponding to the samplers to be configured and the output frame rate.
In the embodiment of the present application, the encoding input parameter may be a preset parameter, or a parameter determined according to a video stream requirement of a video receiving end; the encoding input parameters may include resolutions of at least two spatial domains, each spatial domain having at least one output frame rate and a corresponding code rate; the encoding input parameters may also include a spatial domain resolution, at least two output frame rates, and corresponding code rates.
In specific implementation, the number of samplers to be configured may be determined to be the same as the number of spatial domains in the encoding input parameter; determining that the corresponding resolution of the sampler to be configured is the same as the resolution of the spatial domain; and determining that the output frame rate corresponding to the sampler to be configured is the same as the maximum output frame rate in the encoding input parameters.
This step is explained below taking as an example a case where the encoding input parameters include three spatial domains, each corresponding to one output frame rate and one code rate. For example, the encoding input parameters include:
1st spatial domain: resolution 180P, output frame rate 7.5hz and code rate 256 kbps;
2nd spatial domain: resolution 360P, output frame rate 15hz and code rate 512 kbps;
3rd spatial domain: resolution 720P, output frame rate 30hz and code rate 2048 kbps.
Based on these encoding input parameters, the number of spatial domains is 3 and the maximum output frame rate is 30hz, so determining the sampling parameters according to the above method yields:
the number of samplers to be configured is 3; the relevant parameters of the 3 samplers to be configured are as follows:
the resolution of the 1st sampler to be configured is 180P, and the output frame rate is 30hz;
the resolution of the 2nd sampler to be configured is 360P, and the output frame rate is 30hz;
the resolution of the 3rd sampler to be configured is 720P, and the output frame rate is 30hz.
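To make the derivation concrete, the following is a minimal sketch of step S11 in Python; the dictionary field names (resolution, frame_rate, bitrate_kbps) and the function name are illustrative assumptions, not part of the patent.

```python
# Minimal sketch of step S11: one sampler per spatial domain, each at that domain's
# resolution and at the maximum output frame rate found in the encoding input parameters.

def determine_sampling_parameters(spatial_domains):
    max_frame_rate = max(d["frame_rate"] for d in spatial_domains)
    return [
        {"resolution": d["resolution"], "frame_rate": max_frame_rate}
        for d in spatial_domains
    ]

encoding_input = [
    {"resolution": "180P", "frame_rate": 7.5, "bitrate_kbps": 256},
    {"resolution": "360P", "frame_rate": 15, "bitrate_kbps": 512},
    {"resolution": "720P", "frame_rate": 30, "bitrate_kbps": 2048},
]

for sampler in determine_sampling_parameters(encoding_input):
    print(sampler)
# three samplers: 180P, 360P and 720P, each with an output frame rate of 30hz
```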
The present application is directed to a system that includes an encoder group comprising at least one sampler and at least two single-layer encoders; the sampler is used for sampling an initial video stream and outputting a sampled video stream to a corresponding single-layer encoder; and the single-layer encoder completes the encoding processing of the sample video stream according to the encoding parameters.
S12: determining a coding parameter according to the coding input parameter; the encoding parameters include: the number of the single-layer encoders to be configured, the corresponding resolution, the output frame rate and the corresponding code rate of the single-layer encoders to be configured.
In specific implementation, the number of single-layer encoders to be configured may be determined according to the number of spatial domains in the encoding input parameter and the time-domain layering characteristic of the single-layer encoder; determining the resolution, the output frame rate and the code rate corresponding to the single-layer encoder to be configured according to a preset code table and the encoding input parameters; the preset code table comprises code rate values corresponding to different output frame rates under specific resolution.
When the single-layer encoder supports the time-domain layering characteristic, it may be determined that the number of the single-layer encoders to be configured is the same as the number of the spatial domains in the encoding input parameter.
When the single-layer encoder does not support the time-domain layering characteristic, it may be determined that the number of the single-layer encoders to be configured is the same as the sum of the number of the layering of all spatial domains in the encoding input parameter.
In practical application, the time domain layering characteristic of the single-layer encoder can be determined through the port information of the single-layer encoder, and the time domain layering characteristic of the single-layer encoder can also be configured in advance according to actual needs; the above manner of step S12 is explained below on the basis of the above example.
If the single-layer encoders in the encoder group support the time-domain layering characteristic, it is determined that the number N of single-layer encoders to be configured is the same as the number n of spatial domains in the encoding input parameters, namely N = n; the relevant parameters of each single-layer encoder to be configured are equal to the relevant parameters of the corresponding spatial domain.
Continuing with the above example, the encoding input parameters are:
1st spatial domain: resolution 180P, output frame rate 7.5hz, code rate 256 kbps;
2nd spatial domain: resolution 360P, output frame rate 15hz, code rate 512 kbps;
3rd spatial domain: resolution 720P, output frame rate 30hz, code rate 2048 kbps;
the coding input parameter has 3 spatial domains, and N is 3. Correspondingly, the relevant parameters of each single-layer encoder to be configured are as follows:
the resolution corresponding to the 1st single-layer encoder to be configured is 180P, the output frame rate is 7.5hz, and the code rate is 256 kbps;
the resolution corresponding to the 2nd single-layer encoder to be configured is 360P, the output frame rate is 15hz, and the code rate is 512 kbps;
the resolution corresponding to the 3rd single-layer encoder to be configured is 720P, the output frame rate is 30hz, and the code rate is 2048 kbps.
If the single-layer encoders in the encoder group do not support the time-domain layering characteristic, it is determined that the number N of single-layer encoders to be configured is the same as the sum of the numbers of layers into which all the spatial domains in the encoding input parameters can be divided; for this case, N can be calculated as follows:
N = Σ(i=0 to n-1) Li (formula 1)
Li = 1 + log2(FrameRatei / minFrameRate) (formula 2)
minFrameRate = MIN(FrameRate0, …, FrameRatei, …, FrameRaten-1) (formula 3)
In the above formula 1, N represents the number of single-layer encoders to be configured; Li represents the number of layers into which the i-th spatial domain can be divided; i ranges from 0 to n-1, where n represents the number of spatial domains;
in the above formulas 2 and 3, FrameRatei represents the output frame rate of the i-th spatial domain, and minFrameRate represents the minimum output frame rate over all spatial domains.
Next, explanation will be given by taking the above-described encoding input parameter as an example.
First, minFrameRate = 7.5hz can be calculated according to the above formula 3;
then, the number of layers that can be divided for the three spatial domains can be calculated according to the above formula 2:
the number of levels L0 of the 1 st spatial domain is 1;
the number of layers L1 of the 2 nd spatial domain is 2;
the 3 rd spatial domain may have a number L2 of layers equal to 3;
finally, the number N of single-layer encoders to be configured can be calculated to be 6 according to the above formula 1.
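The calculation of N can be expressed compactly; the sketch below implements formulas 1 to 3 for the case where the single-layer encoders do not support temporal layering (the function name is an illustrative assumption).

```python
import math

def count_single_layer_encoders(frame_rates):
    """Formulas 1-3: N is the sum, over all spatial domains, of the number of
    layers Li each domain can be divided into."""
    min_frame_rate = min(frame_rates)                        # formula 3
    layers = [1 + int(math.log2(rate / min_frame_rate))      # formula 2
              for rate in frame_rates]
    return sum(layers), layers                               # formula 1

n, layers = count_single_layer_encoders([7.5, 15, 30])
print(layers)  # [1, 2, 3]
print(n)       # 6
```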
When the single-layer encoders in the encoder group do not support the time-domain layering characteristic, each spatial domain needs to be traversed, and sub-layers are derived whose output frame rates decrease from the original output frame rate of that spatial domain down to the minimum output frame rate of all spatial domains, together with the corresponding code rates; for example, the sub-layer output frame rates into which one spatial domain can be divided are 1/2 or 1/4 of its original output frame rate, down to the minimum output frame rate of all spatial domains. In general, the different output frame rates of a spatial domain are related by factors of 2^m (m = 0, 1, 2, …). In practical application, the code rates corresponding to different output frame rates can be determined according to the compression performance of the encoder, or according to a preset code table. For example, for the above encoding input parameters, when the single-layer encoders in the encoder group do not support the time-domain layering characteristic, the parameters of the encoders to be configured in the encoder group are determined as follows:
the resolution is 720p, the output frame rate is 30hz, and the code rate is 2048 kbps;
the resolution is 720p, the output frame rate is 15hz, and the code rate is 1200 kbps;
the resolution is 720p, the output frame rate is 7.5hz, and the code rate is 800 kbps;
the resolution is 360p, the output frame rate is 15hz, and the code rate is 512 kbps;
the resolution is 360p, the output frame rate is 7.5hz, and the code rate is 320 kbps;
the resolution is 180p, the output frame rate is 7.5hz, and the code rate is 256 kbps.
The relevant parameters of the above 6 single-layer encoders are essentially the parameters of each layer stream from the highest enhancement layer to the base layer.
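A sketch of how the six per-layer encoder parameters above could be derived from a preset code table is given below; the bitrate values are copied from the example in this description, while the lookup structure and function names are illustrative assumptions.

```python
# (resolution, output frame rate) -> code rate in kbps, taken from the example above;
# a real preset code table would be tuned to the encoder's compression performance.
CODE_TABLE = {
    ("720p", 30): 2048, ("720p", 15): 1200, ("720p", 7.5): 800,
    ("360p", 15): 512,  ("360p", 7.5): 320,
    ("180p", 7.5): 256,
}

def expand_encoder_parameters(spatial_domains, min_frame_rate):
    """For each spatial domain, emit one single-layer encoder configuration per
    temporal sub-layer, halving the frame rate down to the minimum frame rate."""
    encoders = []
    for domain in spatial_domains:
        rate = domain["frame_rate"]
        while rate >= min_frame_rate:
            encoders.append({
                "resolution": domain["resolution"],
                "frame_rate": rate,
                "bitrate_kbps": CODE_TABLE[(domain["resolution"], rate)],
            })
            rate /= 2
    return encoders

domains = [{"resolution": "720p", "frame_rate": 30},
           {"resolution": "360p", "frame_rate": 15},
           {"resolution": "180p", "frame_rate": 7.5}]
for enc in expand_encoder_parameters(domains, min_frame_rate=7.5):
    print(enc)  # the six layer streams, from the highest enhancement layer to the base layer
```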
In practical applications, steps S11 and S12 are not performed in a strict order; fig. 1 is only an example, and S12 may be performed before S11, or the two steps may be performed in parallel.
S13: configuring a sampler and a single-layer encoder in an encoder group according to the sampling parameters and the encoding parameters, and binding samplers to single-layer encoders having the same resolution; the encoder group includes at least one sampler and at least two single-layer encoders.
In the embodiment of the present application, the samplers in the encoder group may adopt a parallel structure or a series structure. In fig. 2 the samplers are connected in a parallel structure: each sampler is connected to the single-layer encoders having the same resolution, and one sampler may be connected to one single-layer encoder or to a plurality of single-layer encoders. In fig. 3 the samplers are connected in a series structure, with the output of a higher-layer sampler serving as the input of the next-layer sampler.
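A minimal sketch of the binding in step S13 with a parallel sampler structure follows; the Sampler and SingleLayerEncoder classes are placeholders introduced for illustration and do not correspond to any real encoder SDK.

```python
from collections import defaultdict

class Sampler:
    def __init__(self, resolution, frame_rate):
        self.resolution, self.frame_rate = resolution, frame_rate

class SingleLayerEncoder:
    def __init__(self, resolution, frame_rate, bitrate_kbps):
        self.resolution, self.frame_rate, self.bitrate_kbps = resolution, frame_rate, bitrate_kbps

def bind_by_resolution(samplers, encoders):
    """Bind each sampler to all single-layer encoders that share its resolution
    (parallel structure: one sampler may feed one or several encoders)."""
    encoders_by_resolution = defaultdict(list)
    for enc in encoders:
        encoders_by_resolution[enc.resolution].append(enc)
    return {sampler: encoders_by_resolution[sampler.resolution] for sampler in samplers}

samplers = [Sampler("720P", 30), Sampler("360P", 30), Sampler("180P", 30)]
encoders = [SingleLayerEncoder("720P", 30, 2048),
            SingleLayerEncoder("360P", 15, 512),
            SingleLayerEncoder("180P", 7.5, 256)]
bindings = bind_by_resolution(samplers, encoders)
```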
S14: and controlling a sampler in the encoder group to sample an initial video stream to obtain a sample video stream, outputting the sample video stream to a single-layer encoder, and controlling the single-layer encoder to encode the sample video stream to obtain a layer bit stream.
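The control flow of step S14 can be sketched as follows, assuming each sampler exposes a sample() method and each single-layer encoder an encode() method; both methods and the bindings mapping are illustrative placeholders, not real APIs.

```python
def run_encoder_group(initial_video_stream, bindings):
    """bindings: mapping of sampler -> list of single-layer encoders bound to it
    (as produced in step S13). Returns one layer bit stream per encoder."""
    layer_bitstreams = {enc: [] for encs in bindings.values() for enc in encs}
    for frame in initial_video_stream:
        for sampler, encoders in bindings.items():
            sampled_frame = sampler.sample(frame)   # spatial (and temporal) sampling
            if sampled_frame is None:               # frame dropped by temporal sampling
                continue
            for encoder in encoders:
                layer_bitstreams[encoder].append(encoder.encode(sampled_frame))
    return layer_bitstreams
```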
In practical application, the technical scheme of the embodiment of the application is utilized to configure and control the system with the encoder group, so that the system realizes layered video encoding in the environment of a single-layer encoder, and the system has good flexibility and can be well adapted to a heterogeneous network.
Compared with the SVC coding method in the prior art, the technical scheme of the embodiment of the present application no longer needs an extractor to extract the corresponding layer bit stream, and can directly output the layer bit stream generated by a single-layer encoder to the corresponding terminal, thereby simplifying the transmission processing of the video stream and saving software and hardware costs.
The technical scheme of the embodiment of the application is applicable to systems that need to use single-layer encoders to realize a layered video encoding function, such as a real-time large-scale multipoint video conference system or a video-on-demand layered transcoding system.
The technical scheme of the embodiment of the application uses independent encoding by single-layer encoders, needs no inter-layer prediction, and has no dependency between layers, so that the system can be well compatible with single-layer coding standard systems. However, when the interface of the video receiving end system is an SVC interface, the layer bit streams generated by the single-layer encoders need to be converted into SVC layer streams when the technology of the embodiment of the present application is used; referring to fig. 4, the scheme shown in fig. 4 adds the following step to the scheme shown in fig. 1:
S15: according to a conversion protocol from single-layer video coding to multi-layer video coding, encoding the layer bit streams generated by the single-layer encoders in the encoder group to obtain multi-layer video coding layer bit streams.
Assuming that the single-layer encoders in the encoder group adopt the ITU-T H.264 AVC coding standard, the layer bit streams generated by the single-layer encoders are converted into SVC layer streams according to the H.264 AVC to H.264 SVC conversion specification (the JVT-X201 protocol specification). In practical applications, there are many types of video coding standards (for example, ISO MPEG-4, ITU-T H.265, and others), and the embodiment of the present application does not limit the specific standard type; the conversion from AVC to SVC only needs to be implemented according to the corresponding conversion protocol.
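The following is a highly simplified sketch of the idea behind step S15: each single-layer AVC bit stream is tagged with the SVC layer identifiers it maps to before being packaged into a multi-layer stream. A real converter would follow the JVT-X201 specification (for example by inserting the appropriate SVC NAL unit header information); the structure and the identifier assignment below are illustrative assumptions only.

```python
def to_svc_layer(avc_bitstream, dependency_id, temporal_id, quality_id=0):
    """Wrap a single-layer AVC bit stream with the SVC layer identifiers it maps to."""
    return {
        "dependency_id": dependency_id,  # spatial layer index (base layer = 0)
        "temporal_id": temporal_id,      # temporal sub-layer index
        "quality_id": quality_id,        # quality layer index (one quality level here)
        "payload": avc_bitstream,
    }

svc_layers = [
    to_svc_layer(b"<180P / 7.5hz AVC layer bit stream>", dependency_id=0, temporal_id=0),
    to_svc_layer(b"<360P / 15hz AVC layer bit stream>",  dependency_id=1, temporal_id=1),
    to_svc_layer(b"<720P / 30hz AVC layer bit stream>",  dependency_id=2, temporal_id=2),
]
```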
In addition, when the video receiving end requires synchronized display of the received video, the layer bit streams generated by the single-layer encoders need to be synchronized before being output to the video receiving end when the technical scheme of the embodiment of the application is used; referring to fig. 5, the scheme shown in fig. 5 adds the following step to the scheme shown in fig. 1:
S16: according to encoding frame indication information, performing synchronization processing on the layer bit streams generated by the single-layer encoders in the encoder group, and sending the synchronized layer bit streams to a terminal; the encoding frame indication information includes an encoding frame index corresponding to a group of video sequences.
In practical application, whether the layer bit streams are synchronized is controlled by setting a synchronization enable flag field in the layer bit streams. The synchronization is implemented mainly according to the encoding frame indication information, where the encoding frame indication information includes a coding index (Coding Index) corresponding to a group of video sequences (GoP); polling encoding from the base layer to the enhancement layers is performed according to the encoding frame index, so that all layer bit streams of the video frames under the same timestamp are completely encoded before being output synchronously.
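A minimal sketch of the synchronization in step S16 follows: layer bit streams from all single-layer encoders are grouped by the encoding frame index within a GoP, and a frame is forwarded to the terminal only once every layer for that index has arrived. The buffering scheme and the tuple layout are illustrative assumptions.

```python
from collections import defaultdict

def synchronize(layer_outputs, num_layers):
    """layer_outputs: iterable of (coding_index, layer_id, bitstream) tuples produced by
    polling the encoders from the base layer to the highest enhancement layer."""
    pending = defaultdict(dict)
    for coding_index, layer_id, bitstream in layer_outputs:
        pending[coding_index][layer_id] = bitstream
        if len(pending[coding_index]) == num_layers:
            # all layer bit streams for this timestamp are complete: output them together
            yield coding_index, pending.pop(coding_index)

outputs = [(0, 0, b"base"), (0, 1, b"enh1"), (0, 2, b"enh2"),
           (1, 0, b"base"), (1, 1, b"enh1"), (1, 2, b"enh2")]
for index, layers in synchronize(outputs, num_layers=3):
    print(index, sorted(layers))  # 0 [0, 1, 2]  then  1 [0, 1, 2]
```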
Corresponding to the method of the foregoing embodiment, the present application further provides an apparatus for implementing layered video coding, as can be seen from fig. 6, the apparatus may include:
a sampling parameter determination unit 61 for determining a sampling parameter from the encoded input parameter; the encoding input parameters include: resolution of a spatial domain, at least two output frame rates and corresponding code rates; the sampling parameters include: the number of samplers to be configured, the corresponding resolution and the output frame rate of the samplers to be configured;
an encoding parameter determination unit 62 for determining an encoding parameter from the encoding input parameter; the encoding parameters include: the number of single-layer encoders to be configured, the corresponding resolution, the output frame rate and the corresponding code rate of the single-layer encoders to be configured;
a configuration unit 63, configured to configure the sampler and the single-layer encoder in the encoder group according to the sampling parameter and the encoding parameter, and bind the samplers and the single-layer encoders with the same resolution; the encoder group comprises at least one sampler and at least two single-layer encoders;
the control unit 64 is configured to control the samplers in the encoder group to sample an initial video stream to obtain a sample video stream, output the sample video stream to a single-layer encoder, and control the single-layer encoder to encode the sample video stream to obtain a layer bit stream.
Optionally, the sampling parameter determining unit includes:
the sampler number determining subunit is used for determining that the number of the samplers to be configured is the same as the number of the spatial domains in the encoding input parameter;
and the sampler parameter determining subunit is used for determining that the resolution corresponding to the to-be-configured sampler is the same as the resolution of the spatial domain, and determining that the output frame rate corresponding to the to-be-configured sampler is the same as the maximum output frame rate in the encoding input parameters.
Optionally, the encoding parameter determining unit includes:
the encoder number determining subunit is used for determining the number of the single-layer encoders to be configured according to the number of the spatial domains in the encoding input parameters and the time domain layering characteristics of the single-layer encoders;
the encoder parameter determining subunit is used for determining the resolution, the output frame rate and the code rate corresponding to the single-layer encoder to be configured according to a preset code table and the encoding input parameters; the preset code table comprises code rate values corresponding to different output frame rates under specific resolution.
Optionally, the encoder number determining subunit includes:
the first determining module is used for determining that the number of the single-layer encoders to be configured is the same as the number of the spatial domains in the encoding input parameters when the single-layer encoders support the time domain layering characteristic; and the number of the first and second groups,
and a second determining module, configured to determine that the number of the single-layer encoders to be configured is the same as a sum of the number of the scalable layers of all spatial domains in the encoding input parameter, when the single-layer encoders do not support the time-domain scalable characteristic.
Optionally, the apparatus further comprises:
a synchronization unit, configured to perform synchronization processing on a layer bit stream generated by a single-layer encoder in the encoder group according to the encoding frame indication information, and send the synchronized layer bit stream to a terminal; the encoding frame indication information includes an encoding frame index corresponding to a video sequence group.
Optionally, the apparatus further comprises:
and the conversion unit is used for performing encoding processing on the layer bit stream generated by the single-layer encoder in the encoder group according to a conversion protocol from single-layer video coding to multi-layer video coding, to obtain the multi-layer video coding layer bit stream.
In addition, the present application also provides a system for implementing layered video coding, as shown in fig. 7, the system includes:
an encoder group comprising at least one sampler and at least two single-layer encoders; and the device for realizing layered video coding is used for configuring and controlling the encoder group to realize layered video coding.
It should be noted that the samplers in the encoder group shown in fig. 7 are in a series structure; in practical applications, the samplers in the encoder group may also be in a parallel structure. In addition, the samplers and the single-layer encoders in the encoder group may be implemented in software, and may also be implemented as hardware elements.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other. The method embodiment of the application is described from the system perspective, is basically similar to the system embodiment, is relatively simple to describe, and the relevant points refer to partial description of the system embodiment.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The method, apparatus, and system for implementing layered video coding provided by the present application are introduced in detail above, and specific examples are applied herein to illustrate the principles and embodiments of the present application, and the descriptions of the above embodiments are only used to help understand the method and core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (11)

1. A method for implementing layered video coding, the method comprising:
determining sampling parameters according to the coding input parameters; the encoding input parameters include: resolution of a spatial domain, at least two output frame rates and corresponding code rates; the sampling parameters include: the number of samplers to be configured, the corresponding resolution and the output frame rate of the samplers to be configured;
determining a coding parameter according to the coding input parameter; the encoding parameters include: the number of single-layer encoders to be configured, the corresponding resolution, the output frame rate and the corresponding code rate of the single-layer encoders to be configured;
configuring a sampler and a single-layer encoder in an encoder group according to the sampling parameters and the encoding parameters, and binding samplers to single-layer encoders having the same resolution; the encoder group comprises at least one sampler and at least two single-layer encoders;
controlling samplers in the encoder group to sample an initial video stream to obtain a sample video stream, outputting the sample video stream to a single-layer encoder, and controlling the single-layer encoder to encode the sample video stream to obtain a layer bit stream;
and according to a conversion protocol from single-layer video coding to multi-layer video coding, encoding the layer bit stream generated by the single-layer encoder in the encoder group to obtain the multi-layer video coding layer bit stream.
2. The method of claim 1, wherein determining sampling parameters from the encoded input parameters comprises:
determining that the number of samplers to be configured is the same as the number of spatial domains in the encoding input parameter;
and determining that the resolution corresponding to the sampler to be configured is the same as the resolution of the spatial domain, and determining that the output frame rate corresponding to the sampler to be configured is the same as the maximum output frame rate in the encoding input parameters.
3. The method of claim 1, wherein determining the encoding parameters from the encoding input parameters comprises:
determining the number of single-layer encoders to be configured according to the number of spatial domains in the encoding input parameters and the time domain layering characteristics of the single-layer encoders;
determining the resolution, the output frame rate and the code rate corresponding to the single-layer encoder to be configured according to a preset code table and the encoding input parameters; the preset code table comprises code rate values corresponding to different output frame rates under specific resolution.
4. The method of claim 3, wherein determining the number of single-layer encoders to be configured according to the number of spatial domains in the encoded input parameters and the temporal layering characteristics of the single-layer encoders comprises:
when the single-layer encoder supports the time domain layering characteristic, determining that the number of the single-layer encoders to be configured is the same as the number of the spatial domains in the encoding input parameters;
and when the single-layer encoder does not support the time domain layering characteristic, determining that the number of the single-layer encoders to be configured is the same as the sum of the number of the layering of all the spatial domains in the encoding input parameter.
5. The method of claim 1, further comprising:
according to the coding frame indication information, carrying out synchronous processing on the layer bit stream generated by the single-layer coder in the coder group, and sending the layer bit stream after synchronization to a terminal; the encoding frame indication information includes an encoding frame index corresponding to a video sequence group.
6. An apparatus for implementing layered video coding, the apparatus comprising:
a sampling parameter determination unit for determining a sampling parameter according to the encoding input parameter; the encoding input parameters include: resolution of a spatial domain, at least two output frame rates and corresponding code rates; the sampling parameters include: the number of samplers to be configured, the corresponding resolution and the output frame rate of the samplers to be configured;
a coding parameter determination unit for determining a coding parameter from the coding input parameter; the encoding parameters include: the number of single-layer encoders to be configured, the corresponding resolution, the output frame rate and the corresponding code rate of the single-layer encoders to be configured;
the configuration unit is used for configuring the samplers and the single-layer encoders in the encoder group according to the sampling parameters and the encoding parameters, and binding the samplers and the single-layer encoders with the same resolution; the encoder group comprises at least one sampler and at least two single-layer encoders;
the control unit is used for controlling the samplers in the encoder group to sample an initial video stream to obtain a sample video stream, outputting the sample video stream to the single-layer encoder, and controlling the single-layer encoder to encode the sample video stream to obtain a layer bit stream;
and the conversion unit is used for performing encoding processing on the layer bit stream generated by the single-layer encoder in the encoder group according to a conversion protocol from single-layer video coding to multi-layer video coding, to obtain the multi-layer video coding layer bit stream.
7. The apparatus of claim 6, wherein the sampling parameter determination unit comprises:
the sampler number determining subunit is used for determining that the number of the samplers to be configured is the same as the number of the spatial domains in the encoding input parameter;
and the sampler parameter determining subunit is used for determining that the resolution corresponding to the to-be-configured sampler is the same as the resolution of the spatial domain, and determining that the output frame rate corresponding to the to-be-configured sampler is the same as the maximum output frame rate in the encoding input parameters.
8. The apparatus of claim 6, wherein the encoding parameter determining unit comprises:
the encoder number determining subunit is used for determining the number of the single-layer encoders to be configured according to the number of the spatial domains in the encoding input parameters and the time domain layering characteristics of the single-layer encoders;
the encoder parameter determining subunit is used for determining the resolution, the output frame rate and the code rate corresponding to the single-layer encoder to be configured according to a preset code table and the encoding input parameters; the preset code table comprises code rate values corresponding to different output frame rates under specific resolution.
9. The apparatus of claim 8, wherein the encoder number determination subunit comprises:
the first determining module is used for determining that the number of the single-layer encoders to be configured is the same as the number of the spatial domains in the encoding input parameters when the single-layer encoders support the time domain layering characteristic;
and a second determining module, configured to determine that the number of the single-layer encoders to be configured is the same as a sum of the number of the scalable layers of all spatial domains in the encoding input parameter, when the single-layer encoders do not support the time-domain scalable characteristic.
10. The apparatus of claim 6, further comprising:
a synchronization unit, configured to perform synchronization processing on a layer bit stream generated by a single-layer encoder in the encoder group according to the encoding frame indication information, and send the synchronized layer bit stream to a terminal; the encoding frame indication information includes an encoding frame index corresponding to a video sequence group.
11. A system for implementing layered video coding, the system comprising:
an encoder group comprising at least one sampler and at least two single-layer encoders; and the apparatus for implementing layered video coding as set forth in any of the preceding claims 6-10, configured to configure and control the encoder group to implement layered video coding.
CN201510391111.4A 2015-07-06 2015-07-06 Method, device and system for realizing layered video coding Active CN106341690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510391111.4A CN106341690B (en) 2015-07-06 2015-07-06 Method, device and system for realizing layered video coding


Publications (2)

Publication Number Publication Date
CN106341690A CN106341690A (en) 2017-01-18
CN106341690B (en) 2020-03-13

Family

ID=57826144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510391111.4A Active CN106341690B (en) 2015-07-06 2015-07-06 Method, device and system for realizing layered video coding

Country Status (1)

Country Link
CN (1) CN106341690B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022141096A1 (en) * 2020-12-29 2022-07-07 华为技术有限公司 Wireless screen projection method and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1422075A (en) * 2002-12-26 2003-06-04 上海交通大学 Parallel image sequence bit rate controlling method for digital TV video coder
CN101562739A (en) * 2009-06-02 2009-10-21 北京大学 Video coding processing method and video coding processing device
CN101895748A (en) * 2010-06-21 2010-11-24 华为终端有限公司 Coding and decoding methods and coding and decoding devices
CN104506870A (en) * 2014-11-28 2015-04-08 北京奇艺世纪科技有限公司 Video coding processing method and device suitable for multiple code streams

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100781524B1 (en) * 2006-04-04 2007-12-03 삼성전자주식회사 Method and apparatus for encoding/decoding using extended macroblock skip mode


Also Published As

Publication number Publication date
CN106341690A (en) 2017-01-18


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant