CN112788340B - Method and apparatus for adaptively determining the number of frames for a coded group of pictures

Info

Publication number: CN112788340B
Application number: CN201911082343.6A
Authority: CN
Inventors: 张涛
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-11-07
Filing date: 2019-11-07
Publication date: 2024-06-04
Anticipated expiration: 2039-11-07
Also published as: CN112788340A

Abstract

Described herein is a method for adaptively determining the number of frames of a group of pictures used for encoding, comprising: receiving a video clip to be encoded according to the number of frames of the group of pictures used for encoding, the video clip comprising a plurality of video frames; and, each time before unencoded video frames of the video clip are encoded in units of a group of pictures, determining the number of unencoded video frames and adaptively determining the number of frames of the group of pictures used for encoding based on the number of unencoded video frames.

Description

Method and apparatus for adaptively determining the number of frames for a coded group of pictures

Technical Field

The present disclosure relates to the field of video processing, and in particular to a method and apparatus for adaptively determining the number of frames for a coded group of pictures.

Background

When video sequences are encoded, they are typically compressed by reducing spatial and temporal redundancy through prediction in the spatial and/or temporal domain. In actual compression, various algorithms are employed to reduce the amount of data, with I-frames, P-frames, and B-frames being the most commonly used. An I-frame is a key frame and is an intra-predicted frame. P-frames and B-frames are inter-predicted frames. They differ in that P-frame prediction predicts the value of the current block from only one prediction block, whereas B-frame prediction allows the current block to be predicted by interpolating two previously encoded blocks.

A set of consecutive pictures is generally referred to in video coding as a group of pictures (GOP). Encoding proceeds in units of GOPs. The GOP size determines the basic hierarchy and reference relationships in encoding, and therefore has a large impact on encoding performance. Existing schemes typically use a fixed-size GOP, such as GOP16, in which each GOP contains 16 video frames.

However, video sequences are typically composed of both complex video clips and simple video clips. For complex video clips, a smaller GOP lets frames make full use of temporally close reference frames within the GOP, resulting in better prediction. For simpler video clips, a larger GOP allows quality to be allocated reasonably among the frames of each level, yielding better coding performance. Because existing schemes use a fixed-size GOP, they cannot adapt to the characteristics of the video sequence and therefore cannot achieve better performance.

Disclosure of Invention

In view of the above, the present disclosure provides methods and apparatus for adaptively determining the number of frames for a coded group of pictures, which are expected to overcome some or all of the above-mentioned drawbacks, as well as other possible drawbacks.

According to a first aspect of the present disclosure, there is provided a method for adaptively determining the number of frames of a group of pictures used for encoding, comprising: receiving a video clip to be encoded according to the number of frames of the group of pictures used for encoding, the video clip comprising a plurality of video frames; and, each time before unencoded video frames of the video clip are encoded in units of a group of pictures, determining the number of unencoded video frames and adaptively determining the number of frames of the group of pictures used for encoding based on the number of unencoded video frames.

In some embodiments, adaptively determining the number of frames for a group of pictures for encoding based on the number of frames of an unencoded video frame comprises: determining, in response to the number of frames of the unencoded video frame not being less than a first predetermined number of frames, a number of frames of a group of pictures for encoding from a number of frames of a first group of pictures and a number of frames of a second group of pictures based on the number of frames of the unencoded video frame, wherein the first predetermined number of frames is a sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures; determining, in response to the number of frames of the unencoded video frames being less than the first predetermined number of frames but not less than a second predetermined number of frames, a number of frames of a group of pictures for encoding from a number of frames of a first group of pictures and a number of frames of a second group of pictures based on a video frame of the second predetermined number of frames of the unencoded video frames, wherein the second predetermined number of frames is the same as the number of frames of the first group of pictures; in response to the number of frames of the unencoded video frame being less than a second predetermined number of frames, determining a number of frames of a third group of pictures as a number of frames of a group of pictures for encoding, the third group of pictures being the same as the number of frames of the unencoded video frame.

In some embodiments, determining the number of frames of the group of pictures for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on the first predetermined number of frames of the unencoded video frames comprises: selecting the first predetermined number of video frames starting from the first frame of the unencoded video frames; determining the encoding cost of each group of pictures decomposition when the first predetermined number of video frames are encoded with a first group of pictures decomposition, a second group of pictures decomposition, and a third group of pictures decomposition, respectively, wherein each group of pictures decomposition represents a way of dividing the first predetermined number of video frames into groups of pictures, the first group of pictures decomposition comprises a first group of pictures followed by a second group of pictures, the second group of pictures decomposition comprises a second group of pictures followed by a first group of pictures, and the third group of pictures decomposition comprises a plurality of second groups of pictures arranged in sequence; in response to the encoding cost of the first group of pictures decomposition being the minimum, determining the number of frames of the first group of pictures as the number of frames of the group of pictures for encoding; and in response to the encoding cost of the second group of pictures decomposition or of the third group of pictures decomposition being the minimum, determining the number of frames of the second group of pictures as the number of frames of the group of pictures for encoding.

In some embodiments, determining the number of frames of the group of pictures for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on the first predetermined number of frames of the unencoded video frames comprises: selecting the first predetermined number of video frames starting from the first frame of the unencoded video frames; determining the encoding cost of each group of pictures decomposition when the first predetermined number of video frames are encoded with a first group of pictures decomposition and a second group of pictures decomposition, respectively, wherein the first group of pictures decomposition comprises a first group of pictures followed by a second group of pictures, and the second group of pictures decomposition comprises a second group of pictures followed by a first group of pictures; in response to the encoding cost of the first group of pictures decomposition being less than the encoding cost of the second group of pictures decomposition, determining the encoding cost of a third group of pictures decomposition when the first predetermined number of video frames are encoded with the third group of pictures decomposition, wherein the third group of pictures decomposition comprises a plurality of second groups of pictures arranged in sequence; in response to the encoding cost of the first group of pictures decomposition being less than the encoding cost of the third group of pictures decomposition, determining the number of frames of the first group of pictures as the number of frames of the group of pictures for encoding; and in response to the encoding cost of the first group of pictures decomposition not being less than the encoding cost of the second group of pictures decomposition or the encoding cost of the third group of pictures decomposition, determining the number of frames of the second group of pictures as the number of frames of the group of pictures for encoding.

In some embodiments, determining the number of frames for the encoded group of pictures from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on a second predetermined number of frames of the unencoded video frames comprises: selecting a second predetermined number of video frames from a first frame of the uncoded video frames; determining coding cost of each picture group decomposition when the video frames of the second preset frame number are coded by a fourth picture group decomposition and a fifth picture group decomposition respectively, wherein the fourth picture group decomposition comprises a first picture group and the fifth picture group decomposition comprises a plurality of second picture groups; in response to the coding cost of the fourth picture group decomposition being less than the coding cost of the fifth picture group decomposition, determining the number of frames of the first picture group as the number of frames of the picture group for coding, and otherwise determining the number of frames of the second picture group as the number of frames of the picture group for coding.

In some embodiments, determining the encoding cost for each picture group decomposition includes summing the encoding costs for all of the picture groups in the each picture group decomposition.

In some embodiments, the encoding cost of each of the all of the groups of pictures is the sum of the encoding costs of all of the video frames in each of the groups of pictures.

In some embodiments, the coding cost of each video frame of all video frames is the sum of the coding costs of all coding units in said each video frame, the coding cost of each coding unit being determined by the following formula: $J = \mathrm{SAD} + \lambda \cdot R$, where J is the coding cost of the current coding unit, SAD is the sum of absolute differences between the current coding unit and its prediction unit, R is the estimated number of bits needed to code the current coding unit using the selected prediction mode, and λ is the Lagrangian multiplier.

In some embodiments, the number of frames of the first group of pictures is 16.

In some embodiments, the number of frames of the second group of pictures is 4.

According to a second aspect of the present disclosure, there is provided an apparatus for adaptively determining a number of frames for a coded group of pictures, comprising: a receiving module configured to receive a video clip to be encoded in the number of frames of the group of pictures for encoding, the video clip comprising a plurality of video frames; a determination module configured to determine a number of frames of an uncoded video frame before each encoding of the uncoded video frame of the video clip in units of a group of pictures, and adaptively determine a number of frames of a group of pictures for encoding based on the number of frames of the uncoded video frame.

In some embodiments, the determining module comprises: a first determination sub-module configured to determine a frame number of a group of pictures for encoding from a frame number of a first group of pictures and a frame number of a second group of pictures based on a video frame of a first predetermined frame number of the unencoded video frames, wherein the first predetermined frame number is a sum of the frame number of the first group of pictures and the frame number of the second group of pictures and the frame number of the first group of pictures is greater than the frame number of the second group of pictures, in response to the frame number of the unencoded video frames being not less than a first predetermined frame number; a second determination sub-module configured to determine a number of frames for the encoded group of pictures from a number of frames of the first group of pictures and a number of frames of the second group of pictures based on a video frame of a second predetermined number of frames in the unencoded video frame, in response to the number of frames of the unencoded video frame being less than the first predetermined number of frames but not less than the second predetermined number of frames, wherein the second predetermined number of frames is the same as the number of frames of the first group of pictures; a third determination submodule configured to determine a number of frames of a third group of pictures as a number of frames of a group of pictures for encoding in response to the number of frames of the unencoded video frame being less than a second predetermined number of frames, the third group of pictures being the same as the number of frames of the unencoded video frame.

In some embodiments, the first determination sub-module is configured to, in response to the number of unencoded video frames not being less than the first predetermined number of frames: select the first predetermined number of video frames starting from the first frame of the unencoded video frames; determine the encoding cost of each group of pictures decomposition when the first predetermined number of video frames are encoded with a first group of pictures decomposition and a second group of pictures decomposition, respectively, wherein the first group of pictures decomposition comprises a first group of pictures followed by a second group of pictures, and the second group of pictures decomposition comprises a second group of pictures followed by a first group of pictures; in response to the encoding cost of the first group of pictures decomposition being less than the encoding cost of the second group of pictures decomposition, determine the encoding cost when the first predetermined number of video frames are encoded with a third group of pictures decomposition, wherein the third group of pictures decomposition comprises a plurality of second groups of pictures arranged in sequence; in response to the encoding cost of the first group of pictures decomposition being less than the encoding cost of the third group of pictures decomposition, determine the number of frames of the first group of pictures as the number of frames of the group of pictures for encoding; and in response to the encoding cost of the first group of pictures decomposition not being less than the encoding cost of the second group of pictures decomposition or the encoding cost of the third group of pictures decomposition, determine the number of frames of the second group of pictures as the number of frames of the group of pictures for encoding.

According to a third aspect of the present disclosure, there is provided a computing device comprising a processor; and a memory configured to store computer-executable instructions thereon that, when executed by the processor, perform any of the methods described above.

According to a fourth aspect of the present disclosure, there is provided a computer readable storage medium storing computer executable instructions which, when executed, perform any of the methods as described above.

With the method and apparatus of the present disclosure for adaptively determining the number of frames of the group of pictures for encoding, the most suitable number of frames (that is, size) of the group of pictures for the current encoding pass can be determined, each time before the video clip is encoded in units of a group of pictures, based on the number of unencoded video frames in the video clip. The determined number of frames can thus adapt to the characteristics of the video sequence, yielding better coding performance and effectively improving compression capability during video coding without increasing coding complexity.

These and other advantages of the present disclosure will become apparent from and elucidated with reference to the embodiments described hereinafter.

Drawings

Embodiments of the present disclosure will now be described in more detail and with reference to the accompanying drawings, in which:

FIG. 1 illustrates a schematic flow diagram of a method for adaptively determining a number of frames for a coded group of pictures according to one embodiment of the present disclosure;

FIG. 2 illustrates a schematic flow diagram for adaptively determining the number of frames for a group of pictures for encoding based on the number of frames of an unencoded video frame, according to one embodiment of the present disclosure;

FIG. 3 illustrates a schematic flow diagram of a method for determining a number of frames for a group of pictures for encoding based on a first predetermined number of frames of a video frame in accordance with one embodiment of the present disclosure;

FIG. 4 illustrates a schematic flow diagram of a method for determining a number of frames for a group of pictures for encoding based on a first predetermined number of frames of a video frame in accordance with another embodiment of the present disclosure;

FIG. 5 illustrates a schematic flow diagram of a method for determining a number of frames for a group of pictures for encoding based on a second predetermined number of frames of a video frame in accordance with one embodiment of the present disclosure;

FIG. 6A illustrates a schematic diagram of a first group of pictures decomposition, a second group of pictures decomposition, and a third group of pictures decomposition, according to one embodiment of the present disclosure;

FIG. 6B illustrates a schematic diagram of a fourth group of pictures decomposition, a fifth group of pictures decomposition, according to one embodiment of the present disclosure;

FIG. 7 illustrates a schematic diagram of the order and manner in which video clips are encoded based on GOP16 according to an embodiment of the present disclosure;

FIG. 8 illustrates a schematic diagram of the order and manner in which video clips are encoded based on GOP4 according to an embodiment of the present disclosure;

FIG. 9 illustrates an exemplary block diagram of an apparatus for adaptively determining a number of frames for an encoded group of pictures according to one embodiment of the present disclosure; and

FIG. 10 illustrates an example system including an example computing device that represents one or more systems and/or devices that can implement the various techniques described herein.

Detailed Description

The following description provides specific details for a thorough understanding and implementation of various embodiments of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these details. In some instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the disclosure. The terminology used in the present disclosure is to be understood in its broadest reasonable manner, even though it is being used in conjunction with a particular embodiment of the present disclosure.

First, some terms involved in the embodiments of the present application are explained for ease of understanding by those skilled in the art:

GOP: a set of consecutive pictures is generally referred to in video coding as a group of pictures (GOP);

GOPN: a group of pictures having N frames, N being a positive integer;

I frame: a key frame; it is an intra-predicted frame, and decoding it requires only the data of that frame;

P frame: an inter-predicted frame; a P frame represents the difference between the current frame and a previous I frame or P frame, and during decoding the difference defined by the current frame is superimposed on the previously buffered picture to generate the final picture;

B frame: an inter-predicted, bidirectional difference frame; a B frame records the differences between the current frame and both the previous and subsequent frames, so decoding a B frame requires not only the previously buffered picture but also the subsequent picture, and the final picture is obtained by superimposing the previous and subsequent pictures with the data of the current frame.

Fig. 1 illustrates a schematic flow diagram of a method 100 for adaptively determining a number of frames for a coded group of pictures according to one embodiment of the present disclosure. As shown in fig. 1, the method comprises the following steps.

In step 101, a video clip to be encoded is received, the video clip comprising a plurality of video frames. The video clip is to be encoded according to the number of frames of the group of pictures used for encoding, which defines the GOP structure with which the video clip is encoded. For example, if the number of frames of the group of pictures used for encoding is 16, then 16 frames of the video clip are encoded as one GOP. It should be noted that a video clip described in this disclosure may be part or all of a video sequence. The first frame of the video sequence is always an I-frame, and the video sequence may include one or more I-frames. In other words, the first frame of a video clip described in embodiments of the present disclosure may or may not be an I-frame; it may be a P-frame, for example.

In step 102, each time before unencoded video frames of the video clip are encoded in units of a group of pictures, the number of unencoded video frames is determined, and the number of frames of the group of pictures for encoding is adaptively determined based on the number of unencoded video frames. In other words, each time a group of pictures is to be encoded from the unencoded video frames, the group-of-pictures size appropriate for the current encoding pass is determined anew.

By the method for adaptively determining the number of frames of a group of pictures for encoding described in the embodiments of the present disclosure, the number of frames or the size of the most suitable group of pictures for the present encoding can be determined based on the number of frames of uncoded video frames in a video clip before encoding the video clip each time in units of group of pictures, so that the determined number of frames of a group of pictures for encoding can adapt to the characteristics of a video sequence.
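The repeated re-determination in step 102 can be pictured as a simple loop. The following is a minimal sketch under stated assumptions, not the encoder described in this disclosure: `encode_clip`, `choose_gop_size`, and `toy_policy` are hypothetical names, and the policy shown is a placeholder rather than the cost-based selection described below.

```python
from typing import Callable, List


def encode_clip(num_frames: int,
                choose_gop_size: Callable[[int], int]) -> List[int]:
    """Return the GOP sizes used to cover `num_frames` frames.

    `choose_gop_size(remaining)` is a hypothetical callback standing in for
    step 102: it receives the count of not-yet-encoded frames and returns
    the GOP size for the next encoding pass.
    """
    gop_sizes: List[int] = []
    remaining = num_frames
    while remaining > 0:
        size = choose_gop_size(remaining)  # re-determined before every GOP
        if not 1 <= size <= remaining:
            raise ValueError(f"invalid GOP size {size} for {remaining} frames")
        # A real encoder would encode `size` frames as one GOP here.
        gop_sizes.append(size)
        remaining -= size
    return gop_sizes


def toy_policy(remaining: int) -> int:
    """Placeholder policy (assumption): GOP16 while possible, else the rest."""
    return 16 if remaining >= 16 else remaining


print(encode_clip(40, toy_policy))  # [16, 16, 8]
```

The point of the sketch is only the control flow: the GOP size is an output of each iteration, not a constant fixed before the loop.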

Fig. 2 is a flowchart of a method 200 for adaptively determining the number of frames for a group of pictures for encoding based on the number of frames of an uncoded video frame, according to one embodiment of the present disclosure. The method 200 may be used to implement the step of adaptively determining the number of frames for a group of pictures for encoding based on the number of frames of an uncoded video frame in step 102 described with reference to fig. 1. The method comprises the following steps.

In step 201, in response to the number of frames of the unencoded video frames not being less than the first predetermined number of frames, the number of frames of the group of pictures for encoding is determined from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on the first predetermined number of frames of the unencoded video frames. The first predetermined number of frames is a sum of a number of frames of the first group of pictures and a number of frames of the second group of pictures and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures. As an example, the number of frames of the first picture group may be 16, the number of frames of the second picture group may be 4, and the first predetermined number of frames may be 20.

In step 202, in response to the number of unencoded video frames being less than the first predetermined number of frames but not less than the second predetermined number of frames, the number of frames of the group of pictures for encoding is determined from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on the second predetermined number of frames of the unencoded video frames. The second predetermined number of frames is the same as the number of frames of the first group of pictures, and may be 16, for example.

In step 203, in response to the number of unencoded video frames being less than the second predetermined number of frames, the number of frames of a third group of pictures is determined as the number of frames of the group of pictures for encoding, the number of frames of the third group of pictures being the same as the number of unencoded video frames. That is, when the number of unencoded video frames is smaller than the second predetermined number of frames, the unencoded video frames are encoded directly as one third group of pictures.

By the method for adaptively determining the number of frames for a coded group of pictures based on the number of frames of an uncoded video described in the embodiments of the present disclosure, the number of frames for a coded group of pictures may be determined using different determination manners based on the number of frames of an uncoded video frame, thereby enabling the determined number of frames for a coded group of pictures to better adapt to the characteristics of a video sequence.
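The three-way branch of steps 201 to 203 can be sketched as follows. This is an illustrative sketch only: the function name `select_branch` and the returned `(step, window)` tuple are assumptions introduced here, while the defaults 16 and 4 are the example values given above.

```python
from typing import Tuple


def select_branch(remaining: int, n1: int = 16, n2: int = 4) -> Tuple[int, int]:
    """Pick which of steps 201/202/203 applies.

    Returns (step, window), where `window` is the number of leading
    unencoded frames the step works on. For step 203 the window equals
    the GOP size itself (the "third group of pictures").
    """
    if n1 <= n2:
        raise ValueError("the first GOP must be larger than the second GOP")
    if remaining < 1:
        raise ValueError("no unencoded frames left")
    first_predetermined = n1 + n2   # e.g. 16 + 4 = 20
    second_predetermined = n1       # e.g. 16
    if remaining >= first_predetermined:
        return 201, first_predetermined
    if remaining >= second_predetermined:
        return 202, second_predetermined
    return 203, remaining


for frames in (25, 20, 19, 16, 15):
    print(frames, select_branch(frames))
```

With the example values, 20 or more remaining frames lead to step 201, 16 to 19 frames lead to step 202, and fewer than 16 frames are encoded directly as one GOP in step 203.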

As an example, fig. 3 illustrates a schematic flow diagram of a method 300 for determining the number of frames of a group of pictures for encoding from the number of frames of a first group of pictures and the number of frames of a second group of pictures based on a first predetermined number of frames of video frames in accordance with one embodiment of the present disclosure. The method 300 includes the following steps.

In step 301, a first predetermined number of frames of video are selected starting from a first frame of the unencoded video frames. The first frame of the unencoded video frames may refer to a first frame of the unencoded video frames ordered in display order. As described above, the first predetermined frame number is a sum of the frame number of the first picture group and the frame number of the second picture group and the frame number of the first picture group is greater than the frame number of the second picture group.

In step 302, the encoding cost of each picture group decomposition when encoding the video frame of the first predetermined frame number in a first picture group decomposition, a second picture group decomposition, and a third picture group decomposition is determined, respectively. The first picture group decomposition comprises a first picture group and a second picture group which are sequentially arranged, the second picture group decomposition comprises a second picture group and a first picture group which are sequentially arranged, and the third picture group decomposition comprises a plurality of second picture groups which are sequentially arranged.

Taking as an example the case in which the number of frames of the first group of pictures is 16, the number of frames of the second group of pictures is 4, and the first predetermined number of frames is 20, fig. 6A illustrates schematic diagrams of the first group of pictures decomposition C1, the second group of pictures decomposition C2, and the third group of pictures decomposition C3. As shown in fig. 6A, the first group of pictures decomposition C1 includes a first group of pictures GOP16 followed by a second group of pictures GOP4, the second group of pictures decomposition C2 includes a second group of pictures GOP4 followed by a first group of pictures GOP16, and the third group of pictures decomposition C3 includes five second groups of pictures GOP4 arranged in sequence. Note that GOPN (N being a positive integer) herein denotes a group of pictures having N frames.
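The three decompositions can be written down as lists of GOP sizes. The sketch below is illustrative; the names `decompositions` and `gop_ranges` are hypothetical, and the requirement that the window divide evenly into second groups of pictures is an assumption that holds for the 16/4 example.

```python
from typing import Dict, List, Tuple


def decompositions(n1: int = 16, n2: int = 4) -> Dict[str, List[int]]:
    """Build C1, C2 and C3 for a window of n1 + n2 frames."""
    window = n1 + n2
    if window % n2 != 0:
        # Assumption: C3 tiles the window exactly with GOPs of n2 frames.
        raise ValueError("window is not a multiple of the second GOP size")
    return {
        "C1": [n1, n2],                 # GOP16 then GOP4
        "C2": [n2, n1],                 # GOP4 then GOP16
        "C3": [n2] * (window // n2),    # five GOP4
    }


def gop_ranges(sizes: List[int]) -> List[Tuple[int, int]]:
    """Convert GOP sizes to half-open frame index ranges [start, end)."""
    ranges, start = [], 0
    for size in sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


for name, sizes in decompositions().items():
    print(name, sizes, gop_ranges(sizes))
```

Every decomposition covers the same 20 frames; only the GOP boundaries differ, which is what makes their encoding costs comparable.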

In some embodiments, the encoding cost for each picture group decomposition may be determined by summing the encoding costs for all of the picture groups in the each picture group decomposition. As an example, the coding cost for each group of pictures may be the sum of the coding costs of all video frames in said each group of pictures.

In some embodiments, the coding cost of each video frame is the sum of the coding costs of all coding units in said each video frame. The coding unit refers to a coding block used in coding each video frame. Alternatively, the coding cost of each coding unit may be determined as follows:

$$J = \mathrm{SAD} + \lambda \cdot R$$

where J is the coding cost of the current coding unit, SAD is the sum of absolute differences between the current coding unit and its prediction unit, R is the estimated number of bits needed to code the current coding unit using the selected prediction mode, and λ is the Lagrangian multiplier.
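A small numeric sketch may help make the cost hierarchy concrete: the coding-unit cost is SAD plus λ times R, and the costs are summed upward over frames, groups of pictures, and the decomposition. All names below (`sad`, `cu_cost`, `total_cost`) and the sample values are assumptions for illustration; how the prediction unit and the bit estimate are obtained is outside the scope of this sketch.

```python
from typing import List, Sequence, Union

Nested = Union[float, Sequence["Nested"]]


def sad(block: List[List[int]], prediction: List[List[int]]) -> int:
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, prediction)
               for a, b in zip(row_a, row_b))


def cu_cost(block: List[List[int]], prediction: List[List[int]],
            bits: float, lam: float) -> float:
    """Coding-unit cost: SAD plus lambda times the estimated bits R."""
    return sad(block, prediction) + lam * bits


def total_cost(nested: Nested) -> float:
    """Sum a nested structure: decomposition -> GOPs -> frames -> CU costs."""
    if isinstance(nested, (int, float)):
        return float(nested)
    return sum(total_cost(item) for item in nested)


# Toy 2x2 coding unit: SAD = 1 + 0 + 2 + 0 = 3, so J = 3 + 2.0 * 5 = 13.0
j = cu_cost([[10, 12], [8, 9]], [[9, 12], [10, 9]], bits=5, lam=2.0)
print(j)  # 13.0

# Toy decomposition with two GOPs; the first has two frames, the second one.
decomposition = [[[j, 7.0], [5.0]], [[2.5]]]
print(total_cost(decomposition))  # 27.5
```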

It should be noted that the encoding cost of each of all video frames may be determined entirely before the video clip begins to be encoded, to avoid recalculation each time required by subsequent encoding stages, thereby saving system resources.

In step 303, it is determined whether the coding cost of the first picture group decomposition is minimum among the coding costs of the first picture group decomposition, the second picture group decomposition, and the third picture group decomposition.

In step 304, in response to determining that the encoding cost of the first group of pictures decomposition is minimal, the number of frames of the first group of pictures is determined as the number of frames of the group of pictures for encoding. Otherwise, the number of frames of the second group of pictures is determined as the number of frames of the group of pictures for encoding at step 305.

In the embodiment of the disclosure, by determining the encoding cost of the first picture group decomposition, the second picture group decomposition and the third picture group decomposition respectively, the frame number of the picture group for encoding can be determined efficiently, so that the determined frame number of the picture group for encoding can be better adaptive to the characteristics of the video sequence.
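Steps 302 to 305 can be sketched as a comparison of three costs. This is a sketch under assumptions rather than the claimed implementation: `cost_of` is a hypothetical callback that returns the encoding cost of a decomposition given as a tuple of GOP sizes, and treating a tie as "not minimal" for the first decomposition is a choice made here, since the text does not specify tie handling.

```python
from typing import Callable, Tuple

CostFn = Callable[[Tuple[int, ...]], float]


def choose_gop_size_fig3(cost_of: CostFn, n1: int = 16, n2: int = 4) -> int:
    """Return n1 if decomposition C1 is the cheapest, otherwise n2."""
    c1 = (n1, n2)                       # first GOP, then second GOP
    c2 = (n2, n1)                       # second GOP, then first GOP
    c3 = (n2,) * ((n1 + n2) // n2)      # only second GOPs
    j1, j2, j3 = cost_of(c1), cost_of(c2), cost_of(c3)   # step 302
    if j1 < j2 and j1 < j3:             # step 303 (strict: assumption)
        return n1                       # step 304
    return n2                           # step 305


# Toy costs, invented purely for illustration.
toy_costs = {(16, 4): 100.0, (4, 16): 120.0, (4, 4, 4, 4, 4): 130.0}
print(choose_gop_size_fig3(toy_costs.__getitem__))  # 16
```

Note that both C2 and C3 start with a second group of pictures, so either of them winning leads to the same decision for the GOP about to be encoded.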

As an example, fig. 4 illustrates a schematic flow diagram of a method 400 for determining the number of frames of a group of pictures for encoding from the number of frames of a first group of pictures and the number of frames of a second group of pictures based on a first predetermined number of frames of video frames according to another embodiment of the present disclosure. The method 400 includes the following steps.

In step 401, a first predetermined number of frames of video are selected starting from a first frame of the unencoded video frames. The first frame of the unencoded video frames may refer to a first frame of the unencoded video frames ordered in display order. As described above, the first predetermined frame number is a sum of the frame number of the first picture group and the frame number of the second picture group and the frame number of the first picture group is greater than the frame number of the second picture group.

In step 402, the encoding cost of each picture group decomposition when encoding the first predetermined number of frames of video in a first picture group decomposition, a second picture group decomposition, respectively, is determined. As described above, the first group of pictures decomposition includes the first group of pictures and the second group of pictures arranged in sequence, and the second group of pictures decomposition includes the second group of pictures and the first group of pictures arranged in sequence.

In step 403, it is determined whether the coding cost of the first picture group decomposition is less than the coding cost of the second picture group decomposition. In step 404, in response to determining that the encoding cost of the first picture group decomposition is less than the encoding cost of the second picture group decomposition, the encoding cost when encoding the video frames of the first predetermined number of frames with a third picture group decomposition is determined, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence.

In step 405, it is determined whether the coding cost of the first picture group decomposition is less than the coding cost of the third picture group decomposition. In response to determining that the encoding cost of the first picture group decomposition is less than the encoding cost of the third picture group decomposition, the number of frames of the first picture group is determined as the number of frames of the picture group for encoding in step 406.

In step 407, in response to determining in step 403 that the encoding cost of the first picture group decomposition is not less than the encoding cost of the second picture group decomposition or in step 405 that the encoding cost of the first picture group decomposition is not less than the encoding cost of the third picture group decomposition, the number of frames of the second picture group is determined as the number of frames of the picture group for encoding.

In the present embodiment, similar to the above description, taking the example that the number of frames of the first picture group is 16, the number of frames of the second picture group is 4, and the first predetermined number of frames is 20, a schematic diagram of the first picture group decomposition C1, the second picture group decomposition C2, and the third picture group decomposition C3 is shown in fig. 6A.

It should be noted that the method of calculating the coding cost of a picture group decomposition in this embodiment is the same as the method of calculating the coding cost described with reference to method 300, and will not be described in detail here.

With the described embodiment, the coding cost of the third picture group decomposition needs to be determined, and compared with the coding cost of the first picture group decomposition, only when the coding cost of the first picture group decomposition has been determined to be less than the coding cost of the second picture group decomposition. In other words, when the encoding cost of the first picture group decomposition is not less than the encoding cost of the second picture group decomposition, the number of frames of the second picture group can be directly determined as the number of frames of the picture group for encoding, and it is unnecessary to determine the encoding cost of encoding the video frames of the first predetermined number of frames with the third picture group decomposition. This reduces the amount of computation of the method and can improve the efficiency of adaptively determining the number of frames of the picture group for encoding.
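
The early exit of method 400 can be sketched in Python as shown below. This is a minimal illustration under stated assumptions: `cost` is a hypothetical caller-supplied function standing in for the coding cost calculation, decompositions are represented as lists of picture group sizes, and the defaults 16 and 4 follow the example values in the text.

```python
def choose_gop_size_lazy(cost, n1=16, n2=4):
    """Steps 402-407 of method 400, evaluating C3 only when it is needed."""
    total = n1 + n2
    c1 = [n1, n2]   # first decomposition: first group, then second group
    c2 = [n2, n1]   # second decomposition: second group, then first group

    # Step 402: cost of the first and second decompositions.
    cost_c1 = cost(c1)
    cost_c2 = cost(c2)

    # Step 403: if C1 is not cheaper than C2, go straight to step 407.
    if not cost_c1 < cost_c2:
        return n2   # the cost of C3 is never computed on this path

    # Step 404: only now evaluate the third decomposition.
    # Assumes total is divisible by n2 (true for 16 + 4 and 4).
    cost_c3 = cost([n2] * (total // n2))

    # Steps 405-407: C1 wins only if it is also cheaper than C3.
    return n1 if cost_c1 < cost_c3 else n2


if __name__ == "__main__":
    calls = []

    def toy_cost(decomposition):
        # Hypothetical cost values, for demonstration only.
        calls.append(tuple(decomposition))
        return {(16, 4): 12.0, (4, 16): 10.0}.get(tuple(decomposition), 11.0)

    print(choose_gop_size_lazy(toy_cost), len(calls))
```

In the usage example the first decomposition is more expensive than the second, so the cost function is invoked twice rather than three times.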

Fig. 5 illustrates a schematic flow diagram of a method 500 of determining a number of frames for a coded group of pictures from a number of frames of a first group of pictures and a number of frames of a second group of pictures based on a second predetermined number of frames different from the first predetermined number of frames described above, according to one embodiment of the disclosure. As shown in fig. 5, the method 500 includes the following steps.

In step 501, a second predetermined number of frames of video are selected starting from a first frame of the unencoded video frames. The first frame of the unencoded video frames may refer to a first frame of the unencoded video frames ordered in display order. As described above, the second predetermined frame number is the same as the frame number of the first picture group.

In step 502, the encoding cost of each picture group decomposition when encoding the video frames of the second predetermined number of frames with a fourth picture group decomposition and a fifth picture group decomposition, respectively, is determined. The fourth picture group decomposition includes one first picture group (GOP16), and the fifth picture group decomposition includes a plurality of second picture groups.

Taking as an example the case where the number of frames of the first picture group is 16, the number of frames of the second picture group is 4, and the second predetermined number of frames is 16, fig. 6B illustrates a schematic diagram of the fourth picture group decomposition C4 and the fifth picture group decomposition C5. As shown in fig. 6B, the fourth picture group decomposition C4 includes one first picture group GOP16, and the fifth picture group decomposition C5 includes four second picture groups GOP4 arranged in sequence. Note that GOPN (N is a positive integer) herein denotes a group of pictures having N frames.

In step 503, it is determined whether the coding cost of the fourth picture group decomposition is less than the coding cost of the fifth picture group decomposition.

In response to determining that the encoding cost of the fourth picture group decomposition is less than the encoding cost of the fifth picture group decomposition, the number of frames of the first picture group is determined to be the number of frames of the picture group for encoding in step 504, otherwise the number of frames of the second picture group is determined to be the number of frames of the picture group for encoding in step 505.

In the embodiments of the present disclosure, by determining the encoding costs of the fourth picture group decomposition and the fifth picture group decomposition, respectively, the number of frames of the picture group for encoding can be efficiently determined in the case where the number of frames of the uncoded video frame is smaller than the first predetermined number of frames but not smaller than the second predetermined number of frames, thereby enabling the determined number of frames of the picture group for encoding to better adapt to the characteristics of the video sequence.
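
A minimal Python sketch of steps 502 to 505 follows. The `cost` callable is a hypothetical stand-in for the coding cost calculation, the list-of-sizes representation of a decomposition is an assumption for illustration, and the default sizes 16 and 4 are the example values used above.

```python
def choose_gop_size_two_way(cost, n1=16, n2=4):
    """Steps 502-505 of method 500: compare C4 (one first group) with C5."""
    c4 = [n1]                 # fourth decomposition: a single first picture group
    # Fifth decomposition: only second picture groups.
    # Assumes n1 is divisible by n2 (16 / 4 gives four GOP4).
    c5 = [n2] * (n1 // n2)

    # Step 503: strict comparison, as in the text; a tie selects n2.
    if cost(c4) < cost(c5):
        return n1             # step 504
    return n2                 # step 505


if __name__ == "__main__":
    # Hypothetical cost values, for demonstration only.
    table = {(16,): 7.5, (4, 4, 4, 4): 8.0}
    print(choose_gop_size_two_way(lambda c: table[tuple(c)]))
```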

In some embodiments, in the case where the number of frames of a group of pictures for encoding is determined, the received video clip may be encoded in units of a group of pictures, and optionally an encoding method when encoding based on the number of frames of the group of pictures for encoding may also be determined.

As an example, fig. 7 illustrates the order and manner in which a video clip is encoded based on GOP16. As shown in fig. 7, the original order of the frames in the video clip is the same as the display order of the frames shown in the figure, and the display order in fig. 7 is, from left to right, the 0th frame, the 1st frame, …, and the 16th frame. It should be noted that GOP16 covers the 1st to 16th frames shown in the figure, and that the 0th frame shown is a frame of the previous video segment rather than a frame in GOP16. The 0th frame may be an I frame or a P frame, which is not limiting.

As shown in fig. 7, the first frame in the video clip has encoding order 5 and is encoded as a B frame. The second frame in the video clip has encoding order 4 and is also encoded as a B frame, and so on. The frames formed by encoding are I frames, B frames, and P frames, denoted by the symbols "I", "B", and "P" in fig. 7, respectively. The arrows in fig. 7 indicate the reference relationships at the time of encoding: the frame at the start of an arrow is the frame being encoded, and the frame pointed to by the arrow is the frame referenced when encoding it. For example, the first frame of the video clip (encoding order 5) is predicted at the time of encoding by referring to the 0th frame (I frame) and the B frame formed by encoding the second frame of the video clip (encoding order 4), whereas the 16th frame of the video clip (encoding order 1) is predicted by referring only to the 0th frame (I frame).

As an example, fig. 8 illustrates the order and manner in which a video clip is encoded based on GOP4. As shown in fig. 8, the original order of the frames in the video clip is the same as the display order of the frames shown in the figure, and the display order in fig. 8 is, from left to right, the 0th frame, the 1st frame, the 2nd frame, the 3rd frame, and the 4th frame. It should be noted that GOP4 covers the 1st to 4th frames shown in the figure, and that the 0th frame shown is typically a frame of the previous video clip rather than a frame in GOP4. The 0th frame may be an I frame or a P frame, which is not limiting.

As shown in fig. 8, the first frame in the video clip has encoding order 3 and is encoded as a B frame. The second frame in the video clip has encoding order 2 and is also encoded as a B frame, and so on. The frames formed by encoding are I frames, B frames, and P frames, denoted by the symbols "I", "B", and "P" in fig. 8, respectively. The arrows in fig. 8 indicate the reference relationships at the time of encoding: the frame at the start of an arrow is the frame being encoded, and the frame pointed to by the arrow is the frame referenced when encoding it. For example, the first frame of the video clip (encoding order 3) is predicted at the time of encoding by referring to the 0th frame (I frame) and the B frame formed by encoding the second frame of the video clip (encoding order 2), whereas the 4th frame of the video clip (encoding order 1) is predicted by referring only to the 0th frame (I frame).
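
The encoding orders quoted for figs. 7 and 8 (frame 16 coded first, then frame 2 at order 4 and frame 1 at order 5 for GOP16; frame 4 coded first, then frame 2 at order 2 and frame 1 at order 3 for GOP4) are consistent with a recursive midpoint ordering in which the last frame of the group is coded first and each remaining interval is bisected, left half before right half. The Python sketch below reproduces those quoted values under that assumption. The full ordering of the other frames is not stated in the text, so the scheme and the function name are illustrative assumptions rather than the ordering actually drawn in the figures.

```python
def midpoint_encoding_order(gop_size):
    """Map display index (1..gop_size) to encoding order (1..gop_size).

    Assumed scheme: code the last frame first, then repeatedly code the
    midpoint of each interval between two already-coded frames, visiting
    the left half before the right half. Frame 0 belongs to the previous
    group and is treated as already coded.
    """
    order = {gop_size: 1}   # the last frame references only frame 0
    counter = 1

    def visit(lo, hi):
        nonlocal counter
        if hi - lo < 2:     # no frame strictly between lo and hi
            return
        mid = (lo + hi) // 2
        counter += 1
        order[mid] = counter   # mid can reference the coded frames lo and hi
        visit(lo, mid)
        visit(mid, hi)

    visit(0, gop_size)
    return order


if __name__ == "__main__":
    for size in (16, 4):
        order = midpoint_encoding_order(size)
        print(size, [order[i] for i in range(1, size + 1)])
```

Note that under this assumed scheme frame 1 would reference frames 0 and 2, which matches the reference relationship described for the first frame in both figures.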

Fig. 9 illustrates an exemplary block diagram of an apparatus 900 for adaptively determining a number of frames for an encoded group of pictures according to one embodiment of the present disclosure. As shown in fig. 9, the apparatus 900 includes a receiving module 910 and a determining module 920.

The receiving module 910 is configured to receive a video clip to be encoded in the number of frames of the group of pictures for encoding, the video clip comprising a plurality of video frames. The number of frames of the group of pictures for encoding defines the GOP structure with which the video clip is encoded. For example, if the number of frames of the group of pictures for encoding is 16, then 16 frames in the video clip are encoded as one GOP.

The determining module 920 is configured to determine a number of frames of an uncoded video frame before each encoding of the uncoded video frame of the video clip in units of picture groups, and adaptively determine a number of frames for an encoded picture group based on the number of frames of the uncoded video frame. In other words, each time a group of pictures in an uncoded video frame is encoded, the determining module 920 needs to re-determine the size of the group of pictures applicable to the present encoding.

In some embodiments, as shown in fig. 9, the determining module 920 may include a first determining sub-module 921, a second determining sub-module 922, and a third determining sub-module 923.

The first determining submodule 921 is configured to, in response to the number of frames of the unencoded video frames being not less than a first predetermined number of frames, determine the number of frames of the picture group for encoding from the number of frames of a first picture group and the number of frames of a second picture group, based on the video frames of the first predetermined number of frames among the unencoded video frames. The first predetermined number of frames is the sum of the number of frames of the first picture group and the number of frames of the second picture group, and the number of frames of the first picture group is greater than the number of frames of the second picture group. As an example, the number of frames of the first picture group may be 16, the number of frames of the second picture group may be 4, and the first predetermined number of frames may be 20.

The second determination submodule 922 is configured to, in response to the number of frames of the unencoded video frames being less than the first predetermined number of frames but not less than a second predetermined number of frames, determine the number of frames of the picture group for encoding from the number of frames of the first picture group and the number of frames of the second picture group, based on the video frames of the second predetermined number of frames among the unencoded video frames. The second predetermined number of frames is the same as the number of frames of the first picture group, and may be 16, for example.

The third determination submodule 923 is configured to, in response to the number of frames of the unencoded video frames being less than the second predetermined number of frames, determine the number of frames of a third picture group as the number of frames of the picture group for encoding, the number of frames of the third picture group being the same as the number of frames of the unencoded video frames. That is, in the case where the number of frames of the unencoded video frames is less than the second predetermined number of frames, the third determination submodule 923 directly treats the unencoded video frames as one third picture group for encoding.
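
The division of work among the three submodules can be sketched as a loop that re-decides the picture group size before each group is encoded. The Python below is an illustrative sketch only: `plan_gop_sizes`, `pick_with_first_window`, and `pick_with_second_window` are hypothetical names, the two callables stand in for the cost-based selections of submodules 921 and 922 (each is expected to return either the first or the second picture group size), and the defaults 16 and 4 are the example values from the text.

```python
def plan_gop_sizes(total_frames, pick_with_first_window,
                   pick_with_second_window, n1=16, n2=4):
    """Return the list of picture group sizes chosen for a clip.

    pick_with_first_window(start)  -> n1 or n2, used when at least
                                      n1 + n2 frames remain (submodule 921)
    pick_with_second_window(start) -> n1 or n2, used when at least n1 but
                                      fewer than n1 + n2 remain (submodule 922)
    """
    first_threshold = n1 + n2    # first predetermined number of frames
    second_threshold = n1        # second predetermined number of frames
    sizes = []
    start = 0                    # index of the first unencoded frame
    while start < total_frames:
        remaining = total_frames - start
        if remaining >= first_threshold:
            size = pick_with_first_window(start)
        elif remaining >= second_threshold:
            size = pick_with_second_window(start)
        else:
            size = remaining     # submodule 923: one third picture group
        sizes.append(size)
        start += size
    return sizes


if __name__ == "__main__":
    # Hypothetical pickers that always prefer the larger group.
    print(plan_gop_sizes(40, lambda s: 16, lambda s: 16))
```

Because the remaining frame count shrinks after every group, the branch taken can change from one iteration to the next, which is why the decision is repeated before each group rather than made once per clip.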

In some embodiments, the first determining submodule 921 is configured to, in response to the number of frames of the unencoded video frames being not less than the first predetermined number of frames: select the video frames of the first predetermined number of frames starting from the first frame of the unencoded video frames; determine the coding cost of each picture group decomposition when the video frames of the first predetermined number of frames are encoded with a first picture group decomposition and a second picture group decomposition, respectively, wherein the first picture group decomposition comprises a first picture group and a second picture group arranged in sequence, and the second picture group decomposition comprises a second picture group and a first picture group arranged in sequence; in response to the coding cost of the first picture group decomposition being less than the coding cost of the second picture group decomposition, determine the coding cost when the video frames of the first predetermined number of frames are encoded with a third picture group decomposition, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence; in response to the coding cost of the first picture group decomposition being less than the coding cost of the third picture group decomposition, determine the number of frames of the first picture group as the number of frames of the picture group for encoding; and in response to the coding cost of the first picture group decomposition being not less than the coding cost of the second picture group decomposition or not less than the coding cost of the third picture group decomposition, determine the number of frames of the second picture group as the number of frames of the picture group for encoding.

In some embodiments, the second determination submodule 922 is configured to, in response to the number of frames of the unencoded video frames being less than the first predetermined number of frames but not less than the second predetermined number of frames: select the video frames of the second predetermined number of frames starting from the first frame of the unencoded video frames; determine the coding cost of each picture group decomposition when the video frames of the second predetermined number of frames are encoded with a fourth picture group decomposition and a fifth picture group decomposition, respectively, wherein the fourth picture group decomposition comprises one first picture group and the fifth picture group decomposition comprises a plurality of second picture groups; and in response to the coding cost of the fourth picture group decomposition being less than the coding cost of the fifth picture group decomposition, determine the number of frames of the first picture group as the number of frames of the picture group for encoding, and otherwise determine the number of frames of the second picture group as the number of frames of the picture group for encoding.

Fig. 10 illustrates an example system 1000 that includes an example computing device 1010 representing one or more systems and/or devices that can implement the various techniques described herein. Computing device 1010 may be, for example, a server of a service provider, a device associated with a server, a system-on-chip, and/or any other suitable computing device or computing system. The apparatus 900 for adaptively determining the number of frames for an encoded group of pictures described above with respect to fig. 9 may take the form of the computing device 1010. Alternatively, the apparatus 900 may be implemented as a computer program in the form of a GOP determination application 1016.

The example computing device 1010 as illustrated includes a processing system 1011, one or more computer-readable media 1012, and one or more I/O interfaces 1013 communicatively coupled to each other. Although not shown, computing device 1010 may also include a system bus or other data and command transfer system that couples the various components to one another. A system bus may include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures. Various other examples are also contemplated, such as control and data lines.

The processing system 1011 represents functionality that performs one or more operations using hardware. Thus, the processing system 1011 is illustrated as including hardware elements 1014 that may be configured as processors, functional blocks, and the like. This may include implementation in hardware as application specific integrated circuits or other logic devices formed using one or more semiconductors. The hardware elements 1014 are not limited by the materials from which they are formed or the processing mechanisms employed therein. For example, the processor may be comprised of semiconductor(s) and/or transistors (e.g., electronic Integrated Circuits (ICs)). In such a context, the processor-executable instructions may be electronically-executable instructions.

Computer-readable medium 1012 is illustrated as including memory/storage 1015. Memory/storage 1015 represents memory/storage capacity associated with one or more computer-readable media. Memory/storage 1015 may include volatile media such as Random Access Memory (RAM) and/or nonvolatile media such as Read Only Memory (ROM), flash memory, optical disks, magnetic disks, and so forth. The memory/storage 1015 may include fixed media (e.g., RAM, ROM, a fixed hard drive, etc.) and removable media (e.g., flash memory, a removable hard drive, an optical disk, and so forth). The computer-readable medium 1012 may be configured in a variety of other ways as described further below.

The one or more I/O interfaces 1013 represent functions that allow a user to input commands and information to the computing device 1010, and optionally also allow information to be presented to the user and/or other components or devices using various input/output devices. Examples of input devices include keyboards, cursor control devices (e.g., mice), microphones (e.g., for voice input), scanners, touch functions (e.g., capacitive or other sensors configured to detect physical touches), cameras (e.g., motion that does not involve touches may be detected as gestures using visible or invisible wavelengths such as infrared frequencies), and so forth. Examples of output devices include a display device (e.g., a display or projector), speakers, a printer, a network card, a haptic response device, and so forth. Accordingly, computing device 1010 may be configured in a variety of ways to support user interaction as described further below.

The computing device 1010 also includes a GOP determination application 1016. The GOP determination application 1016 may be, for example, a software instance of the apparatus 900 described with respect to fig. 9 for adaptively determining the number of frames for an encoded group of pictures, and may implement the techniques described herein in combination with other elements in the computing device 1010.

Various techniques may be described herein in the general context of software, hardware elements, or program modules. Generally, these modules include routines, programs, objects, elements, components, data structures, and so forth that perform particular tasks or implement particular abstract data types. The terms "module," "functionality," and "component" as used herein generally represent software, firmware, hardware, or a combination thereof. The features of the techniques described herein are platform-independent, meaning that the techniques may be implemented on a variety of computing platforms having a variety of processors.

An implementation of the described modules and techniques may be stored on or transmitted across some form of computer readable media. Computer-readable media can include a variety of media that are accessible by computing device 1010. By way of example, and not limitation, computer readable media may comprise "computer readable storage media" and "computer readable signal media".

"Computer-readable storage medium" refers to a medium and/or device capable of persistently storing information, and/or a tangible storage device, as opposed to mere signal transmission, carrier waves, or signals per se. Thus, computer-readable storage media refers to non-signal bearing media. Computer-readable storage media include hardware such as volatile and nonvolatile, removable and non-removable media and/or storage devices implemented in a method or technology suitable for storage of information such as computer-readable instructions, data structures, program modules, logic elements/circuits, or other data. Examples of computer-readable storage media may include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, hard disks, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or other storage devices, tangible media, or articles of manufacture suitable for storing the desired information and accessible by a computer.

"Computer-readable signal medium" refers to a signal-bearing medium configured to transmit instructions to the hardware of computing device 1010, such as via a network. Signal media may typically embody computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave, data signal, or other transport mechanism. Signal media also include any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media.

As previously described, the hardware elements 1014 and computer-readable media 1012 represent instructions, modules, programmable device logic, and/or fixed device logic implemented in hardware that may be used in some embodiments to implement at least some aspects of the techniques described herein. The hardware elements may include integrated circuits or components of a system on a chip, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs), and other implementations in silicon or other hardware devices. In this context, a hardware element may serve as a processing device that performs program tasks defined by instructions, modules, and/or logic embodied by the hardware element, as well as a hardware device that stores instructions for execution, such as the previously described computer-readable storage media.

Combinations of the foregoing may also be used to implement the various techniques and modules described herein. Thus, software, hardware, or program modules and other program modules may be implemented as one or more instructions and/or logic embodied on some form of computer readable storage medium and/or by one or more hardware elements 1014. Computing device 1010 may be configured to implement particular instructions and/or functions corresponding to software and/or hardware modules. Thus, for example, a module that is executable by the computing device 1010 as software may be implemented at least in part in hardware, by using the computer-readable storage medium and/or the hardware elements 1014 of the processing system. The instructions and/or functions may be executable/operable by one or more articles of manufacture (e.g., one or more computing devices 1010 and/or processing systems 1011) to implement the techniques, modules, and examples described herein.

In various implementations, the computing device 1010 may take on a variety of different configurations. For example, computing device 1010 may be implemented as a computer-like device including a personal computer, desktop computer, multi-screen computer, laptop computer, netbook, and the like. Computing device 1010 may also be implemented as a mobile appliance-like device including mobile devices such as mobile telephones, portable music players, portable gaming devices, tablet computers, multi-screen computers, and the like. Computing device 1010 may also be implemented as a television-like device that includes devices having or connected to generally larger screens in casual viewing environments. Such devices include televisions, set-top boxes, gaming machines, and the like.

The techniques described herein may be supported by these various configurations of computing device 1010 and are not limited to the specific examples of techniques described herein. The functionality may also be implemented in whole or in part on the "cloud" 1020 through the use of a distributed system, such as through the platform 1022 described below.

Cloud 1020 includes and/or is representative of a platform 1022 for resources 1024. The platform 1022 abstracts underlying functionality of hardware (e.g., servers) and software resources of the cloud 1020. The resources 1024 may include applications and/or data that can be used when executing computer processing on servers remote from the computing device 1010. The resources 1024 may also include services provided over the internet and/or over subscriber networks such as cellular or Wi-Fi networks.

The platform 1022 may abstract resources and functions to connect the computing device 1010 with other computing devices. The platform 1022 may also be used to abstract the scaling of resources, so as to provide a level of scale corresponding to the encountered demand for the resources 1024 implemented via the platform 1022. Thus, in an interconnected device embodiment, implementation of the functionality described herein may be distributed throughout the system 1000. For example, the functionality may be implemented in part on the computing device 1010 and in part by the platform 1022 that abstracts the functionality of the cloud 1020.

It should be understood that for clarity, embodiments of the present disclosure have been described with reference to different functional units. However, it will be apparent that the functionality of each functional unit may be implemented in a single unit, in a plurality of units or as part of other functional units without departing from the present disclosure. For example, functionality illustrated to be performed by a single unit may be performed by multiple different units. Thus, references to specific functional units are only to be seen as references to suitable units for providing the described functionality rather than indicative of a strict logical or physical structure or organization. Thus, the present disclosure may be implemented in a single unit or may be physically and functionally distributed between different units and circuits.

It will be understood that, although the terms first, second, third, etc. may be used herein to describe various devices, elements, components or sections, these devices, elements, components or sections should not be limited by these terms. These terms are only used to distinguish one device, element, component, or section from another device, element, component, or section.

Although the present disclosure has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present disclosure is limited only by the appended claims. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. The order of features in the claims does not imply any specific order in which the features must be worked. Furthermore, in the claims, the word "comprising" does not exclude other elements, and the indefinite article "a" or "an" does not exclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.

Claims

1. A method for adaptively determining a number of frames for a coded group of pictures, comprising:

receiving a video clip to be encoded with a number of frames of a group of pictures for encoding, the video clip comprising a plurality of video frames;

before each encoding of unencoded video frames of the video clip in units of a group of pictures, determining a number of frames of the unencoded video frames, and adaptively determining the number of frames of the group of pictures for encoding based on the number of frames of the unencoded video frames;

wherein adaptively determining the number of frames for the encoded group of pictures based on the number of frames of the unencoded video frame comprises:

determining, in response to the number of frames of the unencoded video frames not being less than a first predetermined number of frames, a number of frames of a group of pictures for encoding from a number of frames of a first group of pictures and a number of frames of a second group of pictures based on video frames of the first predetermined number of frames among the unencoded video frames, wherein the first predetermined number of frames is a sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures, and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures;

determining, in response to the number of frames of the unencoded video frames being less than the first predetermined number of frames but not less than a second predetermined number of frames, a number of frames of a group of pictures for encoding from a number of frames of a first group of pictures and a number of frames of a second group of pictures based on a video frame of the second predetermined number of frames of the unencoded video frames, wherein the second predetermined number of frames is the same as the number of frames of the first group of pictures;

in response to the number of frames of the unencoded video frames being less than the second predetermined number of frames, determining a number of frames of a third group of pictures as the number of frames of the group of pictures for encoding, the number of frames of the third group of pictures being the same as the number of frames of the unencoded video frames.
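The three-way branching recited in claim 1 can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the claimed implementation: the names `choose_gop_size`, `plan_gops`, and the callback `pick_from_window` are hypothetical, and the callback merely stands in for whatever cost comparison is applied to the leading window of unencoded frames.

```python
from typing import Callable, List


def choose_gop_size(
    remaining: int,
    first_gop: int,
    second_gop: int,
    pick_from_window: Callable[[int], int],
) -> int:
    """Return the number of frames to use for the next group of pictures.

    remaining        -- number of unencoded video frames
    first_gop        -- number of frames of the first (larger) group of pictures
    second_gop       -- number of frames of the second (smaller) group of pictures
    pick_from_window -- hypothetical callback: given a window length, returns
                        either first_gop or second_gop
    """
    if first_gop <= second_gop:
        raise ValueError("first_gop must be greater than second_gop")
    first_threshold = first_gop + second_gop  # first predetermined number of frames
    second_threshold = first_gop              # second predetermined number of frames
    if remaining >= first_threshold:
        return pick_from_window(first_threshold)
    if remaining >= second_threshold:
        return pick_from_window(second_threshold)
    # Fewer frames than one first group: encode all of them as a third group.
    return remaining


def plan_gops(
    total_frames: int,
    first_gop: int,
    second_gop: int,
    pick_from_window: Callable[[int], int],
) -> List[int]:
    """Repeat the decision before each group until no unencoded frame is left."""
    sizes, remaining = [], total_frames
    while remaining > 0:
        size = choose_gop_size(remaining, first_gop, second_gop, pick_from_window)
        sizes.append(size)
        remaining -= size
    return sizes
```

The decision is re-evaluated before every group, so the tail of a clip shorter than the first group of pictures is always consumed in one final group.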

2. The method of claim 1, wherein determining the number of frames for the encoded group of pictures from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on a first predetermined number of frames of the unencoded video frames comprises:

selecting video frames of the first predetermined number of frames starting from a first frame of the unencoded video frames;

determining a coding cost of each picture group decomposition when the video frames of the first predetermined number of frames are encoded with a first picture group decomposition, a second picture group decomposition, and a third picture group decomposition, respectively, wherein the first picture group decomposition comprises a first picture group and a second picture group arranged in sequence, the second picture group decomposition comprises a second picture group and a first picture group arranged in sequence, and the third picture group decomposition comprises a plurality of second picture groups arranged in sequence;

in response to the coding cost of the first picture group decomposition being the minimum, determining the number of frames of the first picture group as the number of frames of the group of pictures for encoding;

in response to the coding cost of the second picture group decomposition or the coding cost of the third picture group decomposition being the minimum, determining the number of frames of the second picture group as the number of frames of the group of pictures for encoding.
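As a hedged illustration of the comparison in claim 2, the following Python sketch evaluates all three decompositions and picks a group size. The function names, the `gop_cost` callback, and the toy cost model are assumptions made for illustration. The sketch also assumes that the window length is a multiple of the second group size and that ties are resolved in favor of the second group size; the claim text does not specify either point.

```python
from typing import Callable, List

# Hypothetical callback: (start_frame, num_frames) -> coding cost of that group.
GopCost = Callable[[int, int], float]


def decomposition_cost(sizes: List[int], gop_cost: GopCost) -> float:
    """Sum the costs of consecutive groups whose sizes are given in order."""
    total, start = 0.0, 0
    for n in sizes:
        total += gop_cost(start, n)
        start += n
    return total


def pick_gop_exhaustive(first_gop: int, second_gop: int, gop_cost: GopCost) -> int:
    """Evaluate all three decompositions of the window and pick a group size."""
    window = first_gop + second_gop
    if window % second_gop != 0:
        raise ValueError("sketch assumes the window is a multiple of second_gop")
    c1 = decomposition_cost([first_gop, second_gop], gop_cost)   # first decomposition
    c2 = decomposition_cost([second_gop, first_gop], gop_cost)   # second decomposition
    c3 = decomposition_cost([second_gop] * (window // second_gop), gop_cost)  # third
    # Assumption: the first group size wins only when strictly cheapest.
    return first_gop if (c1 < c2 and c1 < c3) else second_gop


def toy_cost(cut: int) -> GopCost:
    """Toy model (not from the source): overhead 10, 1 per frame, +50 if a
    scene cut falls strictly inside the group."""
    def cost(start: int, n: int) -> float:
        return 10.0 + n + (50.0 if start < cut < start + n else 0.0)
    return cost


print(pick_gop_exhaustive(16, 4, toy_cost(cut=16)))  # prints 16
print(pick_gop_exhaustive(16, 4, toy_cost(cut=4)))   # prints 4
```

In the toy model, a scene cut at frame 16 coincides with the boundary of the first decomposition, which makes it cheapest, whereas a cut at frame 4 favors starting with the smaller group.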

3. The method of claim 1, wherein determining the number of frames for the encoded group of pictures from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on a first predetermined number of frames of the unencoded video frames comprises:

determining a coding cost of each picture group decomposition when the video frames of the first predetermined number of frames are encoded with a first picture group decomposition and a second picture group decomposition, respectively, wherein the first picture group decomposition comprises a first picture group and a second picture group arranged in sequence, and the second picture group decomposition comprises a second picture group and a first picture group arranged in sequence;

in response to the coding cost of the first picture group decomposition being less than the coding cost of the second picture group decomposition, determining a coding cost of a third picture group decomposition when the video frames of the first predetermined number of frames are encoded with the third picture group decomposition, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence;

in response to the coding cost of the first picture group decomposition being less than the coding cost of the third picture group decomposition, determining the number of frames of the first picture group as the number of frames of the group of pictures for encoding;

in response to the coding cost of the first picture group decomposition not being less than the coding cost of the second picture group decomposition or the coding cost of the third picture group decomposition, determining the number of frames of the second picture group as the number of frames of the group of pictures for encoding.
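The staged comparison of claim 3 differs from an exhaustive comparison in that the third decomposition is evaluated only when the first decomposition has already beaten the second. A minimal Python sketch follows. The function names and the `gop_cost` callback are hypothetical, and the sketch assumes the window length is a multiple of the second group size.

```python
from typing import Callable, List

# Hypothetical callback: (start_frame, num_frames) -> coding cost of that group.
GopCost = Callable[[int, int], float]


def decomposition_cost(sizes: List[int], gop_cost: GopCost) -> float:
    """Sum the costs of consecutive groups whose sizes are given in order."""
    total, start = 0.0, 0
    for n in sizes:
        total += gop_cost(start, n)
        start += n
    return total


def pick_gop_early_exit(first_gop: int, second_gop: int, gop_cost: GopCost) -> int:
    """Staged comparison: skip the third decomposition when it cannot matter."""
    window = first_gop + second_gop
    if window % second_gop != 0:
        raise ValueError("sketch assumes the window is a multiple of second_gop")
    c1 = decomposition_cost([first_gop, second_gop], gop_cost)
    c2 = decomposition_cost([second_gop, first_gop], gop_cost)
    if not c1 < c2:
        # The first decomposition already lost: the second group size is chosen
        # without evaluating the third decomposition.
        return second_gop
    c3 = decomposition_cost([second_gop] * (window // second_gop), gop_cost)
    return first_gop if c1 < c3 else second_gop
```

Under these assumptions, the staged form returns the same group size as comparing all three costs with strict inequalities, while avoiding the cost evaluation of the third decomposition whenever the first decomposition is not cheaper than the second.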

4. The method of claim 1, wherein determining the number of frames for the encoded group of pictures from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on a second predetermined number of frames of the unencoded video frames comprises:

selecting video frames of the second predetermined number of frames starting from a first frame of the unencoded video frames;

determining a coding cost of each picture group decomposition when the video frames of the second predetermined number of frames are encoded with a fourth picture group decomposition and a fifth picture group decomposition, respectively, wherein the fourth picture group decomposition comprises a first picture group and the fifth picture group decomposition comprises a plurality of second picture groups;

in response to the coding cost of the fourth picture group decomposition being less than the coding cost of the fifth picture group decomposition, determining the number of frames of the first picture group as the number of frames of the group of pictures for encoding, and otherwise determining the number of frames of the second picture group as the number of frames of the group of pictures for encoding.
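For the shorter window of claim 4, only two decompositions are compared. The Python sketch below is illustrative only: `pick_gop_short_window` and the `gop_cost` callback are hypothetical names, and the sketch assumes the first group size is a multiple of the second so that the fifth decomposition tiles the window exactly.

```python
from typing import Callable

# Hypothetical callback: (start_frame, num_frames) -> coding cost of that group.
GopCost = Callable[[int, int], float]


def pick_gop_short_window(first_gop: int, second_gop: int, gop_cost: GopCost) -> int:
    """Compare one first group against several second groups over the same window."""
    if first_gop % second_gop != 0:
        raise ValueError("sketch assumes first_gop is a multiple of second_gop")
    # Fourth decomposition: a single first group covering the whole window.
    c4 = gop_cost(0, first_gop)
    # Fifth decomposition: consecutive second groups covering the same window.
    c5 = sum(gop_cost(start, second_gop) for start in range(0, first_gop, second_gop))
    return first_gop if c4 < c5 else second_gop
```

A cost model that penalizes a scene change inside a group would push the result toward the smaller group size, while a static scene would favor the single larger group.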

5. The method of any of claims 2-4, wherein determining the coding cost for each picture group decomposition comprises summing the coding costs for all picture groups in the each picture group decomposition.

6. The method of claim 5, wherein the coding cost of each picture group of all the picture groups is a sum of the coding costs of all video frames in that picture group.

7. The method of claim 6, wherein the coding cost of each video frame of all video frames is the sum of the coding costs of all coding units in said each video frame, the coding cost of each coding unit being determined by the following formula:

J = SAD + λ*R;
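The cost hierarchy of claims 5 to 7 (coding unit, video frame, picture group, decomposition) can be sketched as nested sums. The Python example below is illustrative only. It assumes the conventional rate-distortion reading of the formula, in which SAD is the sum of absolute differences between a coding unit and its prediction, λ is a Lagrange multiplier, and R is an estimated number of bits. The `CodingUnit` type and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CodingUnit:
    original: Sequence[int]   # sample values of the coding unit
    predicted: Sequence[int]  # sample values of its prediction
    bits: float               # assumed estimate of R for this unit


def sad(original: Sequence[int], predicted: Sequence[int]) -> int:
    """Sum of absolute differences between two equally sized sample blocks."""
    return sum(abs(a - b) for a, b in zip(original, predicted))


def cu_cost(cu: CodingUnit, lam: float) -> float:
    """J = SAD + lambda * R for a single coding unit."""
    return sad(cu.original, cu.predicted) + lam * cu.bits


def frame_cost(frame: List[CodingUnit], lam: float) -> float:
    """Cost of a frame: sum over its coding units."""
    return sum(cu_cost(cu, lam) for cu in frame)


def gop_cost(gop: List[List[CodingUnit]], lam: float) -> float:
    """Cost of a picture group: sum over its frames."""
    return sum(frame_cost(frame, lam) for frame in gop)


def decomposition_cost(gops: List[List[List[CodingUnit]]], lam: float) -> float:
    """Cost of a decomposition: sum over its picture groups."""
    return sum(gop_cost(gop, lam) for gop in gops)


cu = CodingUnit(original=[10, 12, 9, 7], predicted=[9, 12, 11, 7], bits=8.0)
print(cu_cost(cu, lam=0.5))  # SAD = 3, so J = 3 + 0.5 * 8 = 7.0
```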

8. The method of claim 1, wherein the number of frames of the first group of pictures is 16.

9. The method of claim 1, wherein the number of frames of the second group of pictures is 4.
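With the values of claims 8 and 9, the thresholds and decompositions defined in the earlier claims take concrete sizes. The short Python sketch below only works out that arithmetic; the variable names are illustrative.

```python
FIRST_GOP = 16   # number of frames of the first group of pictures (claim 8)
SECOND_GOP = 4   # number of frames of the second group of pictures (claim 9)

first_threshold = FIRST_GOP + SECOND_GOP   # first predetermined number of frames
second_threshold = FIRST_GOP               # second predetermined number of frames

# Group sizes, in order, for each decomposition named in claims 2 to 4.
decompositions = {
    "first": [FIRST_GOP, SECOND_GOP],
    "second": [SECOND_GOP, FIRST_GOP],
    "third": [SECOND_GOP] * (first_threshold // SECOND_GOP),
    "fourth": [FIRST_GOP],
    "fifth": [SECOND_GOP] * (second_threshold // SECOND_GOP),
}

for name, sizes in decompositions.items():
    print(name, sizes, sum(sizes))
```

The first three decompositions each cover 20 frames and the last two each cover 16 frames, so the costs being compared always refer to the same set of frames.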

10. An apparatus for adaptively determining a number of frames for a coded group of pictures, comprising:

a receiving module configured to receive a video clip to be encoded with a number of frames of a group of pictures for encoding, the video clip comprising a plurality of video frames;

a determining module configured to, before each encoding of unencoded video frames of the video clip in units of a group of pictures, determine a number of frames of the unencoded video frames, and adaptively determine the number of frames of the group of pictures for encoding based on the number of frames of the unencoded video frames;

wherein the determining module comprises:

a first determination sub-module configured to, in response to the number of frames of the unencoded video frames being not less than a first predetermined number of frames, determine a number of frames of a group of pictures for encoding from a number of frames of a first group of pictures and a number of frames of a second group of pictures based on video frames of the first predetermined number of frames among the unencoded video frames, wherein the first predetermined number of frames is a sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures, and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures;

a second determination sub-module configured to, in response to the number of frames of the unencoded video frames being less than the first predetermined number of frames but not less than a second predetermined number of frames, determine the number of frames of the group of pictures for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures based on video frames of the second predetermined number of frames among the unencoded video frames, wherein the second predetermined number of frames is the same as the number of frames of the first group of pictures;

a third determination sub-module configured to, in response to the number of frames of the unencoded video frames being less than the second predetermined number of frames, determine a number of frames of a third group of pictures as the number of frames of the group of pictures for encoding, the number of frames of the third group of pictures being the same as the number of frames of the unencoded video frames.

11. The apparatus of claim 10, wherein the first determination sub-module is configured to, in response to the number of frames of the unencoded video frames being not less than the first predetermined number of frames:

determine, in response to the coding cost of the first picture group decomposition being less than the coding cost of the second picture group decomposition, the coding cost of a third picture group decomposition when the video frames of the first predetermined number of frames are encoded with the third picture group decomposition, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence;

12. A computing device, comprising:

a processor; and

a memory configured to store computer-executable instructions thereon that, when executed by the processor, perform the method of any of claims 1-9.

13. A computer readable storage medium storing computer executable instructions which, when executed by a processor, perform the method of any of claims 1-9.