CN112788340A - Method and apparatus for adaptively determining frame number of picture group for encoding

Info

Publication number: CN112788340A
Application number: CN201911082343.6A
Authority: CN
Inventors: 张涛
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-11-07
Filing date: 2019-11-07
Publication date: 2021-05-11

Abstract

A method for adaptively determining the number of frames of a group of pictures used for encoding is described herein, comprising: receiving a video clip to be encoded with the number of frames of the group of pictures used for encoding, the video clip comprising a plurality of video frames; and, each time before unencoded video frames of the video clip are encoded in units of groups of pictures, determining the number of unencoded video frames and adaptively determining the number of frames of the group of pictures used for encoding based on the number of unencoded video frames.

Description

Method and apparatus for adaptively determining frame number of picture group for encoding

Technical Field

The present disclosure relates to the field of video processing, and in particular, to a method and apparatus for adaptively determining a frame number for a group of pictures to encode.

Background

When a video sequence is encoded, it is typically compressed by reducing spatial and temporal redundancy through prediction in the spatial and/or temporal domain. In practice, various algorithms are used to reduce the amount of data, with I-frames, P-frames, and B-frames being the most commonly used. An I-frame is a key frame and is an intra-predicted frame. P-frames and B-frames are both inter-predicted frames; the difference is that P-frame prediction predicts the value of the current block from only one prediction block, whereas B-frame prediction allows the current block to be predicted by interpolating between two previously encoded blocks.

In video coding, a set of several consecutive frames is generally called a group of pictures (GOP). Encoding is performed in units of GOPs. The size of the GOP determines the basic hierarchical structure and the reference relationships used in encoding, and has a large influence on encoding performance. Existing schemes typically employ a fixed-size GOP, such as GOP16, in which each GOP contains 16 video frames.

However, video sequences are typically composed of both complex and simple video segments. For complex video segments, selecting a smaller GOP allows nearby frames within the GOP to be fully used as references, giving better prediction. For simpler video segments, selecting a larger GOP enables a reasonable quality allocation among the frames at each level and hence better coding performance. The fixed-size GOP adopted in existing schemes therefore cannot adapt to the characteristics of the video sequence and cannot achieve better performance.

Disclosure of Invention

In view of the above, the present disclosure provides methods and apparatus for adaptively determining a frame number for a group of pictures to encode, which desirably overcome some or all of the above-mentioned deficiencies and possibly others.

According to a first aspect of the present disclosure, there is provided a method for adaptively determining the number of frames of a group of pictures used for encoding, comprising: receiving a video clip to be encoded with the number of frames of the group of pictures used for encoding, the video clip comprising a plurality of video frames; and, each time before unencoded video frames of the video clip are encoded in units of groups of pictures, determining the number of unencoded video frames and adaptively determining the number of frames of the group of pictures used for encoding based on the number of unencoded video frames.

In some embodiments, adaptively determining the number of frames of the group of pictures used for encoding based on the number of unencoded video frames comprises: in response to the number of unencoded video frames being not less than a first predetermined number of frames, determining the number of frames of the group of pictures used for encoding from the number of frames of a first group of pictures and the number of frames of a second group of pictures, based on the first predetermined number of video frames among the unencoded video frames, wherein the first predetermined number of frames is the sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures, and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures; in response to the number of unencoded video frames being less than the first predetermined number of frames but not less than a second predetermined number of frames, determining the number of frames of the group of pictures used for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures, based on the second predetermined number of video frames among the unencoded video frames, wherein the second predetermined number of frames is equal to the number of frames of the first group of pictures; and in response to the number of unencoded video frames being less than the second predetermined number of frames, determining the number of frames of a third group of pictures as the number of frames of the group of pictures used for encoding, the number of frames of the third group of pictures being equal to the number of unencoded video frames.

In some embodiments, determining the number of frames of the group of pictures used for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures, based on the first predetermined number of video frames among the unencoded video frames, comprises: selecting the first predetermined number of video frames starting from the first frame of the unencoded video frames; determining, for each of a first group of pictures decomposition, a second group of pictures decomposition, and a third group of pictures decomposition, the coding cost of encoding the first predetermined number of video frames with that decomposition, wherein each decomposition partitions the first predetermined number of video frames into groups of pictures, the first group of pictures decomposition comprises a first group of pictures followed by a second group of pictures, the second group of pictures decomposition comprises a second group of pictures followed by a first group of pictures, and the third group of pictures decomposition comprises a plurality of second groups of pictures arranged in sequence; in response to the coding cost of the first group of pictures decomposition being the minimum, determining the number of frames of the first group of pictures as the number of frames of the group of pictures used for encoding; and in response to the coding cost of the second group of pictures decomposition or the coding cost of the third group of pictures decomposition being the minimum, determining the number of frames of the second group of pictures as the number of frames of the group of pictures used for encoding.

In some embodiments, determining the number of frames of the group of pictures used for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures, based on the first predetermined number of video frames among the unencoded video frames, comprises: selecting the first predetermined number of video frames starting from the first frame of the unencoded video frames; determining, for each of a first group of pictures decomposition and a second group of pictures decomposition, the coding cost of encoding the first predetermined number of video frames with that decomposition, wherein the first group of pictures decomposition comprises a first group of pictures followed by a second group of pictures, and the second group of pictures decomposition comprises a second group of pictures followed by a first group of pictures; in response to the coding cost of the first group of pictures decomposition being less than the coding cost of the second group of pictures decomposition, determining the coding cost of encoding the first predetermined number of video frames with a third group of pictures decomposition, wherein the third group of pictures decomposition comprises a plurality of second groups of pictures arranged in sequence; in response to the coding cost of the first group of pictures decomposition being less than the coding cost of the third group of pictures decomposition, determining the number of frames of the first group of pictures as the number of frames of the group of pictures used for encoding; and in response to the coding cost of the first group of pictures decomposition being not less than the coding cost of the second group of pictures decomposition or not less than the coding cost of the third group of pictures decomposition, determining the number of frames of the second group of pictures as the number of frames of the group of pictures used for encoding.

In some embodiments, determining the number of frames of the group of pictures used for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures, based on the second predetermined number of video frames among the unencoded video frames, comprises: selecting the second predetermined number of video frames starting from the first frame of the unencoded video frames; determining, for each of a fourth group of pictures decomposition and a fifth group of pictures decomposition, the coding cost of encoding the second predetermined number of video frames with that decomposition, wherein the fourth group of pictures decomposition comprises one first group of pictures and the fifth group of pictures decomposition comprises a plurality of second groups of pictures; and in response to the coding cost of the fourth group of pictures decomposition being less than the coding cost of the fifth group of pictures decomposition, determining the number of frames of the first group of pictures as the number of frames of the group of pictures used for encoding, and otherwise determining the number of frames of the second group of pictures as the number of frames of the group of pictures used for encoding.
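By way of non-limiting illustration, the comparison on the second predetermined number of frames might be sketched in Python as follows. The function names are hypothetical, the sizes 16 and 4 are the example values given in this disclosure, and the sketch assumes that the first group of pictures size is an exact multiple of the second so that the fifth decomposition tiles the window.

```python
from typing import List, Tuple


def fourth_and_fifth_decompositions(
    first_gop: int = 16, second_gop: int = 4
) -> Tuple[List[int], List[int]]:
    """Return the GOP sizes making up the fourth and fifth decompositions.

    Assumption: first_gop is a larger exact multiple of second_gop,
    as with the example values 16 and 4.
    """
    if first_gop <= second_gop or first_gop % second_gop != 0:
        raise ValueError("first_gop must be a larger multiple of second_gop")
    fourth = [first_gop]                              # one first GOP
    fifth = [second_gop] * (first_gop // second_gop)  # several second GOPs
    return fourth, fifth


def choose_on_second_window(
    cost_fourth: float,
    cost_fifth: float,
    first_gop: int = 16,
    second_gop: int = 4,
) -> int:
    """Pick the first GOP size only if the fourth decomposition is strictly cheaper."""
    return first_gop if cost_fourth < cost_fifth else second_gop
```

With equal costs the sketch falls back to the second group of pictures, which follows from the "less than, otherwise" wording above.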

In some embodiments, determining the coding cost of each group of pictures decomposition comprises summing the coding costs of all groups of pictures in said each group of pictures decomposition.

In some embodiments, the coding cost of each group of pictures is the sum of the coding costs of all video frames in that group of pictures.

In some embodiments, the coding cost of each video frame is the sum of the coding costs of all coding units in that video frame, and the coding cost of each coding unit is determined by the following formula:

$$J = \mathrm{SAD} + \lambda \cdot R$$

where J is the coding cost of the current coding unit, SAD is the sum of absolute differences between the current coding unit and its prediction unit, R is the estimated number of bits needed to encode the current coding unit using the selected prediction mode, and λ is the Lagrange multiplier.

In some embodiments, the number of frames of the first group of pictures is 16.

In some embodiments, the number of frames of the second group of pictures is 4.

According to a second aspect of the present disclosure, there is provided an apparatus for adaptively determining the number of frames of a group of pictures used for encoding, comprising: a receiving module configured to receive a video clip to be encoded with the number of frames of the group of pictures used for encoding, the video clip comprising a plurality of video frames; and a determining module configured to, each time before unencoded video frames of the video clip are encoded in units of groups of pictures, determine the number of unencoded video frames and adaptively determine the number of frames of the group of pictures used for encoding based on the number of unencoded video frames.

In some embodiments, the determining module comprises: a first determining sub-module configured to, in response to the number of unencoded video frames being not less than a first predetermined number of frames, determine the number of frames of the group of pictures used for encoding from the number of frames of a first group of pictures and the number of frames of a second group of pictures, based on the first predetermined number of video frames among the unencoded video frames, wherein the first predetermined number of frames is the sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures, and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures; a second determining sub-module configured to, in response to the number of unencoded video frames being less than the first predetermined number of frames but not less than a second predetermined number of frames, determine the number of frames of the group of pictures used for encoding from the number of frames of the first group of pictures and the number of frames of the second group of pictures, based on the second predetermined number of video frames among the unencoded video frames, wherein the second predetermined number of frames is equal to the number of frames of the first group of pictures; and a third determining sub-module configured to, in response to the number of unencoded video frames being less than the second predetermined number of frames, determine the number of frames of a third group of pictures as the number of frames of the group of pictures used for encoding, the number of frames of the third group of pictures being equal to the number of unencoded video frames.

In some embodiments, the first determining sub-module is configured to, in response to the number of unencoded video frames being not less than the first predetermined number of frames: select the first predetermined number of video frames starting from the first frame of the unencoded video frames; determine, for each of a first group of pictures decomposition and a second group of pictures decomposition, the coding cost of encoding the first predetermined number of video frames with that decomposition, wherein the first group of pictures decomposition comprises a first group of pictures followed by a second group of pictures, and the second group of pictures decomposition comprises a second group of pictures followed by a first group of pictures; in response to the coding cost of the first group of pictures decomposition being less than the coding cost of the second group of pictures decomposition, determine the coding cost of encoding the first predetermined number of video frames with a third group of pictures decomposition, wherein the third group of pictures decomposition comprises a plurality of second groups of pictures arranged in sequence; in response to the coding cost of the first group of pictures decomposition being less than the coding cost of the third group of pictures decomposition, determine the number of frames of the first group of pictures as the number of frames of the group of pictures used for encoding; and in response to the coding cost of the first group of pictures decomposition being not less than the coding cost of the second group of pictures decomposition or not less than the coding cost of the third group of pictures decomposition, determine the number of frames of the second group of pictures as the number of frames of the group of pictures used for encoding.

According to a third aspect of the present disclosure, there is provided a computing device comprising a processor; and a memory configured to have computer-executable instructions stored thereon that, when executed by the processor, perform any of the methods described above.

According to a fourth aspect of the present disclosure, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed, perform any of the methods described above.

With the method and apparatus for adaptively determining the number of frames of the group of pictures used for encoding according to the present disclosure, the most suitable number of frames (that is, the size) of the group of pictures for the current encoding can be determined, each time before the video segment is encoded in units of groups of pictures, based on the number of unencoded video frames in the video segment. The determined number of frames can therefore adapt to the characteristics of the video sequence, which yields better coding performance and effectively improves compression during video coding without increasing coding complexity.

These and other advantages of the present disclosure will be apparent from, and elucidated with reference to, the embodiments described hereinafter.

Drawings

Embodiments of the present disclosure will now be described in more detail and with reference to the accompanying drawings, in which:

FIG. 1 illustrates a schematic flow diagram of a method for adaptively determining a frame number for a group of pictures for encoding in accordance with one embodiment of the present disclosure;

FIG. 2 illustrates a schematic flow diagram for adaptively determining a frame number for a group of pictures to encode based on a frame number of an unencoded video frame, according to one embodiment of the present disclosure;

FIG. 3 illustrates a schematic flow diagram of a method for determining a frame number for a group of pictures to encode based on a first predetermined number of frames of video according to one embodiment of the present disclosure;

FIG. 4 illustrates a schematic flow diagram of a method for determining a frame number for a group of pictures to encode based on a first predetermined number of frames of video in accordance with another embodiment of the present disclosure;

FIG. 5 illustrates a schematic flow diagram of a method for determining a frame number for a group of pictures to encode based on a second predetermined number of frames of video according to one embodiment of the present disclosure;

FIG. 6A illustrates a schematic diagram of a first picture group decomposition, a second picture group decomposition, and a third picture group decomposition according to an embodiment of the present disclosure;

FIG. 6B illustrates a schematic diagram of a fourth picture group decomposition and a fifth picture group decomposition according to an embodiment of the present disclosure;

FIG. 7 illustrates a schematic diagram of the order and manner in which video segments are encoded based on a GOP16 in accordance with one embodiment of the present disclosure;

FIG. 8 illustrates a schematic diagram of the order and manner in which video segments are encoded based on a GOP4 in accordance with one embodiment of the present disclosure;

FIG. 9 illustrates an exemplary structural block diagram of an apparatus for adaptively determining a frame number for a group of pictures to encode according to one embodiment of the present disclosure; and

FIG. 10 illustrates an example system that includes an example computing device that represents one or more systems and/or devices that may implement the various techniques described herein.

Detailed Description

The following description provides specific details for a thorough understanding and enabling description of various embodiments of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these details. In some instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the disclosure. The terminology used in the present disclosure is to be understood in its broadest reasonable manner, even though it is being used in conjunction with a particular embodiment of the present disclosure.

First, some terms used in the embodiments of the present application are explained to aid the understanding of those skilled in the art:

GOP: in video coding, a set of several consecutive frames is generally referred to as a group of pictures (GOP);

GOPN: a group of pictures having N frames, N being a positive integer;

I frame: a key frame, which is an intra-predicted frame and can be decoded using only its own data;

P frame: an inter-predicted frame that represents the difference between the current frame and a preceding I frame or P frame; during decoding, the difference defined by the frame is superimposed on the previously buffered picture to generate the final picture;

B frame: a bidirectionally inter-predicted difference frame that records the differences between the current frame and both the preceding and the following frames; to decode a B frame, both the previously buffered picture and the subsequent decoded picture are required, and the final picture is obtained by superimposing the current frame data on the preceding and following pictures.

Fig. 1 illustrates a schematic flow diagram of a method 100 for adaptively determining a frame number for a group of pictures for encoding according to one embodiment of the present disclosure. As shown in fig. 1, the method includes the following steps.

In step 101, a video segment to be encoded is received, the video segment comprising a plurality of video frames. The video segment is to be encoded with the number of frames of the group of pictures used for encoding, which defines the GOP structure for encoding the video segment. For example, if the number of frames of the group of pictures used for encoding is 16, then 16 frames of the video segment are encoded as one GOP. Note that a video segment as described in this disclosure may be part or all of a video sequence. The first frame of a video sequence is always an I-frame, and a video sequence may include one or more I-frames. In other words, the first frame of a video segment according to the embodiments of the present disclosure may or may not be an I-frame; it may be, for example, a P-frame.

In step 102, each time before unencoded video frames of the video clip are encoded in units of groups of pictures, the number of unencoded video frames is determined, and the number of frames of the group of pictures used for encoding is adaptively determined based on that number. In other words, before each group of pictures among the unencoded video frames is encoded, the size of the group of pictures suitable for the current encoding is determined anew.

With the method for adaptively determining the number of frames of the group of pictures used for encoding described in the embodiments of the present disclosure, the most suitable number of frames (that is, the size) of the group of pictures for the current encoding can be determined, each time before the video segment is encoded in units of groups of pictures, based on the number of unencoded video frames in the video segment, so that the determined number of frames can adapt to the characteristics of the video sequence.
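By way of non-limiting illustration, the repeated determination of step 102 might be sketched in Python as a loop that asks a chooser for a group of pictures size before each group is encoded. The names `plan_gops` and `choose_gop_size` are hypothetical, and the chooser is passed in as a parameter because its decision rule is the subject of the embodiments described with reference to the later figures.

```python
from typing import Callable, List


def plan_gops(total_frames: int, choose_gop_size: Callable[[int], int]) -> List[int]:
    """Return the GOP sizes chosen for a clip, one decision per GOP.

    choose_gop_size receives the number of frames still unencoded and
    returns the size of the next GOP (hypothetical callback).
    """
    if total_frames < 0:
        raise ValueError("total_frames must be non-negative")
    sizes: List[int] = []
    remaining = total_frames
    while remaining > 0:
        # Step 102: re-evaluate the GOP size before every GOP is encoded.
        size = choose_gop_size(remaining)
        if not 1 <= size <= remaining:
            raise ValueError("chooser returned an unusable GOP size")
        sizes.append(size)
        remaining -= size  # these frames now count as encoded
    return sizes
```

For instance, a placeholder chooser such as `lambda remaining: min(16, remaining)` reproduces a fixed GOP16 scheme with a shorter final group; an adaptive chooser would replace that callback.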

Fig. 2 illustrates a flow diagram of a method 200 for adaptively determining a frame number for a group of pictures to encode based on a frame number of an unencoded video frame, according to one embodiment of the present disclosure. The method 200 may be used to implement the step of adaptively determining the number of frames for a group of pictures to encode based on the number of frames of the unencoded video frame in step 102 described with reference to fig. 1. The method comprises the following steps.

In step 201, in response to the number of unencoded video frames being not less than a first predetermined number of frames, the number of frames of the group of pictures used for encoding is determined from the number of frames of a first group of pictures and the number of frames of a second group of pictures, based on the first predetermined number of video frames among the unencoded video frames. The first predetermined number of frames is the sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures, and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures. As an example, the number of frames of the first group of pictures may be 16 and the number of frames of the second group of pictures may be 4, in which case the first predetermined number of frames is 20.

In step 202, in response to the number of unencoded video frames being less than the first predetermined number of frames but not less than a second predetermined number of frames, the number of frames of the group of pictures used for encoding is determined from the number of frames of the first group of pictures and the number of frames of the second group of pictures, based on the second predetermined number of video frames among the unencoded video frames. The second predetermined number of frames is equal to the number of frames of the first group of pictures, and may be, for example, 16.

In step 203, in response to the number of unencoded video frames being less than the second predetermined number of frames, the number of frames of a third group of pictures is determined as the number of frames of the group of pictures used for encoding, the number of frames of the third group of pictures being equal to the number of unencoded video frames. That is, when the number of unencoded video frames is less than the second predetermined number of frames, all of the unencoded video frames are encoded directly as one third group of pictures.

By the method for adaptively determining the number of frames of a group of pictures for encoding based on the number of frames of an un-encoded video frame described in the embodiments of the present disclosure, the number of frames of the group of pictures for encoding can be determined using different determination manners based on the number of frames of the un-encoded video frame, thereby enabling the determined number of frames of the group of pictures for encoding to better adapt to characteristics of a video sequence.
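By way of non-limiting illustration, the three cases of steps 201 to 203 might be sketched in Python as follows, using the example values 16 and 4 given above. The constant names, the function name, and the returned labels are hypothetical.

```python
FIRST_GOP = 16                                   # example first GOP size
SECOND_GOP = 4                                   # example second GOP size
FIRST_PREDETERMINED = FIRST_GOP + SECOND_GOP     # 20, sum of the two sizes
SECOND_PREDETERMINED = FIRST_GOP                 # 16, equal to the first GOP size


def select_branch(remaining: int) -> str:
    """Name the step of method 200 that applies to `remaining` unencoded frames."""
    if remaining <= 0:
        raise ValueError("there must be at least one unencoded frame")
    if remaining >= FIRST_PREDETERMINED:
        return "step_201"   # compare decompositions of the next 20 frames
    if remaining >= SECOND_PREDETERMINED:
        return "step_202"   # compare decompositions of the next 16 frames
    return "step_203"       # encode all remaining frames as one GOP
```

With these example values, 20 or more remaining frames lead to step 201, 16 to 19 remaining frames lead to step 202, and 15 or fewer lead to step 203.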

As an example, fig. 3 illustrates a schematic flow diagram of a method 300 for determining a frame number of a picture group for encoding from a frame number of a first picture group and a frame number of a second picture group based on a video frame of a first predetermined frame number according to one embodiment of the present disclosure. The method 300 includes the following steps.

In step 301, the first predetermined number of video frames is selected starting from the first frame of the unencoded video frames. The first frame of the unencoded video frames may refer to the first of the unencoded video frames in display order. As described above, the first predetermined number of frames is the sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures, and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures.

In step 302, the coding cost of each picture group decomposition when the video frame with the first predetermined frame number is coded by the first picture group decomposition, the second picture group decomposition and the third picture group decomposition is respectively determined. The first picture group decomposition includes a first picture group and a second picture group which are sequentially arranged, the second picture group decomposition includes a second picture group and a first picture group which are sequentially arranged, and the third picture group decomposition includes a plurality of second picture groups which are sequentially arranged.

Taking the frame number of the first picture group as 16, the frame number of the second picture group as 4, and the first predetermined frame number as 20 as an example, FIG. 6A illustrates a schematic diagram of the first picture group decomposition C1, the second picture group decomposition C2, and the third picture group decomposition C3. As shown in FIG. 6A, the first picture group decomposition C1 includes a first picture group GOP16 and a second picture group GOP4 arranged in sequence, the second picture group decomposition C2 includes a second picture group GOP4 and a first picture group GOP16 arranged in sequence, and the third picture group decomposition C3 includes a plurality of second picture groups GOP4 arranged in sequence (five GOP4s cover the 20 frames). Note that, in this document, GOPN (N is a positive integer) denotes a group of pictures having N frames.
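By way of non-limiting illustration, the three decompositions might be represented in Python as lists of group of pictures sizes. The function name and the dictionary keys are hypothetical, and the sketch assumes that the window length is an exact multiple of the second group of pictures size, which holds for the example values 16 and 4.

```python
from typing import Dict, List


def first_window_decompositions(
    first_gop: int = 16, second_gop: int = 4
) -> Dict[str, List[int]]:
    """Return C1, C2 and C3 as ordered lists of GOP sizes.

    Assumption: first_gop + second_gop is a multiple of second_gop,
    so that C3 tiles the window exactly.
    """
    window = first_gop + second_gop  # the first predetermined number of frames
    if first_gop <= second_gop or window % second_gop != 0:
        raise ValueError("sizes must satisfy first > second and tile the window")
    return {
        "C1": [first_gop, second_gop],               # GOP16 then GOP4
        "C2": [second_gop, first_gop],               # GOP4 then GOP16
        "C3": [second_gop] * (window // second_gop), # GOP4 repeated
    }
```

Every list sums to the window length, so each decomposition covers the same 20 frames and their coding costs are directly comparable.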

In some embodiments, the coding cost of each group of pictures decomposition may be determined by summing the coding costs of all groups of pictures in that decomposition. As an example, the coding cost of each group of pictures may be the sum of the coding costs of all video frames in that group of pictures.

In some embodiments, the coding cost of each video frame is the sum of the coding costs of all coding units in that video frame. A coding unit refers to a coding block used when coding each video frame. Optionally, the coding cost of each coding unit may be determined as follows:

$$J = \mathrm{SAD} + \lambda \cdot R$$

where J is the coding cost of the current coding unit, SAD is the sum of absolute differences between the current coding unit and its prediction unit, R is the estimated number of bits needed to encode the current coding unit using the selected prediction mode, and λ is the Lagrange multiplier.

It should be noted that the coding costs of all video frames may be determined in full before the video segment is encoded, so that they need not be recalculated each time a subsequent encoding stage requires them, thereby saving system resources.
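By way of non-limiting illustration, the hierarchy of sums described above might be sketched in Python as follows. The sketch assumes the usual Lagrangian form J = SAD + λ·R for the coding unit cost; the type aliases and function names are hypothetical, and each coding unit is represented simply as a pair of its SAD and its estimated bit count R.

```python
from typing import Sequence, Tuple

CodingUnit = Tuple[float, float]   # (SAD, estimated bits R), hypothetical layout
Frame = Sequence[CodingUnit]
Gop = Sequence[Frame]


def coding_unit_cost(sad: float, bits: float, lam: float) -> float:
    """Assumed Lagrangian cost of one coding unit: J = SAD + lambda * R."""
    return sad + lam * bits


def frame_cost(frame: Frame, lam: float) -> float:
    """Sum of the costs of all coding units in a frame."""
    return sum(coding_unit_cost(sad, bits, lam) for sad, bits in frame)


def gop_cost(gop: Gop, lam: float) -> float:
    """Sum of the costs of all frames in a group of pictures."""
    return sum(frame_cost(frame, lam) for frame in gop)


def decomposition_cost(gops: Sequence[Gop], lam: float) -> float:
    """Sum of the costs of all groups of pictures in a decomposition."""
    return sum(gop_cost(gop, lam) for gop in gops)
```

In a real encoder the SAD and R values of a frame would depend on its reference structure; here they are treated as given inputs.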

In step 303, it is determined whether the coding cost of the first picture group decomposition is the minimum among the coding costs of the first picture group decomposition, the second picture group decomposition, and the third picture group decomposition.

In step 304, in response to determining that the coding cost of the first picture group decomposition is minimum, the frame number of the first picture group is determined as the frame number of the picture group for coding. Otherwise, the frame number of the second picture group is determined as the frame number of the picture group for encoding in step 305.

In the embodiment of the present disclosure, by determining the coding costs of the first picture group decomposition, the second picture group decomposition, and the third picture group decomposition, respectively, the frame number of the picture group used for coding can be efficiently determined, so that the determined frame number of the picture group used for coding can better adapt to the characteristics of the video sequence.
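By way of non-limiting illustration, the decision of steps 303 to 305 might be sketched in Python as follows. The function name is hypothetical, and treating a tie as "not the minimum" is an assumption of this sketch, since the text does not state how equal costs are handled.

```python
def choose_by_full_comparison(
    cost_c1: float,
    cost_c2: float,
    cost_c3: float,
    first_gop: int = 16,
    second_gop: int = 4,
) -> int:
    """Return the GOP size for the next GOP given the three decomposition costs.

    Assumption: C1 must be strictly cheaper than both C2 and C3 to win;
    any tie falls through to the second GOP size.
    """
    if cost_c1 < cost_c2 and cost_c1 < cost_c3:
        return first_gop    # step 304: C1 has the minimum cost
    return second_gop       # step 305: C2 or C3 is at least as cheap
```

Only the first group of the winning decomposition is committed to; the remaining frames are re-evaluated before the next group of pictures is encoded.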

As an example, FIG. 4 illustrates a schematic flow diagram of a method 400 for determining a frame number of a picture group for encoding from a frame number of a first picture group and a frame number of a second picture group based on a video frame of a first predetermined frame number according to another embodiment of the present disclosure. The method 400 includes the following steps.

In step 401, the first predetermined number of video frames is selected starting from the first frame of the unencoded video frames. The first frame of the unencoded video frames may refer to the first of the unencoded video frames in display order. As described above, the first predetermined number of frames is the sum of the number of frames of the first group of pictures and the number of frames of the second group of pictures, and the number of frames of the first group of pictures is greater than the number of frames of the second group of pictures.

In step 402, the coding cost of each picture group decomposition when the video frame with the first predetermined frame number is coded by the first picture group decomposition and the second picture group decomposition is determined respectively. As described above, the first group of pictures decomposition includes a first group of pictures and a second group of pictures that are sequentially arranged, and the second group of pictures decomposition includes a second group of pictures and a first group of pictures that are sequentially arranged.

In step 403, it is determined whether the coding cost of the first picture group decomposition is less than the coding cost of the second picture group decomposition. In step 404, in response to determining that the coding cost of the first picture group decomposition is less than the coding cost of the second picture group decomposition, the coding cost when the video frames of the first predetermined frame number are coded with a third picture group decomposition is determined, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence.

In step 405, it is determined whether the coding cost of the first picture group decomposition is less than the coding cost of the third picture group decomposition. In step 406, in response to determining that the coding cost of the first picture group decomposition is less than the coding cost of the third picture group decomposition, the frame number of the first picture group is determined as the frame number of the picture group for coding.

In step 407, in response to determining that the coding cost of the first picture group decomposition is not less than the coding cost of the second picture group decomposition in step 403 or determining that the coding cost of the first picture group decomposition is not less than the coding cost of the third picture group decomposition in step 405, the frame number of the second picture group is determined as the frame number of the picture group for coding.

In this embodiment, similar to the above description, schematic diagrams of the first picture group decomposition C1, the second picture group decomposition C2, and the third picture group decomposition C3 are shown in fig. 6A, taking as an example a frame number of the first picture group of 16, a frame number of the second picture group of 4, and a first predetermined frame number of 20.

It should be noted that the coding cost of the group of pictures decomposition described in the present embodiment is calculated in the same manner as the coding cost described with reference to the method 300, and is not described in detail herein.

With this embodiment, whether the coding cost of the first picture group decomposition is less than that of the third picture group decomposition needs to be determined only when the coding cost of the first picture group decomposition has been determined to be less than that of the second picture group decomposition. In other words, when the coding cost of the first picture group decomposition is not less than that of the second picture group decomposition, the frame number of the second picture group can be determined directly as the frame number of the picture group for coding, and the coding cost of coding the video frames of the first predetermined frame number with the third picture group decomposition need not be determined. This reduces the amount of computation and improves the efficiency of adaptively determining the frame number of the picture group for coding.
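The saving obtained in method 400 comes from evaluating the third picture group decomposition only when it can still affect the result. A minimal Python sketch of steps 402 to 407 is given below; the function name `choose_gop_size_lazy`, the tuple representation of a decomposition, and the caller-supplied `cost` callable are hypothetical names introduced for illustration only, and the cost computation itself is assumed to be provided elsewhere.

```python
from typing import Callable, Tuple

Decomposition = Tuple[int, ...]  # GOP sizes in display order


def choose_gop_size_lazy(
    cost: Callable[[Decomposition], float],  # hypothetical cost callable
    first: int = 16,
    second: int = 4,
) -> int:
    """Pick the GOP size, costing the third decomposition only when needed."""
    window = first + second  # first predetermined frame number
    if first <= second or window % second != 0:
        raise ValueError("sketch assumes first > second and window divisible by second")

    cost_c1 = cost((first, second))   # step 402: first decomposition
    cost_c2 = cost((second, first))   # step 402: second decomposition

    # Step 403 -> 407: C1 is not cheaper than C2, so C3 is never costed.
    if not cost_c1 < cost_c2:
        return second

    # Step 404: only now is the third decomposition costed.
    cost_c3 = cost((second,) * (window // second))

    # Steps 405-407: strict "less than" comparison, as in the description.
    return first if cost_c1 < cost_c3 else second
```

When the first comparison fails, `cost` is called twice rather than three times, which is the reduction in computation that this embodiment describes.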

Fig. 5 illustrates an exemplary flow diagram of a method 500 for determining a frame number of a group of pictures for encoding from a frame number of a first group of pictures and a frame number of a second group of pictures based on a second predetermined frame number, different from the first predetermined frame number described above, according to one embodiment of the disclosure. As shown in fig. 5, the method 500 includes the following steps.

In step 501, a second predetermined number of video frames are selected starting from the first frame in the unencoded video frames. The first frame of the unencoded video frames may refer to a first frame of the unencoded video frames that is ordered in display order. As described above, the second predetermined number of frames is the same as the number of frames of the first group of pictures.

In step 502, the coding cost of each picture group decomposition when the video frame of the second predetermined frame number is coded by the fourth picture group decomposition and the fifth picture group decomposition is determined respectively. The fourth group of pictures decomposition includes a first group of pictures (GOP 16) and the fifth group of pictures decomposition includes a plurality of second groups of pictures.

Taking as an example a frame number of the first picture group of 16, a frame number of the second picture group of 4, and a second predetermined frame number of 16, fig. 6B illustrates a schematic diagram of a fourth picture group decomposition C4 and a fifth picture group decomposition C5. As shown in fig. 6B, the fourth picture group decomposition C4 includes one first picture group GOP16, and the fifth picture group decomposition C5 includes four second picture groups GOP4 arranged in sequence. Note that, in this document, GOPN (N is a positive integer) denotes a group of pictures having N frames.

In step 503, it is determined whether the coding cost of the fourth picture group decomposition is less than the coding cost of the fifth picture group decomposition.

In step 504, in response to determining that the coding cost of the fourth picture group decomposition is less than the coding cost of the fifth picture group decomposition, the frame number of the first picture group is determined as the frame number of the picture group used for coding. Otherwise, in step 505, the frame number of the second picture group is determined as the frame number of the picture group used for coding.

In the embodiment of the present disclosure, by determining the coding costs of the fourth and fifth picture group decompositions respectively, the frame number of the picture group used for coding can be efficiently determined in the case that the frame number of the non-coded video frame is less than the first predetermined frame number but not less than the second predetermined frame number, thereby enabling the determined frame number of the picture group used for coding to better adapt to the characteristics of the video sequence.
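The comparison of method 500 can be sketched in the same style. The sketch below is illustrative only: `choose_gop_size_short`, the tuple representation of a decomposition, and the `cost` callable are hypothetical, and the sketch assumes that the frame number of the first picture group is an integer multiple of that of the second picture group, as in the 16 and 4 example.

```python
from typing import Callable, Tuple

Decomposition = Tuple[int, ...]  # GOP sizes in display order


def choose_gop_size_short(
    cost: Callable[[Decomposition], float],  # hypothetical cost callable
    first: int = 16,
    second: int = 4,
) -> int:
    """Pick the GOP size for a window of exactly `first` frames (method 500)."""
    if first <= second or first % second != 0:
        raise ValueError("sketch assumes first > second and first divisible by second")

    c4 = (first,)                       # fourth decomposition: one GOP16
    c5 = (second,) * (first // second)  # fifth decomposition: four GOP4

    # Steps 503-505: strict "less than"; a tie falls to the second GOP size.
    return first if cost(c4) < cost(c5) else second
```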

In some embodiments, in the case where the number of frames of a group of pictures used for encoding is determined, the received video segment may be encoded in units of groups of pictures, and optionally, an encoding method when encoding is performed based on the number of frames of the group of pictures used for encoding may also be determined.

As an example, fig. 7 illustrates the order and manner in which video segments are encoded based on GOP16. As shown in fig. 7, the original order of the frames in the video segment is the same as the display order of the frames shown in the figure, and the display order in fig. 7 is, from left to right, the 0th frame, the 1st frame, ..., and the 16th frame. It should be noted that GOP16 in the figure spans the 1st frame to the 16th frame; the 0th frame shown is typically a frame of a previous video segment and not a frame in GOP16. The 0th frame may be an I frame or a P frame, without limitation.

As shown in fig. 7, the first frame in the video segment has a coding order of 5 and is encoded as a B frame. The second frame in the video segment has a coding order of 4 and is encoded as a B frame, and so on. It should be noted that the symbols "I", "B" and "P" in fig. 7 indicate that the frames formed by encoding are I frames, B frames and P frames, respectively. The arrows in fig. 7 indicate the reference relationships during encoding: the frame at the start of an arrow is the frame being encoded, and the frame pointed to by the arrow is the frame it references during encoding. For example, the first frame in the video segment, whose coding order is 5, is predicted at encoding time by referring to the 0th frame (I frame) and to the B frame formed by encoding the second frame, whose coding order is 4; the 16th frame in the video segment, whose coding order is 1, is predicted at encoding time by referring only to the 0th frame (I frame).

As an example, fig. 8 illustrates the order and manner in which video segments are encoded based on GOP4. As shown in fig. 8, the original order of the frames in the video segment is the same as the display order of the frames shown in the figure, and the display order in fig. 8 is, from left to right, the 0th frame, the 1st frame, the 2nd frame, the 3rd frame and the 4th frame. It should be noted that GOP4 in the figure spans the 1st frame to the 4th frame; the 0th frame shown is typically a frame of a previous video segment and not a frame in GOP4. The 0th frame may be an I frame or a P frame, without limitation.

As shown in fig. 8, the first frame in the video segment has a coding order of 3 and is encoded as a B frame. The second frame in the video segment has a coding order of 2 and is encoded as a B frame, and so on. It should be noted that the symbols "I", "B" and "P" in fig. 8 indicate that the frames formed by encoding are I frames, B frames and P frames, respectively. The arrows in fig. 8 indicate the reference relationships during encoding: the frame at the start of an arrow is the frame being encoded, and the frame pointed to by the arrow is the frame it references during encoding. For example, the first frame in the video segment, whose coding order is 3, is predicted at encoding time by referring to the 0th frame (I frame) and to the B frame formed by encoding the second frame, whose coding order is 2; the 4th frame in the video segment, whose coding order is 1, is predicted at encoding time by referring only to the 0th frame (I frame).
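The coding orders quoted for fig. 7 and fig. 8 are consistent with a dyadic hierarchical B-frame structure in which the last frame of the group is coded first and each remaining span is then split at its midpoint, left half before right half. The Python sketch below generates such an order. It is an assumption-laden illustration: only the coding orders of the first, second and last frames are stated in this description, so the orders produced for the other frames, the function name `hierarchical_b_layout`, and the restriction to power-of-two GOP sizes are all assumptions rather than content of the figures.

```python
from typing import Dict, Tuple


def hierarchical_b_layout(
    gop_size: int,
) -> Tuple[Dict[int, int], Dict[int, Tuple[int, ...]]]:
    """Return (coding_order, references), keyed by display index 1..gop_size.

    Display index 0 is the already-encoded frame preceding the GOP.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("this sketch assumes a power-of-two GOP size")

    # The last frame of the GOP is coded first and references only frame 0.
    coding_order: Dict[int, int] = {gop_size: 1}
    references: Dict[int, Tuple[int, ...]] = {gop_size: (0,)}

    def visit(left: int, right: int) -> None:
        # Code the midpoint of (left, right) as a B frame referencing both ends.
        if right - left < 2:
            return
        mid = (left + right) // 2
        coding_order[mid] = len(coding_order) + 1
        references[mid] = (left, right)
        visit(left, mid)   # left half first (depth-first)
        visit(mid, right)

    visit(0, gop_size)
    return coding_order, references
```

For `gop_size=16` the sketch assigns coding order 1 to the 16th frame, 4 to the second frame and 5 to the first frame, with the first frame referencing the 0th and second frames; for `gop_size=4` it assigns coding orders 1, 2 and 3 to the 4th, second and first frames. These agree with the values stated for fig. 7 and fig. 8.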

Fig. 9 illustrates an exemplary structural block diagram of an apparatus 900 for adaptively determining a frame number for a group of pictures to encode according to one embodiment of the present disclosure. As shown in fig. 9, the apparatus 900 includes a receiving module 910 and a determining module 920.

The receiving module 910 is configured to receive a video clip to be encoded with the frame number of the group of pictures for encoding, the video clip comprising a plurality of video frames. The number of frames of the group of pictures used for encoding defines the GOP structure for encoding the video segment. For example, if the number of frames of a group of pictures used for encoding is 16, 16 frames in the video segment are encoded as one GOP.

The determining module 920 is configured to determine, each time before the non-encoded video frames of the video clip are encoded in units of picture groups, the frame number of the non-encoded video frames, and to adaptively determine the frame number of the picture group for encoding based on the frame number of the non-encoded video frames. In other words, before encoding a group of pictures from the non-encoded video frames, the determining module 920 needs to determine the size of the group of pictures applicable to the current encoding.

In some embodiments, as shown in fig. 9, the determining module 920 may include a first determining sub-module 921, a second determining sub-module 922, and a third determining sub-module 923.

The first determining sub-module 921 is configured to determine the frame number of the picture group for encoding from the frame number of the first picture group and the frame number of the second picture group based on the video frame of the first predetermined frame number in the non-encoded video frame in response to the frame number of the non-encoded video frame not being less than a first predetermined frame number, wherein the first predetermined frame number is a sum of the frame number of the first picture group and the frame number of the second picture group and the frame number of the first picture group is greater than the frame number of the second picture group. As an example, the frame number of the first picture group may be 16, the frame number of the second picture group may be 4, and the first predetermined frame number is 20.

The second determining sub-module 922 is configured to determine the frame number of the picture group for encoding from the frame number of the first picture group and the frame number of the second picture group based on the video frame of the second predetermined frame number in the non-encoded video frame in response to the frame number of the non-encoded video frame being less than the first predetermined frame number but not less than the second predetermined frame number. The second predetermined number of frames is the same as the number of frames of the first group of pictures, and may be, for example, 16.

The third determining sub-module 923 is configured to determine a frame number of a third group of pictures as the frame number of the group of pictures for encoding in response to the frame number of the non-encoded video frame being less than the second predetermined frame number, the frame number of the third group of pictures being the same as the frame number of the non-encoded video frame. That is, in the case that the frame number of the non-encoded video frame is less than the second predetermined frame number, the third determining sub-module 923 directly treats the non-encoded video frames as one third group of pictures for encoding.

In some embodiments, the first determining sub-module 921 is configured to, in response to the frame number of the non-encoded video frame being not less than the first predetermined frame number: select video frames of the first predetermined frame number starting from the first frame in the non-encoded video frames; determine the coding cost of each picture group decomposition when the video frames of the first predetermined frame number are coded by a first picture group decomposition and a second picture group decomposition respectively, wherein the first picture group decomposition comprises a first picture group and a second picture group arranged in sequence, and the second picture group decomposition comprises a second picture group and a first picture group arranged in sequence; in response to the coding cost of the first picture group decomposition being less than the coding cost of the second picture group decomposition, determine the coding cost when the video frames of the first predetermined frame number are coded by a third picture group decomposition, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence; in response to the coding cost of the first picture group decomposition being less than the coding cost of the third picture group decomposition, determine the frame number of the first picture group as the frame number of the picture group for encoding; and in response to the coding cost of the first picture group decomposition being not less than the coding cost of the second picture group decomposition or not less than the coding cost of the third picture group decomposition, determine the frame number of the second picture group as the frame number of the picture group for encoding.

In some embodiments, the second determining sub-module 922 is configured to, in response to the frame number of the non-encoded video frame being less than the first predetermined frame number but not less than the second predetermined frame number: select video frames of the second predetermined frame number starting from the first frame in the non-encoded video frames; determine the coding cost of each picture group decomposition when the video frames of the second predetermined frame number are coded by a fourth picture group decomposition and a fifth picture group decomposition respectively, wherein the fourth picture group decomposition comprises one first picture group, and the fifth picture group decomposition comprises a plurality of second picture groups; and in response to the coding cost of the fourth picture group decomposition being less than the coding cost of the fifth picture group decomposition, determine the frame number of the first picture group as the frame number of the picture group used for coding, and otherwise determine the frame number of the second picture group as the frame number of the picture group used for coding.

Fig. 10 illustrates an example system 1000 that includes an example computing device 1010 that represents one or more systems and/or devices that may implement the various techniques described herein. Computing device 1010 may be, for example, a server of a service provider, a device associated with a server, a system on a chip, and/or any other suitable computing device or computing system. The apparatus 900 for adaptively determining a frame number for a group of pictures to encode described above with respect to fig. 9 may take the form of a computing device 1010. Alternatively, the apparatus 900 for adaptively determining the number of frames for a group of pictures to encode may be implemented as a computer program in the form of a GOP determination application 1016.

The example computing device 1010 as illustrated includes a processing system 1011, one or more computer-readable media 1012, and one or more I/O interfaces 1013 communicatively coupled to each other. Although not shown, the computing device 1010 may also include a system bus or other data and command transfer system that couples the various components to one another. A system bus can include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures. Various other examples are also contemplated, such as control and data lines.

Processing system 1011 represents functionality that performs one or more operations using hardware. Thus, the processing system 1011 is illustrated as including hardware elements 1014 that may be configured as processors, functional blocks, and the like. This may include implementation in hardware as an application specific integrated circuit or other logic device formed using one or more semiconductors. The hardware elements 1014 are not limited by the materials from which they are formed or the processing mechanisms employed therein. For example, a processor may be comprised of semiconductor(s) and/or transistors (e.g., electronic Integrated Circuits (ICs)). In such a context, processor-executable instructions may be electronically-executable instructions.

Computer-readable medium 1012 is illustrated as including memory/storage 1015. Memory/storage 1015 represents the memory/storage capacity associated with one or more computer-readable media. Memory/storage 1015 may include volatile media (such as Random Access Memory (RAM)) and/or nonvolatile media (such as Read Only Memory (ROM), flash memory, optical disks, magnetic disks, and so forth). Memory/storage 1015 may include fixed media (e.g., RAM, ROM, a fixed hard drive, etc.) as well as removable media (e.g., flash memory, a removable hard drive, an optical disk, and so forth). The computer-readable medium 1012 may be configured in various other ways as further described below.

One or more I/O interfaces 1013 represent functionality that allows a user to enter commands and information to computing device 1010, and optionally also allows information to be presented to the user and/or other components or devices using various input/output devices. Examples of input devices include a keyboard, a cursor control device (e.g., a mouse), a microphone (e.g., for voice input), a scanner, touch functionality (e.g., capacitive or other sensors configured to detect physical touch), a camera (e.g., motion that may not involve touch may be detected as gestures using visible or invisible wavelengths such as infrared frequencies), and so forth. Examples of output devices include a display device (e.g., a monitor or projector), speakers, a printer, a network card, a haptic response device, and so forth. Thus, the computing device 1010 may be configured in various ways to support user interaction, as described further below.

The computing device 1010 also includes a GOP determination application 1016. The GOP determination application 1016 may be a software instance of the apparatus 900 for adaptively determining frame numbers for a group of pictures being encoded, such as described in fig. 9, and implement the techniques described herein in combination with other elements in the computing device 1010.

Various techniques may be described herein in the general context of software, hardware elements, or program modules. Generally, these modules include routines, programs, objects, elements, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The terms "module," "functionality," and "component" as used herein generally represent software, firmware, hardware, or a combination thereof. The features of the techniques described herein are platform-independent, meaning that the techniques may be implemented on a variety of computing platforms having a variety of processors.

An implementation of the described modules and techniques may be stored on or transmitted across some form of computer readable media. Computer readable media can include a variety of media that can be accessed by computing device 1010. By way of example, and not limitation, computer-readable media may comprise "computer-readable storage media" and "computer-readable signal media".

"computer-readable storage medium" refers to a medium and/or device, and/or a tangible storage apparatus, capable of persistently storing information, as opposed to mere signal transmission, carrier wave, or signal per se. Accordingly, computer-readable storage media refers to non-signal bearing media. Computer-readable storage media include hardware such as volatile and nonvolatile, removable and non-removable media and/or storage devices implemented in a method or technology suitable for storage of information such as computer-readable instructions, data structures, program modules, logic elements/circuits or other data. Examples of computer readable storage media may include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, hard disks, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or other storage devices, tangible media, or an article of manufacture suitable for storing the desired information and accessible by a computer.

"Computer-readable signal medium" refers to a signal-bearing medium configured to transmit instructions to the hardware of computing device 1010, such as via a network. Signal media typically embody computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave, data signal or other transport mechanism. Signal media also include any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, signal media include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.

As previously described, the hardware elements 1014 and the computer-readable medium 1012 represent instructions, modules, programmable device logic, and/or fixed device logic implemented in hardware form that may be used in some embodiments to implement at least some aspects of the techniques described herein. The hardware elements may include integrated circuits or systems-on-chips, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs), and other implementations in silicon or components of other hardware devices. In this context, a hardware element may serve as a processing device that performs program tasks defined by instructions, modules, and/or logic embodied by the hardware element, as well as a hardware device for storing instructions for execution, such as the computer-readable storage medium described previously.

Combinations of the foregoing may also be used to implement the various techniques and modules described herein. Thus, software, hardware, or program modules may be implemented as one or more instructions and/or logic embodied on some form of computer-readable storage medium and/or by one or more hardware elements 1014. The computing device 1010 may be configured to implement particular instructions and/or functions corresponding to software and/or hardware modules. Thus, a module that is executable by the computing device 1010 as software may be implemented at least partially in hardware, for example, using the computer-readable storage medium and/or the hardware elements 1014 of the processing system. The instructions and/or functions may be executable/operable by one or more articles of manufacture (e.g., one or more computing devices 1010 and/or processing systems 1011) to implement the techniques, modules, and examples described herein.

In various implementations, the computing device 1010 may assume a variety of different configurations. For example, the computing device 1010 may be implemented as a computer-class device, including a personal computer, a desktop computer, a multi-screen computer, a laptop computer, a netbook, and so forth. The computing device 1010 may also be implemented as a mobile-class device, including mobile devices such as mobile phones, portable music players, portable gaming devices, tablet computers, multi-screen computers, and the like. The computing device 1010 may also be implemented as a television-class device, including devices having or connected to a generally larger screen in a casual viewing environment. These devices include televisions, set-top boxes, game consoles, and the like.

The techniques described herein may be supported by these various configurations of computing device 1010 and are not limited to specific examples of the techniques described herein. The functionality may also be implemented in whole or in part on the "cloud" 1020 through the use of a distributed system, such as through the platform 1022 described below.

The cloud 1020 includes and/or is representative of a platform 1022 for resources 1024. The platform 1022 abstracts underlying functionality of hardware (e.g., servers) and software resources of the cloud 1020. Resources 1024 may include applications and/or data that may be used when executing computer processes on servers remote from computing device 1010. Resources 1024 may also include services provided over the internet and/or over a subscriber network such as a cellular or Wi-Fi network.

The platform 1022 may abstract resources and functionality to connect the computing device 1010 with other computing devices. The platform 1022 may also be used to abstract the scaling of resources, so as to provide a level of scale corresponding to the encountered demand for the resources 1024 implemented via the platform 1022. Thus, in an interconnected device embodiment, implementation of the functions described herein may be distributed throughout the system 1000. For example, the functionality may be implemented in part on the computing device 1010 and in part by the platform 1022 that abstracts the functionality of the cloud 1020.

It will be appreciated that embodiments of the disclosure have been described with reference to different functional units for clarity. However, it will be apparent that the functionality of each functional unit may be implemented in a single unit, in a plurality of units or as part of other functional units without departing from the disclosure. For example, functionality illustrated to be performed by a single unit may be performed by a plurality of different units. Thus, references to specific functional units are only to be seen as references to suitable units for providing the described functionality rather than indicative of a strict logical or physical structure or organization. Thus, the present disclosure may be implemented in a single unit or may be physically and functionally distributed between different units and circuits.

It will be understood that, although the terms first, second, third, etc. may be used herein to describe various devices, elements, components or sections, these devices, elements, components or sections should not be limited by these terms. These terms are only used to distinguish one device, element, component or section from another device, element, component or section.

Although the present disclosure has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present disclosure is limited only by the accompanying claims. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. The order of features in the claims does not imply any specific order in which the features must be performed. Furthermore, in the claims, the word "comprising" does not exclude other elements, and the indefinite article "a" or "an" does not exclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.

Claims

1. A method for adaptively determining a frame number for a group of pictures to encode, comprising:

receiving a video clip to be encoded at the frame number of the group of pictures for encoding, the video clip comprising a plurality of video frames;

each time before unencoded video frames of the video clip are encoded in units of picture groups, determining the number of unencoded video frames, and adaptively determining the frame number of the picture group used for encoding based on the number of unencoded video frames.

2. The method of claim 1, wherein adaptively determining a frame number for a group of pictures to encode based on a frame number of an unencoded video frame comprises:

in response to the number of unencoded video frames being not less than a first preset frame number, determining the frame number of the picture group for encoding from the frame number of a first picture group and the frame number of a second picture group, based on the first preset frame number of video frames among the unencoded video frames, wherein the first preset frame number is the sum of the frame number of the first picture group and the frame number of the second picture group, and the frame number of the first picture group is greater than the frame number of the second picture group;

in response to the number of unencoded video frames being less than the first preset frame number but not less than a second preset frame number, determining the frame number of the picture group for encoding from the frame number of the first picture group and the frame number of the second picture group, based on the second preset frame number of video frames among the unencoded video frames, wherein the second preset frame number is equal to the frame number of the first picture group;

in response to the number of unencoded video frames being less than the second preset frame number, determining the frame number of a third picture group as the frame number of the picture group for encoding, the frame number of the third picture group being equal to the number of unencoded video frames.
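
The three-way branching of claim 2 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function name `select_window` and the returned labels are hypothetical and do not appear in the disclosure, and the default sizes 16 and 4 are the values given in claims 9 and 10.

```python
def select_window(remaining, first_gop=16, second_gop=4):
    """Map the number of unencoded frames to the branch described in claim 2.

    Returns a (mode, frames) pair; the mode labels are illustrative only.
    """
    if first_gop <= second_gop:
        raise ValueError("the first GOP must be larger than the second GOP")
    first_preset = first_gop + second_gop  # sum of the two GOP sizes
    second_preset = first_gop              # equal to the first GOP size
    if remaining >= first_preset:
        # Candidate GOP sizes are compared over the next first_preset frames.
        return ("compare", first_preset)
    if remaining >= second_preset:
        # Candidate GOP sizes are compared over the next second_preset frames.
        return ("compare", second_preset)
    # Too few frames left: a single GOP covers everything that remains.
    return ("single_gop", remaining)
```

With the default sizes, 20 or more remaining frames select a 20-frame comparison window, 16 to 19 remaining frames select a 16-frame window, and fewer than 16 remaining frames are encoded as one picture group.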

3. The method of claim 2, wherein determining the frame number of the picture group for encoding from the frame number of the first picture group and the frame number of the second picture group, based on the first preset frame number of video frames among the unencoded video frames, comprises:

selecting the first preset frame number of video frames, starting from the first frame of the unencoded video frames;

determining, for each of a first picture group decomposition, a second picture group decomposition and a third picture group decomposition, the coding cost of encoding the selected video frames with that decomposition, wherein the first picture group decomposition comprises a first picture group and a second picture group arranged in sequence, the second picture group decomposition comprises a second picture group and a first picture group arranged in sequence, and the third picture group decomposition comprises a plurality of second picture groups arranged in sequence;

in response to the coding cost of the first picture group decomposition being the minimum, determining the frame number of the first picture group as the frame number of the picture group for encoding;

in response to the coding cost of the second picture group decomposition or the coding cost of the third picture group decomposition being the minimum, determining the frame number of the second picture group as the frame number of the picture group for encoding.
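
The selection of claim 3 can be sketched as follows. The helper names `decompositions` and `pick_gop_size` are hypothetical. The sketch assumes that the second picture groups of the third decomposition tile the comparison window exactly, and that a tie is resolved in favour of the second picture group; neither assumption is stated in the claim.

```python
def decompositions(first_gop=16, second_gop=4):
    """Return the three candidate decompositions as lists of GOP sizes."""
    window = first_gop + second_gop
    if window % second_gop != 0:
        # Assumption: the window is an exact multiple of the second GOP size.
        raise ValueError("window must be a multiple of the second GOP size")
    return [
        [first_gop, second_gop],                # first decomposition
        [second_gop, first_gop],                # second decomposition
        [second_gop] * (window // second_gop),  # third decomposition
    ]


def pick_gop_size(costs, first_gop=16, second_gop=4):
    """Pick the GOP size from the costs of the three decompositions.

    costs is (cost_first, cost_second, cost_third), in that order.
    """
    cost_first, cost_second, cost_third = costs
    if cost_first < cost_second and cost_first < cost_third:
        return first_gop
    # The second or third decomposition is cheapest (ties land here too).
    return second_gop
```

For the sizes of claims 9 and 10, the candidates are 16+4, 4+16 and 4+4+4+4+4 frames over a 20-frame window.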

4. The method of claim 2, wherein determining the frame number of the picture group for encoding from the frame number of the first picture group and the frame number of the second picture group, based on the first preset frame number of video frames among the unencoded video frames, comprises:

determining, for each of a first picture group decomposition and a second picture group decomposition, the coding cost of encoding the first preset frame number of video frames with that decomposition, wherein the first picture group decomposition comprises a first picture group and a second picture group arranged in sequence, and the second picture group decomposition comprises a second picture group and a first picture group arranged in sequence;

in response to the coding cost of the first picture group decomposition being less than the coding cost of the second picture group decomposition, determining the coding cost of encoding the first preset frame number of video frames with a third picture group decomposition, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence;

in response to the coding cost of the first picture group decomposition being less than the coding cost of the third picture group decomposition, determining the frame number of the first picture group as the frame number of the picture group for encoding;

in response to the coding cost of the first picture group decomposition being not less than the coding cost of the second picture group decomposition, or not less than the coding cost of the third picture group decomposition, determining the frame number of the second picture group as the frame number of the picture group for encoding.
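
Claim 4 differs from claim 3 in that the third decomposition is evaluated only when the first decomposition beats the second. A minimal sketch of this early exit is given below; `cost_of` is a hypothetical callable, supplied by the caller, that returns the coding cost of a decomposition given as a list of picture group sizes, and the exact tiling of the window by second picture groups is an assumption.

```python
from typing import Callable, List


def pick_gop_size_early_exit(
    cost_of: Callable[[List[int]], float],
    first_gop: int = 16,
    second_gop: int = 4,
) -> int:
    """Choose a GOP size, skipping the third cost evaluation when possible."""
    window = first_gop + second_gop
    cost_first = cost_of([first_gop, second_gop])
    cost_second = cost_of([second_gop, first_gop])
    if cost_first >= cost_second:
        # The first decomposition cannot win; the third is never evaluated.
        return second_gop
    # Assumption: the window is an exact multiple of the second GOP size.
    cost_third = cost_of([second_gop] * (window // second_gop))
    return first_gop if cost_first < cost_third else second_gop
```

The early exit saves one cost evaluation whenever the first decomposition is not cheaper than the second, since the result is the second picture group size in that case regardless of the third cost.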

5. The method of claim 2, wherein determining the frame number of the picture group for encoding from the frame number of the first picture group and the frame number of the second picture group, based on the second preset frame number of video frames among the unencoded video frames, comprises:

selecting the second preset frame number of video frames, starting from the first frame of the unencoded video frames;

determining, for each of a fourth picture group decomposition and a fifth picture group decomposition, the coding cost of encoding the selected video frames with that decomposition, wherein the fourth picture group decomposition comprises one first picture group, and the fifth picture group decomposition comprises a plurality of second picture groups;

in response to the coding cost of the fourth picture group decomposition being less than the coding cost of the fifth picture group decomposition, determining the frame number of the first picture group as the frame number of the picture group for encoding, and otherwise determining the frame number of the second picture group as the frame number of the picture group for encoding.
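
The two-way comparison of claim 5 can be sketched in the same style. As before, `cost_of` is a hypothetical caller-supplied cost function, and the sketch assumes that the second picture groups of the fifth decomposition tile the window exactly (16 = 4 × 4 for the sizes of claims 9 and 10).

```python
from typing import Callable, List


def pick_gop_size_short(
    cost_of: Callable[[List[int]], float],
    first_gop: int = 16,
    second_gop: int = 4,
) -> int:
    """Choose a GOP size when exactly one first-GOP window is compared."""
    if first_gop % second_gop != 0:
        # Assumption: the first GOP size is a multiple of the second.
        raise ValueError("first GOP must be a multiple of the second GOP")
    cost_fourth = cost_of([first_gop])                               # one large GOP
    cost_fifth = cost_of([second_gop] * (first_gop // second_gop))   # small GOPs
    # "Otherwise" in the claim covers both a higher and an equal fourth cost.
    return first_gop if cost_fourth < cost_fifth else second_gop
```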

6. The method according to any of claims 3-5, wherein determining the coding cost of each picture group decomposition comprises summing the coding costs of all picture groups in that picture group decomposition.

7. The method of claim 6, wherein the coding cost of each picture group is the sum of the coding costs of all video frames in that picture group.

8. The method of claim 7, wherein the coding cost of each video frame is the sum of the coding costs of all coding units in that video frame, the coding cost of each coding unit being determined by the following formula:

J = SAD + λ · R;

where J is the coding cost of the current coding unit, SAD is the sum of absolute differences between the current coding unit and its prediction unit, R is the estimated number of bits required to encode the current coding unit using the selected prediction mode, and λ is the Lagrange multiplier.
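
The nested summation of claims 6 to 8 can be sketched as follows. The sketch assumes that the per-unit cost takes the common Lagrangian form J = SAD + λ · R, which matches the symbol definitions above. The data layout, in which a coding unit is a tuple of current samples, predicted samples and estimated bits, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: (current samples, predicted samples, estimated bits R).
CodingUnit = Tuple[Sequence[int], Sequence[int], float]
Frame = List[CodingUnit]
Gop = List[Frame]


def cu_cost(current: Sequence[int], prediction: Sequence[int],
            bits: float, lam: float) -> float:
    """Assumed form J = SAD + lambda * R for one coding unit."""
    sad = sum(abs(c - p) for c, p in zip(current, prediction))
    return sad + lam * bits


def frame_cost(frame: Frame, lam: float) -> float:
    """Claim 8: a frame's cost is the sum over its coding units."""
    return sum(cu_cost(cur, pred, bits, lam) for cur, pred, bits in frame)


def gop_cost(gop: Gop, lam: float) -> float:
    """Claim 7: a picture group's cost is the sum over its frames."""
    return sum(frame_cost(frame, lam) for frame in gop)


def decomposition_cost(gops: List[Gop], lam: float) -> float:
    """Claim 6: a decomposition's cost is the sum over its picture groups."""
    return sum(gop_cost(gop, lam) for gop in gops)
```

For example, a coding unit with samples [10, 12], prediction [9, 15], 8 estimated bits and λ = 0.5 has SAD = 4 and cost 4 + 0.5 × 8 = 8.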

9. The method of claim 2, wherein the frame number of the first group of pictures is 16.

10. The method of claim 2, wherein the frame number of the second group of pictures is 4.

11. An apparatus for adaptively determining a frame number for a group of pictures to encode, comprising:

a receiving module configured to receive a video clip to be encoded at the frame number of the group of pictures for encoding, the video clip comprising a plurality of video frames;

a determining module configured to determine, each time before unencoded video frames of the video clip are encoded in units of picture groups, the number of unencoded video frames, and to adaptively determine the frame number of the picture group for encoding based on the number of unencoded video frames.

12. The apparatus of claim 11, wherein the determining module comprises:

a first determining sub-module configured to, in response to the number of unencoded video frames being not less than a first preset frame number, determine the frame number of the picture group for encoding from the frame number of a first picture group and the frame number of a second picture group, based on the first preset frame number of video frames among the unencoded video frames, wherein the first preset frame number is the sum of the frame number of the first picture group and the frame number of the second picture group, and the frame number of the first picture group is greater than the frame number of the second picture group;

a second determining sub-module configured to, in response to the number of unencoded video frames being less than the first preset frame number but not less than a second preset frame number, determine the frame number of the picture group for encoding from the frame number of the first picture group and the frame number of the second picture group, based on the second preset frame number of video frames among the unencoded video frames, wherein the second preset frame number is equal to the frame number of the first picture group;

a third determining sub-module configured to, in response to the number of unencoded video frames being less than the second preset frame number, determine the frame number of a third picture group as the frame number of the picture group for encoding, the frame number of the third picture group being equal to the number of unencoded video frames.

13. The apparatus of claim 12, wherein the first determining sub-module is configured to, in response to the number of unencoded video frames being not less than the first preset frame number:

in response to the coding cost of the first picture group decomposition being less than the coding cost of the second picture group decomposition, determine the coding cost of encoding the first preset frame number of video frames with the third picture group decomposition, wherein the third picture group decomposition comprises a plurality of second picture groups arranged in sequence;

14. A computing device comprising:

a processor; and

a memory configured to have computer-executable instructions stored thereon that, when executed by the processor, perform the method of any of claims 1-10.

15. A computer-readable storage medium storing computer-executable instructions that, when executed, perform the method of any one of claims 1-10.