CN116033148A - Video coding method, device, computer equipment and storage medium - Google Patents


Info

Publication number
CN116033148A
CN116033148A (application number CN202211743154.0A)
Authority
CN
China
Prior art keywords
target
coding
reference frame
video
frame
Prior art date
Legal status
Pending
Application number
CN202211743154.0A
Other languages
Chinese (zh)
Inventor
张佳
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202211743154.0A priority Critical patent/CN116033148A/en
Publication of CN116033148A publication Critical patent/CN116033148A/en
Pending legal-status Critical Current

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of this application disclose a video coding method, a video coding device, a computer device, and a storage medium. The method includes: performing a first encoding process on each video frame in a video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block; when an N-th encoding process is performed on a target video frame of the video to be processed, determining, from the plurality of coding blocks included in the target video frame during a target encoding process, the reference coding blocks that have an overlapping relationship with a target coding block, where the target encoding process is an encoding process between the first encoding process and the (N-1)-th encoding process; determining a candidate reference frame set of the target coding block based on the reference frames of the reference coding blocks in the target encoding process; and determining a target reference frame of the target coding block from the candidate reference frame set, and performing coding prediction on the target coding block based on the target reference frame. By adopting this method and device, coding efficiency can be improved.

Description

Video coding method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of encoding, and in particular, to a video encoding method, apparatus, computer device, and storage medium.
Background
With the continuous development of computer technology, digital multimedia technology, and codec technology, video and images are increasingly common in many fields and in people's daily lives. A video often needs to be encoded multiple times, for example in service scenarios that require video streams of different resolutions or that use a rate control algorithm. In any one of these multiple encodings, for each coding block to be encoded, the encoder typically performs a motion search over all frames that may be referenced in order to determine which region of which frame to use for prediction; that is, a final reference frame must be selected from all possible reference frames to predict the coding block to be encoded.
At present, a motion search is generally performed on every frame in a preset reference frame set corresponding to a coding block in order to determine the final reference frame. This method must traverse every frame in the set, so the amount of computation is large and coding efficiency can be low.
Disclosure of Invention
The embodiment of the application provides a video coding method, a video coding device, computer equipment and a storage medium, which can improve coding efficiency.
In a first aspect, an embodiment of the present application provides a video encoding method, including:
acquiring a video to be processed, and performing a first encoding process on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block; the plurality of coding blocks included in a video frame are obtained by block-dividing that video frame, and the reference frame of a coding block is the video frame in the video to be processed that is used when performing coding prediction on that coding block;
when an N-th encoding process is performed on a target video frame of the video to be processed, determining one or more reference coding blocks having an overlapping relationship with a target coding block from the plurality of coding blocks included in the target video frame during a target encoding process; the target video frame is the video frame in which the target coding block is located, the target encoding process is one or more encoding processes between the first encoding process and the (N-1)-th encoding process, and N is a positive integer greater than 1;
determining a candidate reference frame set of the target coding block based on the coding mode of each reference coding block and the reference frames of the reference coding blocks in the target encoding process; the coding mode includes an inter coding mode or an intra coding mode;
and determining a target reference frame of the target coding block from the candidate reference frame set, and performing coding prediction on the target coding block based on the target reference frame.
In a second aspect, an embodiment of the present application provides a video encoding apparatus, including:
a first encoding unit, configured to acquire a video to be processed and perform a first encoding process on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block; the plurality of coding blocks included in a video frame are obtained by block-dividing that video frame, and the reference frame of a coding block is the video frame in the video to be processed that is used when performing coding prediction on that coding block;
a first determining unit, configured to determine, when an N-th encoding process is performed on a target video frame of the video to be processed, one or more reference coding blocks having an overlapping relationship with a target coding block from the plurality of coding blocks included in the target video frame during a target encoding process; the target video frame is the video frame in which the target coding block is located, the target encoding process is one or more encoding processes between the first encoding process and the (N-1)-th encoding process, and N is a positive integer greater than 1;
a second determining unit, configured to determine a candidate reference frame set of the target coding block based on the coding mode of each reference coding block and the reference frames of the reference coding blocks in the target encoding process; the coding mode includes an inter coding mode or an intra coding mode;
and a second encoding unit, configured to determine a target reference frame of the target coding block from the candidate reference frame set and perform coding prediction on the target coding block based on the target reference frame.
In a third aspect, embodiments of the present application provide a computer device, the computer device comprising: a processor and a memory, the processor being configured to perform the method according to the first aspect.
In a fourth aspect, embodiments of the present application further provide a computer readable storage medium, where program instructions are stored, the program instructions when executed implement the method according to the first aspect.
In a fifth aspect, embodiments of the present application also provide a computer program product or computer program comprising program instructions which, when executed by a processor, implement the method of the first aspect described above.
In the embodiments of this application, each video frame in a video to be processed may be subjected to a first encoding process to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block; the plurality of coding blocks included in a video frame are obtained by block-dividing that video frame, and the reference frame of a coding block is the video frame in the video to be processed that is used for coding prediction of that coding block. When an N-th encoding process is performed on a target video frame of the video to be processed, one or more reference coding blocks having an overlapping relationship with a target coding block may be determined from the plurality of coding blocks included in the target video frame during a target encoding process; the target video frame is the video frame in which the target coding block is located, the target encoding process may be one or more encoding processes between the first encoding process and the (N-1)-th encoding process, and N is a positive integer greater than 1. A candidate reference frame set of the target coding block may then be determined based on the coding mode of each reference coding block and the reference frames of the reference coding blocks in the target encoding process, where the coding mode includes an inter coding mode or an intra coding mode. Further, a target reference frame of the target coding block may be determined from the candidate reference frame set, and coding prediction may be performed on the target coding block based on the target reference frame.
By implementing this method, in a multiple-encoding scenario, when the target coding block is encoded in the N-th encoding process, the reference frame selections made for the coding blocks in a historical encoding process can be reused. This reduces the computational complexity of selecting a reference frame, saves computing resources, and shortens the time spent on reference frame selection, enabling fast selection of the reference frame and improving coding efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic architecture diagram of a video coding system according to an embodiment of the present application;
fig. 2 is a schematic flow chart of a video encoding method according to an embodiment of the present application;
FIG. 3a is a schematic flow chart of determining that there is a need for multiple encodings according to an embodiment of the present application;
FIG. 3b is a schematic diagram showing a block division result for a video frame according to an embodiment of the present application;
FIG. 3c is a schematic diagram of a determination of a reference encoded block according to an embodiment of the present application;
fig. 4 is a flowchart of another video encoding method according to an embodiment of the present application;
FIG. 5a is a schematic illustration of a unidirectional prediction provided by an embodiment of the present application;
FIG. 5b is a schematic diagram of bi-directional prediction according to an embodiment of the present application;
fig. 5c is a flowchart of yet another video encoding method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a video encoding device according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Cloud Technology (Cloud Technology) refers to a hosting Technology for integrating hardware, software, network and other series resources in a wide area network or a local area network to realize calculation, storage, processing and sharing of data.
Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology, and the like that are applied under the cloud computing business model. It can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important backbone: background services of technical networking systems, such as video websites, picture websites, and other portals, require large amounts of computing and storage resources. With the rapid development of the internet industry, every article may have its own identification mark in the future, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong back-end system support, which can only be realized through cloud computing.
Cloud Computing is a computing model that distributes computing tasks across a resource pool formed by a large number of computers, enabling various application systems to acquire computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". From the user's perspective, resources in the cloud appear infinitely expandable and can be acquired at any time, used on demand, expanded at any time, and paid for according to use.
In the embodiments of this application, the data required for video encoding may be stored in the cloud and obtained or expanded at any time as needed. For example, when an encoder encodes a video multiple times, the plurality of coding blocks included in each video frame and the reference frames of the coding blocks obtained by the first encoding process may be stored in the cloud; when the encoder performs the second and subsequent encodings, the relevant data can be fetched from the cloud.
In one implementation, the embodiments of this application provide a video coding scheme whose principle is as follows. First, a video to be processed may be acquired, and a first encoding process may be performed on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block. When the N-th encoding process is performed on the video to be processed, the data obtained in a target encoding process may be used to assist the N-th encoding process of the video to be processed, where the target encoding process may be one or more encoding processes between the first encoding process and the (N-1)-th encoding process.
Optionally, when the N-th encoding process is performed on a target video frame of the video to be processed, one or more reference coding blocks having an overlapping relationship with a target coding block may be determined from the plurality of coding blocks included in the target video frame during the target encoding process; the target video frame is the video frame in which the target coding block is located. After the one or more reference coding blocks are obtained, the N-th encoding process of the video to be processed may be performed based on the reference frames of the reference coding blocks in the target encoding process. In one embodiment, a candidate reference frame set of the target coding block may first be determined based on those reference frames; for example, it may be determined based on the coding mode of each reference coding block together with the reference frames of the reference coding blocks in the target encoding process, where the coding mode may be an inter coding mode or an intra coding mode. Further, a target reference frame of the target coding block may be determined from the candidate reference frame set, and coding prediction may be performed on the target coding block based on the target reference frame.
By implementing this scheme, in a multiple-encoding scenario, when the target coding block is encoded in the N-th encoding process, the reference frame selections made for the coding blocks in a historical encoding process (such as the first encoding process) can be reused. This shrinks the initial reference frame set corresponding to the target coding block, reduces the computational complexity of reference frame selection, saves computing resources, shortens selection time, and thus improves coding efficiency.
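As a concrete illustration, the scheme above can be sketched end to end in a few lines. The sketch below is not taken from the patent: the function name `fast_reference_frame`, the `rect`/`mode`/`ref` record fields, and the `cost` callable (standing in for the encoder's real motion-search comparison) are all invented for illustration.

```python
def fast_reference_frame(target_rect, prior_blocks, initial_refs, cost):
    """Reuse prior-pass reference-frame choices to shrink the search set,
    then pick the cheapest remaining candidate. The cost callable stands
    in for the motion-search comparison an actual encoder would run."""
    def overlaps(a, b):
        # axis-aligned (x, y, w, h) rectangle intersection
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    hits = [b for b in prior_blocks if overlaps(target_rect, b["rect"])]
    if not hits or any(b["mode"] == "intra" for b in hits):
        candidates = set(initial_refs)         # weak temporal correlation: keep all
    else:
        candidates = {b["ref"] for b in hits}  # reuse the prior selections only
    return min(candidates, key=cost)
```

For example, a target block that overlaps two inter-coded prior-pass blocks which chose frames 2 and 5 only needs its motion search over those two frames rather than the whole initial set.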
In a specific implementation, the video coding scheme mentioned above may be performed by a computer device, such as in particular by an encoder in the computer device. The computer device may be a terminal or a server; among them, the terminals mentioned herein may include, but are not limited to: smart phones, tablet computers, notebook computers, desktop computers, intelligent voice interaction equipment, intelligent home appliances, vehicle terminals, aircrafts and the like; a wide variety of clients (APP) may be running within the terminal, such as game-type clients, multimedia play-type clients, social-type clients, browser-type clients, information-flow-type clients, educational-type clients, and so on. The server mentioned herein may be an independent physical server, or may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Network, content delivery networks), basic cloud computing services such as big data and artificial intelligence platforms, and so on.
In one possible implementation, when the computer device is a server, the embodiments of the present application provide a video encoding system, as shown in fig. 1, where the video encoding system includes at least one terminal and at least one server; the terminal can acquire the video to be processed and upload the acquired video to be processed to a server (i.e. computer equipment), so that the computer equipment can acquire the video to be processed and perform video coding on the video to be processed.
In one implementation, the video coding scheme provided by the embodiment of the application can be applied to video on demand, rate control and other scenes. For example, in video on demand scenarios, it is often necessary to transcode (i.e., encode) a video original provided by a content manufacturer as many times as required by a service to output video streams of different resolutions; in another example, in a rate control scenario, a multiple coding mode is generally adopted in a rate control algorithm to obtain higher rate accuracy and better coding efficiency.
Based on the above description about the video coding scheme, the embodiments of the present application provide a video coding method, where the embodiments of the present application mainly use a computer device as an execution body to describe the method; referring to fig. 2, the video encoding method may include the following steps S201 to S204:
S201, acquiring a video to be processed, and performing first coding processing on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block.
The video to be processed may be any type of video, such as sports video, education video, film video, and the like.
In one implementation, the video to be processed may be acquired when there is a need for multiple encodings of the video to be processed.
Alternatively, it may be determined that there is currently a need for multiple encodings of the video to be processed when the computer device receives a multiple-encoding request. For example, a technician may send a multiple-encoding request for a video to be processed to the computer device; after the computer device receives the request, it determines that a multiple-encoding requirement for the video exists. In one possible implementation, when a technician needs to encode a certain video multiple times, the technician can perform the related operations through a user operation interface output by a terminal, thereby sending the multiple-encoding requirement for the video to the computer device.
See, for example, fig. 3 a: the terminal used by the technician may display a user interface in the terminal screen that may include at least a video setup area labeled 301 and a confirmation control labeled 302. If the technician wants to code a certain video multiple times, the technician can input related information (such as a video or a storage address corresponding to the video) of the video to be processed in the video setting area 301, and then perform a triggering operation (such as a clicking operation, a pressing operation, etc.) on the confirmation control 302, so that a terminal used by the technician can send multiple coding requirements for the video to be processed to the computer device based on the related information in the video setting area 301.
Alternatively, the multiple-encoding requirement may be generated by triggering a scheduled encoding task. For example, a scheduled encoding task may be set that indicates a trigger condition for encoding a video multiple times; the trigger condition may be that the current time reaches a preset archiving time, that the remaining storage space for videos falls to a preset threshold, that a new video is added to the storage space, or the like.
In one implementation, after the video to be processed is obtained, it may be encoded multiple times; the number of encodings may be N, where N is a positive integer greater than 1 (for example, 2 or 4). If N is 2, the video to be processed needs to be encoded twice; if N is 4, it needs to be encoded four times. The first encoding process of the video to be processed is described first below.
In a specific implementation, the computer device may perform a first encoding process on each video frame in the video to be processed, so as to obtain the video to be processed after the first encoding process. In the first encoding process, a plurality of encoding blocks included in each video frame and a reference frame corresponding to each encoding block may be obtained.
The plurality of coding blocks included in a video frame may be obtained by block-dividing the video frame. The name of a coding block may differ between coding standards: in the H.264 coding standard, coding blocks may be referred to as sub-macroblocks, while in the H.265/H.266 coding standards they may be referred to as CUs (Coding Units); in the embodiments of this application, sub-macroblocks and CUs are collectively referred to as coding blocks. The number of coding blocks in each video frame may be the same or different, and the block division manner of each video frame may be the same or different. For example, blocks 31 and 32 in fig. 3b each represent the block division result of a video frame, where any small box within them, such as block 33, represents one coding block.
The reference frame of one coding block may be a video frame used when the coding block is predicted in the video to be processed.
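For concreteness, the per-block data produced by the first encoding process can be modelled roughly as follows. This is an illustrative sketch only: the `CodingBlock` record and the uniform `divide_frame` helper are invented names, and real encoders (e.g. under H.265/H.266) divide frames adaptively rather than uniformly.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodingBlock:
    x: int                            # region position of the block in its frame
    y: int
    w: int
    h: int
    mode: str = "inter"               # "inter" or "intra"
    ref_frame: Optional[int] = None   # index of the chosen reference frame

def divide_frame(width, height, size):
    """Toy uniform division of a frame into size x size coding blocks,
    clipping the last row/column at the frame border."""
    return [CodingBlock(x, y, min(size, width - x), min(size, height - y))
            for y in range(0, height, size)
            for x in range(0, width, size)]
```

A 64x48 frame divided with size 32 yields four blocks, the bottom row clipped to a height of 16.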
S202, when the N-th encoding process is performed on a target video frame of the video to be processed, determine one or more reference coding blocks having an overlapping relationship with the target coding block from the plurality of coding blocks included in the target video frame during the target encoding process.
The target video frame is the video frame where the target coding block is located; the target video frame may be any video frame in the video to be processed, and the target coding block may be any coding block in the target video frame.
The target encoding process is one or more encoding processes between the first encoding process and the (N-1)-th encoding process. For example, it may be the first encoding process; or the first and second encoding processes; or the second and third encoding processes; and so on. Each encoding process between the second and the (N-1)-th may follow the same procedure as the first encoding process, that is, encode its coding blocks without reusing the reference frame selections from any historical encoding process. Alternatively, each of these encoding processes may follow the same procedure as the N-th encoding process set forth in this embodiment (steps S202 to S204), that is, reuse the reference frame selections made for the coding blocks in a historical encoding process. It is also possible that, among the encoding processes between the second and the (N-1)-th, some follow the procedure of the first encoding process while others follow the procedure of the N-th.
It will be appreciated that when the N-th encoding process is performed on the video to be processed, each video frame in the video must again be encoded; a certain video frame is taken as an example here and described as the target video frame, which may be, for example, the video frame that requires the N-th encoding process at the current time. In one implementation, when the N-th encoding process is performed on the target video frame, the frame is first block-divided into the basic coding blocks on which encoding is performed, and the subsequent encoding steps are then applied to each of the resulting coding blocks. Since the process of encoding each coding block is the same, the following description takes one coding block in the target video frame as an example, described as the target coding block, which may be the coding block that requires the N-th encoding process at the current time.
In one implementation, the reference coding blocks corresponding to the target coding block may be determined as follows. First, the region position of the target coding block in the target video frame during the N-th encoding process is acquired, as well as the region position, in the target video frame, of each of the plurality of coding blocks corresponding to the target video frame during the target encoding process. Then, according to these region positions, the coding blocks whose regions overlap the region of the target coding block are determined from the plurality of coding blocks, and the determined coding blocks are used as the reference coding blocks having an overlapping relationship with the target coding block.
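The region-overlap test described above amounts to an axis-aligned rectangle intersection. A minimal sketch, with invented names and region positions assumed to be `(x, y, w, h)` tuples:

```python
def overlaps(a, b):
    """True when two (x, y, w, h) rectangles share any area
    (blocks that merely touch at an edge do not count as overlapping)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def reference_block_indices(target_rect, prior_pass_rects):
    """Indices of prior-pass coding blocks whose region overlaps
    the region of the target coding block."""
    return [i for i, r in enumerate(prior_pass_rects)
            if overlaps(target_rect, r)]
```

With a target block straddling a 2x2 grid of prior-pass blocks, every straddled block is returned as a reference coding block while distant blocks are excluded.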
In one embodiment, if the target encoding process is a single encoding process between the first and the (N-1)-th encoding processes, taking the first encoding process as an example, the region position of each of the plurality of coding blocks corresponding to the target video frame during the first encoding process may be acquired. Based on these region positions, the coding blocks that overlap the region of the target coding block are determined from the plurality of coding blocks of the first encoding process, and the determined coding blocks are used as the reference coding blocks having an overlapping relationship with the target coding block.
For example, referring to fig. 3c, block 34 in fig. 3c is a video frame, where the dashed box represents the target coding block and the solid boxes represent the corresponding coding blocks during the target encoding process. Since coding blocks 1, 2 and 3 have overlapping regions with the target coding block, they can be regarded as the reference coding blocks corresponding to the target coding block.
In one embodiment, if the target encoding process consists of multiple encoding processes between the first and the (N-1)-th encoding processes, for example the first and second encoding processes, the region position of each of the plurality of coding blocks corresponding to the target video frame may be acquired for both the first encoding process and the second encoding process. Based on these region positions, the coding blocks that overlap the region of the target coding block are determined from the coding blocks of the first encoding process and the coding blocks of the second encoding process, and the determined coding blocks are used as the reference coding blocks having an overlapping relationship with the target coding block.
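When the target encoding process spans several prior passes, the overlap search simply runs over each selected pass in turn. A hedged sketch (all names invented; each pass is given as a list of `(x, y, w, h)` block rectangles):

```python
def reference_blocks_across_passes(target_rect, passes):
    """Return (pass_index, block_index) pairs for every prior-pass coding
    block whose region overlaps the target coding block's region."""
    def hit(a, b):
        # axis-aligned rectangle intersection on (x, y, w, h) tuples
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah
    return [(p, i)
            for p, rects in enumerate(passes)
            for i, r in enumerate(rects)
            if hit(target_rect, r)]
```

The returned pairs identify overlapping blocks from the first pass and the second pass alike, so their reference frame selections can all contribute to the candidate set.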
S203, a candidate reference frame set of the target coding block is determined based on the coding mode of each reference coding block and the reference frames of each reference coding block in the target secondary coding process.
The coding mode may be an inter-frame coding mode or an intra-frame coding mode, both of which are coding prediction techniques. If intra-frame prediction is used in coding, the coding of the target video frame does not need to refer to information of other video frames; if inter-frame prediction is used, information of adjacent video frames is needed to predict the target video frame.
In one implementation, if none of the reference coding blocks uses the intra-frame coding mode, the candidate reference frame set of the target coding block may be determined based on the reference frames of the respective reference coding blocks in the target secondary coding process. For example, if the target secondary coding process is the first encoding process, the candidate reference frame set of the target coding block may be determined based on the reference frames of the respective reference coding blocks in the first encoding process. The reference frames in the candidate reference frame set may be understood as the reference frames available for the subsequent prediction.
If any of the reference coding blocks uses the intra-frame coding mode, this indicates that the temporal correlation of the corresponding image region of the target video frame is weak, and the choice of reference frame for prediction is therefore uncertain. In this case, all video frames in the initial reference frame set corresponding to the reference coding blocks may be kept available, i.e., the initial reference frame set corresponding to the reference coding blocks may be taken as the candidate reference frame set of the target coding block. The reference frame of any one of the one or more reference coding blocks is present in the initial reference frame set corresponding to that reference coding block.
It should be appreciated that in a coding standard that performs coding by block division, when an encoder in a coding system predicts in units of coding blocks (e.g., inter-frame prediction), the encoder typically constructs a reference frame set (reference frame list) for the target video frame, so that the encoder can select a video frame from the reference frame set as the reference frame at the time of prediction. For convenience of description, this reference frame set is referred to herein as the initial reference frame set. As described above, the encoder may construct an initial reference frame set for the target video frame, i.e., each video frame in the video to be processed may correspond to an initial reference frame set. The initial reference frame sets corresponding to different coding blocks in the same video frame are the same, i.e., the different coding blocks in the same video frame are all predicted by using some video frame in the same reference frame set as the reference frame. That is, the initial reference frame set of the target coding block and that of any reference coding block in the target video frame are the same.
The initial reference frame set of each video frame may be preset. An initial reference frame set corresponding to one video frame includes one or more video frames of the video to be processed; for a given video frame, the video frames in its initial reference frame set may be continuous or discontinuous with that video frame in the video to be processed. Neither the number of video frames in an initial reference frame set nor the specific video frames it contains is limited.
Alternatively, the initial reference frame set of one video frame may be divided into a forward reference frame set (or forward reference frame list) and a backward reference frame set (or backward reference frame list) according to the encoding order and the playing order. The forward reference frame set may include video frames that precede the target video frame in both encoding order and playing order. The backward reference frame set may include video frames that precede the target video frame in encoding order but follow it in playing order. That is, in embodiments of the present application, the initial reference frame set of one video frame may include one or more of a forward reference frame set and a backward reference frame set.
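The forward/backward split above can be illustrated with a minimal sketch. It assumes (hypothetically) that every candidate frame is already encoded before the target — as the description requires — and that frames are identified simply by their play-order index; these simplifications are for illustration only.

```python
def split_reference_lists(encoded_play_indices, target_play_index):
    """Partition already-encoded frames into a forward list (played
    before the target) and a backward list (played after the target)."""
    forward = sorted(i for i in encoded_play_indices if i < target_play_index)
    backward = sorted(i for i in encoded_play_indices if i > target_play_index)
    return forward, backward
```

For a target at play index 2 with encoded frames at play indices 0, 1, 4 and 8, this yields the forward list [0, 1] and the backward list [4, 8] — the backward frames were encoded earlier but are played later, as in the patent's description.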
S204, determining a target reference frame of the target coding block from the candidate reference frame set, and carrying out coding prediction on the target coding block based on the target reference frame.
In one implementation, the target reference frame of the target coding block may be determined from the candidate reference frame set based on a motion search; for example, motion search processing may be performed on each reference frame in the candidate reference frame set to obtain the corresponding target reference frame. After the target reference frame is obtained, the target coding block can be coded and predicted based on the target reference frame to obtain the coded target coding block.
As described above, the candidate reference frame set may include one or more of a forward candidate reference frame set and a backward candidate reference frame set. When encoding the target coding block, the encoder may select one frame from either the forward candidate reference frame set or the backward candidate reference frame set as the reference frame, in which case the coding prediction may be understood as unidirectional prediction; or it may select one frame from each of the two candidate reference frame sets (i.e., two frames in total) as reference frames, in which case the coding prediction may be understood as bidirectional prediction. The reference mode (unidirectional or bidirectional prediction) and which specific frame is selected as the reference frame are not specified in the coding standard: any frame selected from the candidate reference frame set yields a standard-compliant code stream, but different choices may have different coding efficiencies. To obtain the best coding efficiency, the encoder may typically try every possible reference frame selection to search for the best reference frame (i.e., the target reference frame), where each trial may involve a motion search of very high complexity. That is, the motion search mentioned above may be performed on each reference frame in the candidate reference frame set to obtain the corresponding target reference frame.
It will be appreciated that for a block to be encoded in a video, the encoder typically selects the region with the most similar pixel distribution in an already-encoded video frame for prediction; this may be referred to as a similar region. The encoder then only needs to encode the position information of this similar region and the pixel differences between the block to be encoded and the similar region. In general, the smaller the pixel differences, the fewer bytes need to be transmitted and the higher the coding efficiency. If the encoder finally selects a region that is not the most suitable one and predicts with it, it can still generate a standard-compliant code stream, but the coding efficiency suffers, e.g., it may be reduced. Finding this most suitable region is a computationally complex process that encoders often implement by pixel-by-pixel comparison, which can be understood as a motion search. The complexity of coded prediction (e.g., inter-frame prediction) goes far beyond this, however, because there is often more than one encoded frame over which a motion search can be performed, and the encoder often needs to perform a motion search over all frames that may be referenced to determine which region of which frame to use for prediction.
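The pixel-by-pixel comparison described above can be sketched as a toy exhaustive motion search: scan candidate positions in an encoded reference frame and keep the region whose sum of absolute differences (SAD) to the block is smallest. Frames here are plain 2-D lists of luma samples; real encoders use far faster search patterns, so this is an illustration of the principle only.

```python
def sad(block, frame, x, y):
    """Sum of absolute differences between block and the frame region at (x, y)."""
    return sum(abs(block[r][c] - frame[y + r][x + c])
               for r in range(len(block)) for c in range(len(block[0])))

def motion_search(block, ref_frame):
    """Exhaustively scan all valid positions; return (best_x, best_y, best_sad)."""
    bh, bw = len(block), len(block[0])
    fh, fw = len(ref_frame), len(ref_frame[0])
    best = None
    for y in range(fh - bh + 1):
        for x in range(fw - bw + 1):
            cost = sad(block, ref_frame, x, y)
            if best is None or cost < best[2]:
                best = (x, y, cost)
    return best
```

A 2x2 block of value 5 searched in a frame containing a matching 2x2 patch is located exactly, with SAD 0 — and the nested loops make visible why repeating this over every frame in a large reference set is expensive.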
Based on this, it can be seen that the larger the number of video frames in the candidate reference frame set, the more complex the encoder's selection of the target reference frame: if a motion search needs to be performed on each video frame in the candidate reference frame set, the corresponding encoding complexity will be higher, which means more computing resources and longer encoding time, and therefore lower encoding efficiency. In the embodiments of the present application, the reference frame selection result of the target secondary encoding process (such as the first encoding process) may be used to accelerate the reference frame selection of subsequent encodings. Compared with selecting reference frames from the initial reference frame set corresponding to the target coding block, a candidate reference frame set with fewer frames can effectively accelerate the reference frame selection of subsequent encodings; the computational complexity is thus reduced and computing resources are saved while the coding efficiency is essentially not damaged, so that compared with the conventional reference frame selection scheme, the coding efficiency can be effectively improved.
In the embodiments of the present application, in a scenario where a video to be processed is encoded multiple times, when encoding a target coding block in the N-th encoding process, the reference frame selection results of the coding blocks in the target secondary encoding process can be multiplexed, so that the initial reference frame set corresponding to the target coding block is reduced. This lowers the computational complexity of reference frame selection, saves computing resources, shortens the time spent selecting reference frames, and thereby improves the encoding efficiency.
Fig. 4 is a flowchart of another video encoding method according to an embodiment of the present application. The embodiment of the application mainly uses the computer equipment as an execution main body for explanation; referring to fig. 4, the video encoding method may include the following steps S401 to S405:
S401, acquiring a video to be processed, and performing a first encoding process on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block.
S402, when the N-th coding process is carried out on the target video frame of the video to be processed, one or more reference coding blocks with overlapping relation with the target coding blocks are determined from a plurality of coding blocks included in the target video frame during the target coding process.
The specific implementation of steps S401 and S402 may be referred to the specific implementation of steps S201 and S202, and will not be described herein.
The target secondary encoding process is one or more encoding processes between the first encoding process and the N-1 th encoding process. In one implementation, the target secondary encoding process may be any one or more encoding processes selected randomly from between the first and N-1 th encoding processes. In another implementation, in order to accelerate the reference frame selection of every encoding process other than the first, the target secondary encoding process may preferentially be set to the first encoding process; that is, the reference frame selection result of the first encoding process may be used to assist reference frame selection in the second and all subsequent encoding processes, so that the reference frame selection efficiency of every encoding process other than the first is improved and the video encoding efficiency is effectively improved.
S403, if there is no reference coding block using the intra coding mode in each reference coding block, determining a candidate reference frame set of the target coding block based on the reference frame of each reference coding block in the target sub-coding process.
It will be appreciated that, in general, the smaller the quantization parameter (Quantizer Parameter, QP) of a video frame, the finer the motion search for that video frame. The difference between the quantization parameter of the target video frame in the N-th encoding process and its quantization parameter in the target secondary encoding process can therefore be taken into account when determining the candidate reference frame set of the target coding block. For convenience of description, the quantization parameter of the target video frame in the N-th encoding process and its quantization parameter in the target secondary encoding process may be referred to as the target quantization parameter and the reference quantization parameter, respectively.
Based on this, a specific implementation of determining the candidate reference frame set of the target coding block may be as follows. First, the target quantization parameter of the target video frame in the N-th encoding process and the reference quantization parameter of the target video frame in the target secondary encoding process may be acquired. After both are obtained, the target quantization parameter and the reference quantization parameter may be compared, so that the candidate reference frame set of the target coding block can be determined based on the comparison result and the reference frames of the respective reference coding blocks in the target secondary encoding process.
Wherein, the comparison result may include: the target quantization parameter is greater than or equal to the reference quantization parameter, or the target quantization parameter is less than the reference quantization parameter. The manner of determining the candidate reference frame sets correspondingly is different for different comparison results, and the following description will be made on the manner of determining the candidate reference frame sets under the two different comparison results.
(1) If the comparison result is that the target quantization parameter is greater than or equal to the reference quantization parameter, the specific manner of determining the candidate reference frame set is as follows:
the reference frames of the respective reference coding blocks in the target secondary coding process may be added to the candidate reference frame set of the target coding block, i.e., the candidate reference frame set of the target coding block may be constructed from the reference frames of the respective reference coding blocks in the target secondary coding process.
It will be appreciated from the principles of inter-frame prediction that, when a video is encoded multiple times, the motion information of the video content is generally the same even though the quantization parameters of the video frames vary. When the quantization parameter used by the video frame in the subsequent N-th encoding is larger than that used in the target secondary encoding, the motion search result for the video frame may be coarser, so multiplexing the reference frames used in the target secondary encoding causes essentially no loss of coding efficiency. Based on this, the candidate reference frame set of the target coding block may be constructed from the reference frames of the respective reference coding blocks in the target secondary coding process, so that the reference frame corresponding to the target coding block (i.e., the target reference frame) can be determined directly from this candidate reference frame set.
It should be understood that the reference frames of a reference coding block in the target secondary coding process are determined from the initial reference frame set, i.e., the candidate reference frame set is a subset of the initial reference frame set. It can be seen that, in this way, the initial reference frame set corresponding to the target coding block can be effectively reduced based on the reference frames of the respective reference coding blocks in the target secondary coding process, so that the target reference frame can be selected from a candidate reference frame set with fewer frames, effectively reducing computational complexity, saving the computing resources of the computer device, and essentially leaving the coding efficiency undamaged.
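Case (1) can be sketched in a few lines: the candidate set is just the union of the reference frames chosen by the overlapping reference coding blocks in the earlier pass, which is necessarily a subset of the initial set. Frame identifiers here are illustrative.

```python
def build_candidate_set(ref_block_reference_frames):
    """Union of the reference frames used by the reference coding blocks
    in the target secondary coding process, preserving first-seen order
    and dropping duplicates."""
    candidates = []
    for frame_id in ref_block_reference_frames:
        if frame_id not in candidates:
            candidates.append(frame_id)
    return candidates
```

If three reference coding blocks used frames 2, 3 and 2 respectively, the candidate set is just {2, 3} — the subsequent motion search then runs over two frames instead of the whole initial list.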
In one embodiment, as previously described, the initial set of reference frames for one video frame may include one or more of a forward set of reference frames or a backward set of reference frames.
When the initial reference frame set of a video frame includes a forward reference frame set or a backward reference frame set, then for a reference coding block, the initial reference frame set corresponding to that reference coding block includes the forward reference frame set or the backward reference frame set, and the reference frame corresponding to that reference coding block is selected from the forward or backward reference frame set. Since the initial reference frame sets of the reference coding blocks in the same video frame are the same, i.e., the reference frames of the reference coding blocks are all selected from the forward reference frame set or all from the backward reference frame set, the candidate reference frame set corresponding to the target coding block may include a forward candidate reference frame set or a backward candidate reference frame set.
For example, if the initial reference frame set of the target video frame is a forward reference frame set, the reference frame corresponding to each reference coding block is selected from the forward reference frame set; further, it can be known that the candidate reference frame set constructed by the reference frames corresponding to the respective reference encoding blocks is a forward candidate reference frame set. For another example, if the initial reference frame set of the target video frame is a backward reference frame set, the reference frame corresponding to each reference coding block is selected from the backward reference frame set; further, it can be known that the candidate reference frame set constructed by the reference frames corresponding to the respective reference encoding blocks is a backward candidate reference frame set.
When the initial reference frame set of a video frame includes both a forward reference frame set and a backward reference frame set, then for a reference coding block, the initial reference frame set corresponding to that reference coding block includes the forward reference frame set and the backward reference frame set, and the reference frames corresponding to that reference coding block are selected from both sets, e.g., one video frame may be selected from the forward reference frame set and one from the backward reference frame set as reference frames. The candidate reference frame set corresponding to the target coding block may then include a forward candidate reference frame set and a backward candidate reference frame set.
For example, if the initial reference frame set of the target video frame is a forward reference frame set and a backward reference frame set, the reference frames corresponding to the respective reference encoded blocks are selected from the forward reference frame set and the backward reference frame set, e.g., the reference frame corresponding to each reference encoded block includes one video frame selected from the forward reference frame set and one video frame selected from the backward reference frame set. Further, it is known that the reference frame selected from the forward reference frame set by each reference coding block may be constructed as a forward candidate reference frame set, and the reference frame selected from the backward reference frame set by each reference coding block may be constructed as a backward candidate reference frame set.
In one embodiment, after the computer device performs the first encoding process on each video frame of the video to be processed in step S401 to obtain the plurality of coding blocks included in each video frame and the reference frame of each coding block, the reference frame information of the reference frame corresponding to each coding block may also be stored. Optionally, at each encoding process between the first and N-1 th encoding processes, the reference frame information of the reference frame corresponding to each coding block at that encoding process may be stored, so that at the N-th encoding process, the reference frame information of one or more encoding processes between the first and N-1 th encoding processes can be used to determine the corresponding candidate reference frame set.
The reference frame information of the reference frame corresponding to a coding block may include the index of the reference frame in the initial reference frame set; the index may be the number of the reference frame in the initial reference frame set, so that the corresponding reference frame can be determined directly from the index. The number may be numeric, alphabetic, etc., and is not particularly limited. For example, if the reference frame information of the reference frame corresponding to a coding block is 2, the reference frame of that coding block is the 2nd video frame in the corresponding initial reference frame set; if the reference frame information is 5, the reference frame is the 5th video frame in the corresponding initial reference frame set.
Optionally, if the coding prediction mode is bidirectional prediction, the initial reference frame set includes a forward reference frame set and a backward reference frame set, i.e., the reference frame information of the reference frame corresponding to a coding block includes the indexes of the reference frame in both the forward and the backward reference frame set. If the coding prediction mode is unidirectional prediction, the initial reference frame set includes a forward reference frame set or a backward reference frame set, i.e., the index for the other reference frame set will not be present. For example, if the initial reference frame set includes a forward reference frame set, the reference frame information of the reference frame corresponding to a coding block includes the index of the reference frame in the forward reference frame set. As another example, if the initial reference frame set includes a backward reference frame set, the reference frame information includes the index of the reference frame in the backward reference frame set.
Based on this, the above determination of the candidate reference frame set of the target coding block can be implemented directly from the reference frame information (i.e., the indexes). In a specific implementation, the reference frame information of the reference frames of the respective reference coding blocks can first be acquired; the initial reference frame set corresponding to the target coding block is then filtered based on this reference frame information, and the filtered set is the candidate reference frame set. Optionally, the filtering operation may be implemented as follows: in the initial reference frame set corresponding to the target coding block, retain the video frames at the indexes indicated by the reference frame information of the respective reference coding blocks, and filter out the video frames whose indexes are not indicated by any of that reference frame information.
For example, taking the case where the initial reference frame set includes either a forward reference frame set or a backward reference frame set, assume that the initial reference frame set corresponding to the target coding block is a forward reference frame set {video frame 1, video frame 2, video frame 3, video frame 4}; the reference coding blocks corresponding to the target coding block are reference coding block 1 and reference coding block 2, where the index of the reference frame corresponding to reference coding block 1 is 2 and the index of the reference frame corresponding to reference coding block 2 is 3. The candidate reference frame set of the target coding block is then {video frame 2, video frame 3}, and this candidate reference frame set is a forward candidate reference frame set.
As another example, taking the case where the initial reference frame set includes both a forward reference frame set and a backward reference frame set, assume that the forward reference frame set corresponding to the target coding block is {video frame 11, video frame 12, video frame 13, video frame 14} and the backward reference frame set is {video frame 21, video frame 22, video frame 23, video frame 24}; the reference coding blocks corresponding to the target coding block are reference coding block 1 and reference coding block 2, where the index of the reference frame of reference coding block 1 is 12 in the forward reference frame set and 21 in the backward reference frame set, and the index of the reference frame of reference coding block 2 is 14 in the forward reference frame set and 23 in the backward reference frame set. The candidate reference frame set of the target coding block then includes a forward candidate reference frame set {video frame 12, video frame 14} and a backward candidate reference frame set {video frame 21, video frame 23}.
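The index-based filtering in the bidirectional example above can be sketched as follows. The mapping from index to frame name, and the names themselves, are illustrative only.

```python
def filter_by_indices(initial_list, kept_indices):
    """Keep only the entries of the initial list whose indexes were
    recorded as reference frame information for some reference coding
    block. initial_list maps index -> frame name."""
    return [frame for idx, frame in initial_list.items() if idx in kept_indices]

forward_initial = {11: "f11", 12: "f12", 13: "f13", 14: "f14"}
backward_initial = {21: "f21", 22: "f22", 23: "f23", 24: "f24"}
# Reference coding block 1 used indexes (12, 21); block 2 used (14, 23).
forward_candidates = filter_by_indices(forward_initial, {12, 14})
backward_candidates = filter_by_indices(backward_initial, {21, 23})
```

This reproduces the example: the forward candidate set keeps frames 12 and 14, and the backward candidate set keeps frames 21 and 23, while the other four frames of each initial list are filtered out.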
(2) If the comparison result is that the target quantization parameter is smaller than the reference quantization parameter, the specific manner of determining the candidate reference frame set is as follows:
the video frame with the shortest playing distance to the target video frame in the playback of the video to be processed can be obtained from the initial reference frame set corresponding to the target coding block, and the obtained video frame, together with the reference frames of the respective reference coding blocks in the target secondary coding process, is added to the candidate reference frame set of the target coding block. Optionally, if the initial reference frame set includes a forward reference frame set or a backward reference frame set, the video frame closest to the target video frame may be selected from the forward or backward reference frame set, respectively. If the initial reference frame set includes both a forward reference frame set and a backward reference frame set, one video frame closest to the target video frame may be selected from each of the two sets.
It will be appreciated that with a smaller quantization parameter the motion search is finer, so there is a greater likelihood that the search result obtained with the larger quantization parameter is not optimal. Moreover, the frames immediately before and after the target video frame in playing order are usually the most relevant to its image content, so they are the most likely to be selected in the end; keeping these video frames available reduces the loss of coding efficiency that would be caused by disabling the optimal reference frame. Therefore, to limit the loss of coding efficiency, the video frames of the initial reference frame set closest to the target video frame may be kept available even if they are not among the reference frames used by the corresponding one or more reference coding blocks in the target secondary coding process.
In summary, when the comparison result is that the target quantization parameter is smaller than the reference quantization parameter, the initial reference frame set corresponding to the target coding block can still be reduced to obtain the corresponding candidate reference frame set, thereby reducing computational complexity and saving the computing resources of the computer device while essentially leaving the coding efficiency undamaged.
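Case (2) can be sketched per reference list as follows: the reused frames from the earlier pass are kept, and the play-order-nearest frame of the initial list is additionally forced to stay available. Identifying frames by play-order index is a simplifying assumption for illustration.

```python
def candidates_with_nearest(initial_play_indices, reused, target_play_index):
    """Candidate set for the target QP < reference QP case: the frames
    reused from the target secondary coding process, plus the frame of
    the initial list closest to the target in playing order."""
    nearest = min(initial_play_indices,
                  key=lambda i: abs(i - target_play_index))
    return sorted(set(reused) | {nearest})
```

For a forward initial list at play indices [0, 1, 2, 3], reused frames [0, 1], and a target at play index 4, frame 3 is the nearest and is added, giving [0, 1, 3] — still smaller than the full initial list, but with the most likely optimal reference kept available.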
It should be noted that, in the case where the comparison result is that the target quantization parameter is smaller than the reference quantization parameter, the candidate reference frame set of the target coding block may also include one or more of the forward candidate reference frame set and the backward candidate reference frame set, and the relevant understanding may refer to the case in (1) above, which is not described herein.
It should be noted that the target secondary encoding process may be one or more encoding processes between the first and N-1 th encoding processes. When it is a single encoding process, the reference quantization parameter is a single parameter, and the quantization parameter comparison yields a single comparison result (the target quantization parameter is greater than or equal to the reference quantization parameter, or it is smaller). In this case, the candidate reference frame set may be determined as described in (1) or (2) above.
If the target secondary encoding process is a plurality of encoding processes between the first and N-1 th encoding processes, the reference quantization parameter is a plurality of parameters, and each reference quantization parameter needs to be compared with the target quantization parameter. The comparison result may then be that the target quantization parameter is greater than or equal to every reference quantization parameter, smaller than every reference quantization parameter, or greater than or equal to some reference quantization parameters and smaller than the others. In the first case, the candidate reference frame set may be determined as described in (1) above; in the second case, as described in (2) above; in the mixed case, the manners described in (1) and (2) above may be combined.
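The three-way decision above can be captured in a small helper. The strategy labels are hypothetical names for cases (1), (2) and the mixed case; they are not terminology from the patent.

```python
def decide_strategy(target_qp, reference_qps):
    """Classify the multi-pass QP comparison into the three cases
    described above."""
    if all(target_qp >= qp for qp in reference_qps):
        return "reuse-only"          # case (1) applies to every pass
    if all(target_qp < qp for qp in reference_qps):
        return "reuse-plus-nearest"  # case (2) applies to every pass
    return "mixed"                   # apply (1) or (2) per pass
```

For example, with reference QPs [26, 28] a target QP of 30 falls into case (1), a target QP of 20 into case (2), and a target QP of 27 into the mixed case.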
S404, if a reference coding block using the intra coding mode exists among the reference coding blocks, the initial reference frame set corresponding to the reference coding blocks is used as the candidate reference frame set of the target coding block.
S405, determining a target reference frame of the target coding block from the candidate reference frame set, and performing coding prediction on the target coding block based on the target reference frame.
In one implementation, as previously indicated, if the coding prediction mode of the reference coding block in the target secondary coding process is a unidirectional prediction mode, the candidate reference frame set may include either a forward candidate reference frame set or a backward candidate reference frame set. In this case, a specific implementation of determining the target reference frame of the target coding block from the candidate reference frame set in step S405 may be: first, performing motion search processing on each reference frame in the forward candidate reference frame set or the backward candidate reference frame set, and then determining the target reference frame from that set based on the motion search processing. The target reference frame is one frame of the forward candidate reference frame set or the backward candidate reference frame set.
In one implementation, if the coding prediction mode of the reference coding block in the target secondary coding process is a bidirectional prediction mode, the candidate reference frame set includes both a forward candidate reference frame set and a backward candidate reference frame set. In this case, a specific implementation of determining the target reference frame of the target coding block from the candidate reference frame set in step S405 may be: first, performing motion search processing on each reference frame in the forward candidate reference frame set and the backward candidate reference frame set, and then determining target reference frames from the two sets respectively based on the motion search processing. The target reference frames include one frame of the forward candidate reference frame set and one frame of the backward candidate reference frame set.
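A minimal sketch of such a selection, assuming the motion-search cost is approximated by a sum of absolute differences over flattened pixel arrays (`sad_cost`, `select_target_reference`, and the list-of-ints frame representation are all illustrative assumptions, not the patent's actual motion search):

```python
def sad_cost(block, frame):
    # Stand-in motion-search cost: sum of absolute pixel differences.
    return sum(abs(a - b) for a, b in zip(block, frame))

def select_target_reference(block, forward_set, backward_set=None):
    """Unidirectional mode: only one set is given, one frame is returned.
    Bidirectional mode: both sets are given, one frame per set is returned."""
    best_fwd = min(forward_set, key=lambda f: sad_cost(block, f))
    if backward_set is None:
        return best_fwd
    best_bwd = min(backward_set, key=lambda f: sad_cost(block, f))
    return best_fwd, best_bwd
```

The same routine covers both prediction modes: passing one set yields a single target reference frame, passing both yields one from each.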
In one implementation, after the target reference frame is determined from the candidate reference frame set, a region whose pixel distribution is most similar to that of the target coding block may be searched for in the target reference frame based on motion search, and the target coding block may be coding-predicted using the searched region to obtain the encoded target coding block. For example, referring to fig. 5a, if the candidate reference frame set includes a forward candidate reference frame set or a backward candidate reference frame set, a motion search may be performed on the target reference frame of the forward candidate reference frame set or the backward candidate reference frame set to find the region most similar to the pixel distribution of the target coding block, and coding prediction may be performed using the searched region. As another example, referring to fig. 5b, if the candidate reference frame set includes a forward candidate reference frame set and a backward candidate reference frame set, a motion search may be performed on the target reference frames corresponding to the forward candidate reference frame set and the backward candidate reference frame set, respectively, to find the region most similar to the pixel distribution of the target coding block, and coding prediction may be performed using the searched regions.
For a better understanding of the video encoding method according to the embodiments of the present application, video encoding is further described below with reference to fig. 5c, taking the first coding process as an example of the target secondary coding process. For example, referring to fig. 5c, consider a target coding block of a target video frame of the video to be processed in the Nth coding process. When the target coding block needs to be encoded, the reference frame information of the reference frame corresponding to each reference coding block can be acquired, where a reference coding block is a coding block that has an overlapping relationship with the target coding block in the coding block division result of the target video frame in the first coding process.
After the reference frame information of the reference coding blocks is acquired, it may further be determined whether a coding block using the intra prediction mode exists among the reference coding blocks. If no such coding block exists, the initial reference frame set of the target coding block may be reduced using the reference frames corresponding to the reference coding blocks to obtain the candidate reference frame set. As previously described, the reference frame information may include the index of the reference frame in the initial reference frame set; that is, the initial reference frame set of the target coding block is reduced based on the video frames indicated by the indexes in the reference frame information. If a coding block using the intra prediction mode does exist, the initial reference frame set corresponding to the reference coding blocks may be taken as the candidate reference frame set.
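The intra check and index-based reduction above can be sketched as follows; this is a hypothetical helper under assumed names, where `initial_set` is an ordered list of frames and `ref_frame_indices` the indexes carried in the reference frame information:

```python
def reduce_initial_set(initial_set, ref_block_modes, ref_frame_indices):
    """Reduce the target block's initial reference frame set.

    If any overlapping reference coding block was intra-coded, no usable
    reference frame index is available for it, so the full initial set is
    kept. Otherwise only the frames indexed by the earlier-pass reference
    blocks remain.
    """
    if any(mode == "intra" for mode in ref_block_modes):
        return list(initial_set)
    # Deduplicate indexes while preserving the order of the initial set.
    keep = set(ref_frame_indices)
    return [frame for i, frame in enumerate(initial_set) if i in keep]
```
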
Alternatively, in the case where no coding block using the intra prediction mode exists, the candidate reference frame set of the target coding block may be further determined based on the size relationship between the target quantization parameter of the target video frame in the Nth coding process and the reference quantization parameter of the target video frame in the first coding process. For example, if the target quantization parameter is greater than or equal to the reference quantization parameter, the reference frames of the reference coding blocks in the first coding process may be added to the candidate reference frame set of the target coding block. If the target quantization parameter is smaller than the reference quantization parameter, the video frame whose playback position is closest to that of the target video frame may additionally be listed as a usable candidate reference frame; that is, the video frame with the shortest playing distance from the target video frame in the playback of the video to be processed is acquired from the initial reference frame set corresponding to the target coding block, and the acquired video frame, together with the reference frames of the reference coding blocks in the first coding process, is added to the candidate reference frame set of the target coding block.
After the candidate reference frame set is determined, each video frame in the candidate reference frame set may be encoded to determine a reference mode (coding prediction mode) and the target reference frame from the candidate reference frame set. In a specific implementation, motion search processing may be performed on each frame in the candidate reference frame set to determine the reference mode as well as the target reference frame. The reference mode may be determined based on which sets the candidate reference frame set includes: if the candidate reference frame set includes a forward candidate reference frame set or a backward candidate reference frame set, the reference mode is a unidirectional prediction mode; if it includes both a forward candidate reference frame set and a backward candidate reference frame set, the reference mode is a bidirectional prediction mode.
In the embodiment of the application, in a scenario where the video to be processed is encoded multiple times, when a target coding block is encoded in the Nth coding process (such as the second or a subsequent coding process), the reference frame selection results of the coding blocks in the target secondary coding process are multiplexed to reduce the initial reference frame set corresponding to the target coding block. This reduces the computational complexity of reference frame selection, shortens the bitstream encoding time and the cost of computing resources, and cuts the time spent on selecting a reference frame, thereby improving encoding efficiency and saving computing resources.
Fig. 6 is a schematic structural diagram of a video encoding device according to an embodiment of the present application. The video encoding device described in the present embodiment includes:
a first encoding unit 601, configured to obtain a video to be processed, and perform a first encoding process on each video frame in the video to be processed, to obtain a plurality of encoding blocks included in each video frame and a reference frame of each encoding block; the video frame comprises a plurality of coding blocks which are obtained by dividing the video frame, and the reference frame of one coding block is the video frame which is utilized when the coding block is coded and predicted in the video to be processed;
A first determining unit 602, configured to determine, when an nth encoding process is performed on a target video frame of the video to be processed, one or more reference encoding blocks having an overlapping relationship with a target encoding block from a plurality of encoding blocks included in the target video frame at the time of the target encoding process; the target video frame is the video frame where the target coding block is located, the target secondary coding process is one or more times of coding processes between the first coding process and the N-1 th coding process, and N is a positive integer greater than 1;
a second determining unit 603, configured to determine a candidate reference frame set of the target coding block based on a coding mode of each reference coding block and a reference frame of each reference coding block in the target secondary coding process; the coding mode comprises an inter-coding mode or an intra-coding mode;
a second encoding unit 604, configured to determine a target reference frame of the target coding block from the candidate reference frame set, and perform coding prediction on the target coding block based on the target reference frame.
In one implementation, the second determining unit 603 is specifically configured to:
if the reference coding blocks using the intra-frame coding mode do not exist in the reference coding blocks, determining a candidate reference frame set of the target coding block based on the reference frames of the reference coding blocks in the target secondary coding process;
If the reference coding blocks using the intra-frame coding mode exist in the reference coding blocks, taking an initial reference frame set corresponding to the reference coding blocks as a candidate reference frame set of the target coding block; the reference frame of any one of the one or more reference coding blocks exists in an initial reference frame set corresponding to the any one reference coding block; the initial reference frame sets corresponding to different coding blocks in the same video frame are the same.
In one implementation, the second determining unit 603 is specifically configured to:
acquiring a target quantization parameter of a target video frame in the Nth coding process, and acquiring a reference quantization parameter of the target video frame in the target coding process;
comparing the target quantization parameter with the reference quantization parameter in size;
and determining a candidate reference frame set of the target coding block based on the comparison result and the reference frames of the respective reference coding blocks in the target secondary coding process.
In one implementation, the second determining unit 603 is specifically configured to:
if the comparison result shows that the target quantization parameter is greater than or equal to the reference quantization parameter, adding the reference frames of each reference coding block in the target secondary coding process into a candidate reference frame set of the target coding block;
If the comparison result is that the target quantization parameter is smaller than the reference quantization parameter, acquiring a video frame with the shortest playing distance from the target video frame in the playing of the video to be processed from an initial reference frame set corresponding to the target coding block, and adding the acquired video frame and the reference frames of each reference coding block in the target secondary coding process to a candidate reference frame set of the target coding block.
In one implementation, the first determining unit 602 is specifically configured to:
acquiring the region position of the target coding block in the target video frame when the N-th coding processing is performed;
acquiring the region position, in the target video frame, of each of the plurality of coding blocks included in the target video frame at the time of the target secondary coding process;
and determining a coding block with an overlapping region with the region position of the target coding block from the plurality of coding blocks according to the region position of each coding block in the target video frame, and taking the determined coding block as a reference coding block with an overlapping relationship with the target coding block.
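The overlap test described by the three steps above can be sketched as an axis-aligned rectangle intersection; the region tuples `(x, y, w, h)` and the dictionary layout are illustrative assumptions, not the patent's data structures:

```python
def overlaps(a, b):
    # Two axis-aligned rectangles (x, y, w, h) overlap iff they
    # intersect with positive area on both axes.
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def find_reference_blocks(target_region, prior_pass_blocks):
    """Return the coding blocks of the target secondary coding process whose
    regions overlap the target coding block's region in the Nth process."""
    return [blk for blk in prior_pass_blocks
            if overlaps(target_region, blk["region"])]
```

Because block partitions may differ between coding passes, a single target block can overlap several earlier-pass blocks, which is why the unit returns one or more reference coding blocks.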
In one implementation, the coding prediction mode of the reference coding block in the target secondary coding process is a unidirectional prediction mode, and the candidate reference frame set includes any one of a forward candidate reference frame set or a backward candidate reference frame set; the second encoding unit 604 is specifically configured to:
Performing motion search processing on each reference frame in the forward candidate reference frame set or the backward candidate reference frame set;
determining a target reference frame from the set of forward reference frames or the set of backward reference frames based on the motion search process; the target reference frame comprises one of the forward set of candidate reference frames or the backward set of candidate reference frames.
In one implementation, the coding prediction mode of the reference coding block in the target secondary coding process is a bi-directional prediction mode, and the candidate reference frame set includes a forward candidate reference frame set and a backward candidate reference frame set; the second encoding unit 604 is specifically configured to:
performing motion search processing on each reference frame in the forward candidate reference frame set and the backward candidate reference frame set;
determining target reference frames from the forward candidate reference frame set and the backward candidate reference frame set respectively based on the motion search process; the target reference frame includes one of the set of forward candidate reference frames and one of the set of backward candidate reference frames.
It will be appreciated that the division of the units in the embodiments of the present application is illustrative, and is merely a logic function division, and other division manners may be actually implemented. Each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
Fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application. The computer device described in the present embodiment includes: a processor 701, a memory 702 and a network interface 703. Data may be interacted between the processor 701, the memory 702, and the network interface 703.
The processor 701 may be a central processing unit (Central Processing Unit, CPU), and may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 702 may include read only memory and random access memory and provides program instructions and data to the processor 701. A portion of the memory 702 may also include non-volatile random access memory. Wherein the processor 701, when calling the program instructions, is configured to execute:
acquiring a video to be processed, and performing first coding processing on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block; the video frame comprises a plurality of coding blocks which are obtained by dividing the video frame, and the reference frame of one coding block is the video frame which is utilized when the coding block is coded and predicted in the video to be processed;
when N-th encoding processing is carried out on a target video frame of the video to be processed, determining one or more reference encoding blocks with overlapping relation with a target encoding block from a plurality of encoding blocks included in the target video frame during the target encoding processing; the target video frame is the video frame where the target coding block is located, the target secondary coding process is one or more times of coding processes between the first coding process and the N-1 th coding process, and N is a positive integer greater than 1;
determining a candidate reference frame set of the target coding block based on the coding mode of each reference coding block and the reference frames of each reference coding block in the target secondary coding process; the coding mode comprises an inter-coding mode or an intra-coding mode;
and determining a target reference frame of the target coding block from the candidate reference frame set, and carrying out coding prediction on the target coding block based on the target reference frame.
In one implementation, the processor 701 is specifically configured to:
if the reference coding blocks using the intra-frame coding mode do not exist in the reference coding blocks, determining a candidate reference frame set of the target coding block based on the reference frames of the reference coding blocks in the target secondary coding process;
if the reference coding blocks using the intra-frame coding mode exist in the reference coding blocks, taking an initial reference frame set corresponding to the reference coding blocks as a candidate reference frame set of the target coding block; the reference frame of any one of the one or more reference coding blocks exists in an initial reference frame set corresponding to the any one reference coding block; the initial reference frame sets corresponding to different coding blocks in the same video frame are the same.
In one implementation, the processor 701 is specifically configured to:
acquiring a target quantization parameter of a target video frame in the Nth coding process, and acquiring a reference quantization parameter of the target video frame in the target coding process;
comparing the target quantization parameter with the reference quantization parameter in size;
and determining a candidate reference frame set of the target coding block based on the comparison result and the reference frames of the respective reference coding blocks in the target secondary coding process.
In one implementation, the processor 701 is specifically configured to:
if the comparison result shows that the target quantization parameter is greater than or equal to the reference quantization parameter, adding the reference frames of each reference coding block in the target secondary coding process into a candidate reference frame set of the target coding block;
if the comparison result is that the target quantization parameter is smaller than the reference quantization parameter, acquiring a video frame with the shortest playing distance from the target video frame in the playing of the video to be processed from an initial reference frame set corresponding to the target coding block, and adding the acquired video frame and the reference frames of each reference coding block in the target secondary coding process to a candidate reference frame set of the target coding block.
In one implementation, the processor 701 is specifically configured to:
acquiring the region position of the target coding block in the target video frame when the N-th coding processing is performed;
acquiring the region position, in the target video frame, of each of the plurality of coding blocks included in the target video frame at the time of the target secondary coding process;
and determining a coding block with an overlapping region with the region position of the target coding block from the plurality of coding blocks according to the region position of each coding block in the target video frame, and taking the determined coding block as a reference coding block with an overlapping relationship with the target coding block.
In one implementation, the coding prediction mode of the reference coding block in the target secondary coding process is a unidirectional prediction mode, and the candidate reference frame set includes any one of a forward candidate reference frame set or a backward candidate reference frame set; the processor 701 is specifically configured to:
performing motion search processing on each reference frame in the forward candidate reference frame set or the backward candidate reference frame set;
determining a target reference frame from the set of forward reference frames or the set of backward reference frames based on the motion search process; the target reference frame comprises one of the forward set of candidate reference frames or the backward set of candidate reference frames.
In one implementation, the coding prediction mode of the reference coding block in the target secondary coding process is a bi-directional prediction mode, and the candidate reference frame set includes a forward candidate reference frame set and a backward candidate reference frame set; the processor 701 is specifically configured to:
performing motion search processing on each reference frame in the forward candidate reference frame set and the backward candidate reference frame set;
determining target reference frames from the forward candidate reference frame set and the backward candidate reference frame set respectively based on the motion search process; the target reference frame includes one of the set of forward candidate reference frames and one of the set of backward candidate reference frames.
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of action combinations; however, those skilled in the art should understand that the present application is not limited by the described order of actions, since some steps may be performed in another order or simultaneously according to the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.
The embodiment of the application further provides a computer readable storage medium, in which program instructions are stored, and when the program instructions are executed, part or all of the steps in the above method are implemented, which is not described herein.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program to instruct related hardware, the program may be stored in a computer readable storage medium, and the storage medium may include: flash disk, read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), magnetic or optical disk, and the like.
Embodiments of the present application also provide a computer program product or computer program comprising program instructions which, when executed, e.g. by a processor, may carry out some or all of the steps of the above-described method. Alternatively, the program instructions are stored in a computer-readable storage medium. The program instructions are read from the computer-readable storage medium by a processor of the computer device, and executed by the processor, cause the computer device to perform the steps performed in the embodiments of the methods described above.
The foregoing has described in detail the video encoding method, apparatus, computer device, and storage medium provided by the embodiments of the present application, and specific examples have been applied herein to illustrate the principles and implementations of the present application; the foregoing examples are provided only to assist in understanding the methods and core ideas of the present application. Meanwhile, those skilled in the art may make modifications to the specific embodiments and the application scope in accordance with the ideas of the present application; in view of the above, this description should not be construed as limiting the present application.

Claims (10)

1. A method of video encoding, the method comprising:
acquiring a video to be processed, and performing first coding processing on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block; the video frame comprises a plurality of coding blocks which are obtained by dividing the video frame, and the reference frame of one coding block is the video frame which is utilized when the coding block is coded and predicted in the video to be processed;
when N-th encoding processing is carried out on a target video frame of the video to be processed, determining one or more reference encoding blocks with overlapping relation with a target encoding block from a plurality of encoding blocks included in the target video frame during the target encoding processing; the target video frame is the video frame where the target coding block is located, the target secondary coding process is one or more times of coding processes between the first coding process and the N-1 th coding process, and N is a positive integer greater than 1;
determining a candidate reference frame set of the target coding block based on the coding mode of each reference coding block and the reference frames of each reference coding block in the target secondary coding process; the coding mode comprises an inter-coding mode or an intra-coding mode;
and determining a target reference frame of the target coding block from the candidate reference frame set, and carrying out coding prediction on the target coding block based on the target reference frame.
2. The method of claim 1, wherein the determining the candidate set of reference frames for the target encoded block based on the encoding mode of the respective reference encoded block and the reference frames of the respective reference encoded block in the target secondary encoding process comprises:
if the reference coding blocks using the intra-frame coding mode do not exist in the reference coding blocks, determining a candidate reference frame set of the target coding block based on the reference frames of the reference coding blocks in the target secondary coding process;
if the reference coding blocks using the intra-frame coding mode exist in the reference coding blocks, taking an initial reference frame set corresponding to the reference coding blocks as a candidate reference frame set of the target coding block; the reference frame of any one of the one or more reference coding blocks exists in an initial reference frame set corresponding to the any one reference coding block; the initial reference frame sets corresponding to different coding blocks in the same video frame are the same.
3. The method according to claim 1 or 2, wherein said determining a set of candidate reference frames for the target coding block based on the reference frames of the respective reference coding block in the target secondary coding process comprises:
acquiring a target quantization parameter of a target video frame in the Nth coding process, and acquiring a reference quantization parameter of the target video frame in the target coding process;
comparing the target quantization parameter with the reference quantization parameter in size;
and determining a candidate reference frame set of the target coding block based on the comparison result and the reference frames of the respective reference coding blocks in the target secondary coding process.
4. A method according to claim 3, wherein said determining a set of candidate reference frames for the target coding block based on the comparison and the reference frames for the respective reference coding block in the target secondary coding process comprises:
if the comparison result shows that the target quantization parameter is greater than or equal to the reference quantization parameter, adding the reference frames of each reference coding block in the target secondary coding process into a candidate reference frame set of the target coding block;
If the comparison result is that the target quantization parameter is smaller than the reference quantization parameter, acquiring a video frame with the shortest playing distance from the target video frame in the playing of the video to be processed from an initial reference frame set corresponding to the target coding block, and adding the acquired video frame and the reference frames of each reference coding block in the target secondary coding process to a candidate reference frame set of the target coding block.
5. The method of claim 1, wherein the determining one or more reference encoded blocks having an overlapping relationship with a target encoded block from a plurality of encoded blocks corresponding to the target video frame at the time of the target secondary encoding process comprises:
acquiring the region position of the target coding block in the target video frame when the N-th coding processing is performed;
acquiring the region position, in the target video frame, of each of the plurality of coding blocks included in the target video frame at the time of the target secondary coding process;
and determining a coding block with an overlapping region with the region position of the target coding block from the plurality of coding blocks according to the region position of each coding block in the target video frame, and taking the determined coding block as a reference coding block with an overlapping relationship with the target coding block.
6. The method according to claim 1, wherein the coding prediction mode of the reference coding block in the target secondary coding process is a unidirectional prediction mode, and the candidate reference frame set includes any one of a forward candidate reference frame set or a backward candidate reference frame set;
the determining the target reference frame of the target coding block from the candidate reference frame set includes:
performing motion search processing on each reference frame in the forward candidate reference frame set or the backward candidate reference frame set;
determining a target reference frame from the forward candidate reference frame set or the backward candidate reference frame set based on the motion search processing; the target reference frame is one reference frame in the forward candidate reference frame set or the backward candidate reference frame set.
7. The method of claim 1, wherein the coding prediction mode of the reference coding block in the target secondary coding process is a bi-directional prediction mode, and the candidate reference frame set comprises a forward candidate reference frame set and a backward candidate reference frame set;
the determining the target reference frame of the target coding block from the candidate reference frame set includes:
performing motion search processing on each reference frame in the forward candidate reference frame set and the backward candidate reference frame set;
determining target reference frames from the forward candidate reference frame set and the backward candidate reference frame set respectively based on the motion search processing; the target reference frames include one reference frame in the forward candidate reference frame set and one reference frame in the backward candidate reference frame set.
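Claims 6 and 7 differ only in how many candidate sets the motion search draws from. A toy sketch, assuming the sum of absolute differences over co-located pixels as the search cost (a common choice, though the claims do not specify one; all names hypothetical):

```python
def sad(block, ref_block):
    # Sum of absolute differences: a simple motion-search cost metric.
    return sum(abs(p - q) for p, q in zip(block, ref_block))

def pick_best(target_block, candidate_set):
    # Select one reference frame per candidate set by minimal SAD.
    # Each candidate is a (frame_index, pixels) pair.
    return min(candidate_set, key=lambda ref: sad(target_block, ref[1]))

def select_references(target_block, forward_set, backward_set, bidirectional):
    if bidirectional:
        # Claim 7: one target reference frame from each set.
        return pick_best(target_block, forward_set), pick_best(target_block, backward_set)
    # Claim 6: one target reference frame from whichever single set applies.
    return pick_best(target_block, forward_set or backward_set)
```

A production motion search would of course evaluate many displaced candidate positions per frame, not just the co-located block.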
8. A video encoding apparatus, comprising:
the first coding unit is used for acquiring a video to be processed, and performing a first coding process on each video frame in the video to be processed to obtain a plurality of coding blocks included in each video frame and a reference frame of each coding block; the plurality of coding blocks included in a video frame are obtained by dividing that video frame, and the reference frame of a coding block is the video frame in the video to be processed that is used when coding prediction is performed on the coding block;
a first determining unit configured to determine, when an Nth coding process is performed on a target video frame of the video to be processed, one or more reference coding blocks having an overlapping relationship with the target coding block from a plurality of coding blocks included in the target video frame during the target secondary coding process; the target video frame is the video frame where the target coding block is located, the target secondary coding process is one or more of the coding processes from the first coding process to the (N-1)-th coding process, and N is a positive integer greater than 1;
a second determining unit configured to determine a candidate reference frame set of the target coding block based on the coding mode of each reference coding block and the reference frame of each reference coding block in the target secondary coding process; the coding mode comprises an inter-coding mode or an intra-coding mode;
and the second coding unit is used for determining a target reference frame of the target coding block from the candidate reference frame set and carrying out coding prediction on the target coding block based on the target reference frame.
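Taken together, the units of claim 8 describe a multi-pass flow: a first pass records per-block reference frames, and a later pass reuses them for overlapping blocks. A minimal sketch with the first pass stubbed (every block simply references the previous frame; all names hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class CodingBlock:
    rect: tuple                                    # (x, y, w, h) within the frame
    ref_frames: set = field(default_factory=set)   # reference-frame indices

def first_pass(frame_block_rects):
    """Stub first coding pass: each block references the previous frame.

    frame_block_rects: per-frame lists of block rectangles.
    A real encoder would run full motion estimation here.
    """
    history = []
    for idx, rects in enumerate(frame_block_rects):
        history.append([CodingBlock(r, {max(idx - 1, 0)}) for r in rects])
    return history

def later_pass_candidates(history, frame_idx, target_rect):
    """Union of reference frames of the prior-pass blocks overlapping the target block."""
    def overlap(a, b):
        return a[0] < b[0] + b[2] and b[0] < a[0] + a[2] and \
               a[1] < b[1] + b[3] and b[1] < a[1] + a[3]
    candidates = set()
    for blk in history[frame_idx]:
        if overlap(target_rect, blk.rect):
            candidates |= blk.ref_frames
    return candidates
```

The point of the scheme is that the later pass searches only this reduced candidate set instead of the full reference list, which is where the claimed efficiency gain comes from.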
9. A computer device comprising a processor and a memory, wherein the memory is configured to store a computer program, the computer program comprising program instructions, and the processor is configured to invoke the program instructions to perform the method of any one of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores program instructions which, when executed, carry out the method according to any one of claims 1-7.
CN202211743154.0A 2022-12-30 2022-12-30 Video coding method, device, computer equipment and storage medium Pending CN116033148A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211743154.0A CN116033148A (en) 2022-12-30 2022-12-30 Video coding method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116033148A true CN116033148A (en) 2023-04-28

Family

ID=86073613

Country Status (1)

Country Link
CN (1) CN116033148A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination