CN111263151A - Video encoding method, video encoding device, electronic device, and computer-readable storage medium - Google Patents


Info

Publication number
CN111263151A
CN111263151A (application number CN202010336379.9A)
Authority
CN
China
Prior art keywords
current, reference frame, prediction unit, coding unit, unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010336379.9A
Other languages
Chinese (zh)
Other versions
CN111263151B (en)
Inventor
张宏顺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010336379.9A
Publication of CN111263151A
Application granted
Publication of CN111263151B
Active legal status
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/107: Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/147: Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/61: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/96: Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure provides a video encoding method and apparatus, an electronic device, and a computer-readable storage medium, and relates to the field of video processing technologies. The method includes the following steps: obtaining a current coding unit of a current frame image in a target video; obtaining a first reference frame of a first prediction unit adjacent to the current coding unit; determining an inter-generation coding unit (a parent or child coding unit) of the current coding unit, the inter-generation coding unit including a second prediction unit, and obtaining a second reference frame of the second prediction unit; obtaining a third reference frame of a third prediction unit of the current coding unit in a target partition mode; determining, according to the first reference frame, the second reference frame, and the third reference frame, a candidate reference frame corresponding to a current prediction unit of the current coding unit in a current partition mode; determining a first rate-distortion cost of the current prediction unit relative to the candidate reference frame; and encoding the current coding unit of the current frame image according to the first rate-distortion cost. The method provided by the present disclosure can reduce the complexity of video encoding.

Description

Video encoding method, video encoding device, electronic device, and computer-readable storage medium
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to a video encoding method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Video coding refers to converting a file in one video format into a file in another video format through a specific compression technique.
Currently, video coding includes inter-frame coding and intra-frame coding. An important part of inter-coding a coding unit in a video is selecting an optimal target reference frame from multiple reference frames (generally taken to be the reference frame with the minimum rate-distortion cost among all the reference frames), so that the current coding unit can be predicted and encoded using that target reference frame.
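As a point of reference, the exhaustive baseline described above can be sketched as follows. This is an illustrative Python sketch, not the patented method: `rd_cost` is a hypothetical callback standing in for the encoder's real cost computation (rate-distortion cost J = D + lambda * R, where D is the distortion and R the bit rate).

```python
# Baseline exhaustive reference-frame selection: evaluate the RD cost of
# every reference frame and keep the cheapest one. `rd_cost` is a
# hypothetical callback; a real encoder derives D from the prediction
# error (e.g. SAD/SATD) and R from the bits spent on motion data.

def select_target_reference_frame(reference_frames, rd_cost):
    """Return the (frame, cost) pair with the minimum rate-distortion cost."""
    best_frame, best_cost = None, float("inf")
    for frame in reference_frames:
        cost = rd_cost(frame)
        if cost < best_cost:
            best_frame, best_cost = frame, cost
    return best_frame, best_cost
```

The point of the disclosure is precisely to shrink the list this search runs over, since evaluating every reference frame dominates inter-prediction time.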
In the inter-frame prediction process, the processing complexity of target reference frame selection has a great influence on the efficiency of video coding. How to reduce the processing complexity of target reference frame selection, and thereby reduce the complexity and improve the efficiency of video coding, has therefore long been studied by those skilled in the art.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The embodiments of the present disclosure provide a video encoding method and apparatus, an electronic device, and a computer-readable storage medium, which can reduce the processing complexity of target reference frame selection, thereby reducing the complexity of video encoding and improving its efficiency.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
An embodiment of the present disclosure provides a video encoding method, which includes: obtaining a current coding unit of a current frame image in a target video; obtaining a first reference frame of a first prediction unit adjacent to the current coding unit, the first prediction unit having been encoded; determining an inter-generation coding unit (a parent or child coding unit) of the current coding unit, the inter-generation coding unit including a second prediction unit, and obtaining a second reference frame of the second prediction unit, the second prediction unit having been encoded; obtaining a third reference frame of a third prediction unit of the current coding unit in a target partition mode, the third prediction unit having been encoded; determining, according to the first reference frame, the second reference frame, and the third reference frame, a candidate reference frame corresponding to a current prediction unit of the current coding unit in a current partition mode; and determining a first rate-distortion cost of the current prediction unit relative to the candidate reference frame, so as to encode the current coding unit of the current frame image according to the first rate-distortion cost.
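The steps above boil down to building a small candidate set from three already-encoded sources instead of searching all reference frames. A minimal illustrative Python sketch follows; the function name and the representation of reference frames as plain identifiers are assumptions, not the patent's implementation.

```python
def build_candidate_reference_frames(first_refs, second_refs, third_refs):
    """Merge reference frames gathered from the adjacent prediction units,
    the inter-generation (parent/child) coding unit, and the target
    partition mode, dropping duplicates while preserving first-seen order
    so each candidate is rate-distortion evaluated only once."""
    seen, candidates = set(), []
    for ref in (*first_refs, *second_refs, *third_refs):
        if ref not in seen:
            seen.add(ref)
            candidates.append(ref)
    return candidates
```

The current prediction unit is then evaluated only against this merged list, which is typically much shorter than the full reference list.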
An embodiment of the present disclosure provides a video encoding apparatus, which may include: a current coding unit acquisition module, a first reference frame acquisition module, a second reference frame acquisition module, a third reference frame acquisition module, a candidate reference frame determination module, a rate-distortion cost determination module, and an encoding module.
The current coding unit acquisition module is configured to obtain a current coding unit of a current frame image in a target video. The first reference frame acquisition module is configured to obtain a first reference frame of a first prediction unit adjacent to the current coding unit, the first prediction unit having been encoded. The second reference frame acquisition module is configured to determine an inter-generation coding unit of the current coding unit, the inter-generation coding unit including a second prediction unit, and to obtain a second reference frame of the second prediction unit, the second prediction unit having been encoded. The third reference frame acquisition module is configured to obtain a third reference frame of a third prediction unit of the current coding unit in a target partition mode, the third prediction unit having been encoded. The candidate reference frame determination module is configured to determine, according to the first reference frame, the second reference frame, and the third reference frame, a candidate reference frame corresponding to a current prediction unit of the current coding unit in a current partition mode. The rate-distortion cost determination module is configured to determine a first rate-distortion cost of the current prediction unit relative to the candidate reference frame. The encoding module is configured to encode the current coding unit of the current frame image according to the first rate-distortion cost.
In some embodiments, the current partition mode is the global partition mode.
In some embodiments, the candidate reference frame determination module may include: a candidate reference frame determination unit in the global partition mode and a candidate reference frame enhancement unit.
The candidate reference frame determination unit in the global partition mode is configured to generate candidate reference frames of the current prediction unit of the current coding unit in the global partition mode according to the first reference frame and the second reference frame. The candidate reference frame enhancement unit is configured to additionally use the first k1 or last k2 reference frames of the current frame image as candidate reference frames of the current prediction unit of the current coding unit in the global partition mode, according to the frame type of the current frame image, where k1 and k2 are both integers greater than or equal to 1.
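A minimal sketch of the enhancement step, assuming reference frames are plain identifiers. The rule that one frame type takes the first k1 entries while other types take the last k2 is an illustrative assumption — the text only states that the choice depends on the frame type.

```python
def enhance_candidates(candidates, reference_list, frame_type, k1=1, k2=1):
    """Append the first k1 or last k2 frames of the current frame's
    reference list to the candidate set, depending on frame type.
    The mapping of frame type to list end is a hypothetical example."""
    extra = reference_list[:k1] if frame_type == "P" else reference_list[-k2:]
    return candidates + [ref for ref in extra if ref not in candidates]
```

This guarantees the candidate set is never empty even when the neighbouring and inter-generation units contribute no usable reference frames.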
In some embodiments, the current partition mode is a symmetric partition mode, and the target partition mode includes the global partition mode.
In some embodiments, the candidate reference frame determination module may include: a first candidate reference frame determination unit in the symmetric partition mode.
The first candidate reference frame determination unit in the symmetric partition mode is configured to use the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partition mode as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
In some embodiments, the current partition mode is a symmetric partition mode, and the target partition mode includes the global partition mode and a previously processed symmetric partition mode.
In some embodiments, the candidate reference frame determination module may include: a second candidate reference frame determination unit in the symmetric partition mode.
The second candidate reference frame determination unit in the symmetric partition mode is configured to use the first reference frame, the second reference frame, and the third reference frames of the third prediction units of the current coding unit in the global partition mode and the previously processed symmetric partition mode as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
In some embodiments, the current partition mode is an asymmetric partition mode, and the target partition mode includes the global partition mode and a symmetric partition mode.
In some embodiments, the candidate reference frame determination module may include: a first candidate reference frame determination unit in the asymmetric partition mode.
The first candidate reference frame determination unit in the asymmetric partition mode is configured to use the first reference frame, the second reference frame, and the third reference frames of the third prediction units of the current coding unit in the global partition mode and the symmetric partition mode as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
In some embodiments, the current partition mode is an asymmetric partition mode, the current coding unit includes a first current prediction unit in the asymmetric partition mode, the target partition mode includes the global partition mode and a symmetric partition mode, and the current coding unit includes a first target prediction unit in the symmetric partition mode.
In some embodiments, the candidate reference frame determination module may include: a target first reference frame determination unit, a target second reference frame determination unit, and a second candidate reference frame determination unit in the asymmetric partition mode.
The target first reference frame determination unit is configured to determine the target first reference frame with the smallest rate-distortion cost from the candidate reference frames of the prediction unit of the current coding unit in the global partition mode. The target second reference frame determination unit is configured to determine the target second reference frame with the smallest rate-distortion cost from the candidate reference frames of the first target prediction unit. The second candidate reference frame determination unit in the asymmetric partition mode is configured to use the target first reference frame and the target second reference frame as the candidate reference frames corresponding to the first current prediction unit of the current coding unit in the asymmetric partition mode.
In some embodiments, the current coding unit further includes a second current prediction unit in the asymmetric partition mode, and the current coding unit further includes a second target prediction unit in the symmetric partition mode.
In some embodiments, the candidate reference frame determination module may include: a target third reference frame determination unit and a third candidate reference frame determination unit in the asymmetric partition mode.
The target third reference frame determination unit is configured to determine the target third reference frame with the minimum rate-distortion cost from the candidate reference frames of the second target prediction unit. The third candidate reference frame determination unit in the asymmetric partition mode is configured to use the target first reference frame and the target third reference frame as the candidate reference frames corresponding to the second current prediction unit of the current coding unit in the asymmetric partition mode.
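For the asymmetric case described above, an asymmetric prediction unit inherits only the two mode winners rather than a full candidate list. A minimal illustrative Python sketch, assuming the cost maps were filled in while evaluating the global and symmetric partition modes (all names are hypothetical):

```python
def asymmetric_pu_candidates(global_mode_costs, symmetric_pu_costs):
    """global_mode_costs / symmetric_pu_costs map each candidate reference
    frame to the RD cost already computed in the global partition mode and
    for the co-located symmetric-partition PU. The asymmetric PU's
    candidates are just the two minimum-cost winners, deduplicated."""
    target_first = min(global_mode_costs, key=global_mode_costs.get)
    target_other = min(symmetric_pu_costs, key=symmetric_pu_costs.get)
    candidates = [target_first]
    if target_other != target_first:
        candidates.append(target_other)
    return candidates
```

Because asymmetric partitions are evaluated last (after the global and symmetric modes), these per-mode winners are already available and no new motion search over the full reference list is needed.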
In some embodiments, the video encoding apparatus may include: a prediction unit minimum rate-distortion cost determination module, a per-partition-mode minimum rate-distortion cost determination module, a minimum-rate-distortion-cost partition mode determination module, and an encoding module.
The prediction unit minimum rate-distortion cost determination module is configured to obtain the minimum rate-distortion cost of each prediction unit of the coding unit in each partition mode. The per-partition-mode minimum rate-distortion cost determination module is configured to determine the minimum rate-distortion cost corresponding to each partition mode according to the minimum rate-distortion costs of the prediction units. The minimum-rate-distortion-cost partition mode determination module is configured to determine the partition mode with the minimum rate-distortion cost for the current coding unit according to the minimum rate-distortion cost of each partition mode. The encoding module is configured to encode the current coding unit according to the partition mode with the smallest rate-distortion cost.
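A minimal sketch of this mode decision, assuming costs are plain floats; mode labels such as "2Nx2N" are illustrative, not taken from the patent text.

```python
def best_partition_mode(pu_costs_per_mode):
    """pu_costs_per_mode maps a partition mode name to a list holding, for
    each prediction unit of that mode, the PU's RD costs over its candidate
    reference frames. A mode's cost is the sum of its PUs' minimum costs;
    the mode with the smallest total wins."""
    best_mode, best_cost = None, float("inf")
    for mode, pu_costs in pu_costs_per_mode.items():
        total = sum(min(costs) for costs in pu_costs)
        if total < best_cost:
            best_mode, best_cost = mode, total
    return best_mode, best_cost
```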
In some embodiments, the encoding module may include: a target reference frame determination unit, a current reference unit determination unit, a target residual determination unit, and a target residual encoding unit.
The target reference frame determination unit is configured to obtain the target reference frame corresponding to each prediction unit in the partition mode with the minimum rate-distortion cost. The current reference unit determination unit is configured to generate a current reference unit of the current coding unit from the target reference frames. The target residual determination unit is configured to generate a target motion vector and a target residual of the current coding unit from the current reference unit. The target residual encoding unit is configured to encode the target motion vector and the target residual, thereby encoding the current coding unit.
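A minimal sketch of this final step, treating blocks as flat lists of sample values; transform, quantization, and entropy coding of the (motion vector, residual) pair are omitted, and all names are illustrative.

```python
def encode_current_unit(current_block, reference_block, motion_vector):
    """The target residual is the element-wise difference between the
    current unit and its motion-compensated reference block; the
    (motion vector, residual) pair is what is ultimately written to the
    bitstream after transform, quantization, and entropy coding."""
    residual = [cur - ref for cur, ref in zip(current_block, reference_block)]
    return {"motion_vector": motion_vector, "residual": residual}
```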
An embodiment of the present disclosure provides an electronic device, including: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the video encoding method of any of the above.
The disclosed embodiments provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements a video encoding method as described in any of the above.
The video encoding method and apparatus, electronic device, and computer-readable storage medium provided by the embodiments of the present disclosure adaptively generate the candidate reference frames of the current prediction unit of the current coding unit in the current partition mode according to a first reference frame of a first prediction unit adjacent to the current coding unit, a second reference frame of a second prediction unit of a parent or child coding unit of the current coding unit, and a third reference frame of a third prediction unit of the current coding unit in a target partition mode. Predicting the current prediction unit with candidate reference frames determined in this way, on the one hand, reduces the processing complexity of determining the target reference frame with the minimum rate-distortion cost from the candidates, thereby reducing the complexity of video encoding; on the other hand, because the target reference frame is chosen from candidate reference frames that are more similar to the current prediction unit, the quality of video encoding is improved and the decoded video can accurately restore the pre-encoding video.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. The drawings described below are merely some embodiments of the present disclosure, and other drawings may be derived from those drawings by those of ordinary skill in the art without inventive effort.
Fig. 1 is a schematic diagram illustrating an exemplary system architecture of a video encoding method or a video encoding apparatus applied to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating a computer system applied to a video encoding apparatus according to an exemplary embodiment.
Fig. 3 is a schematic diagram of a video coding framework shown in accordance with an example embodiment.
Fig. 4 is a diagram illustrating partitioning of a coding tree unit by a quadtree according to an example embodiment.
Fig. 5 is a diagram illustrating partitioning of a coding tree unit by a quadtree according to an example embodiment.
FIG. 6 illustrates various partitions of a coding unit according to an example embodiment.
FIG. 7 is a diagram illustrating a comparison of a current coding unit with multiple reference frames, according to an example embodiment.
Fig. 8 is a flow chart illustrating a method of video encoding according to an example embodiment.
Fig. 9 is a diagram illustrating a method of determining a first prediction unit and a second prediction unit according to an example embodiment.
Fig. 10 is a diagram illustrating a method of determining a first prediction unit and a second prediction unit according to an example embodiment.
Fig. 11 is a diagram illustrating a method of determining a first prediction unit and a second prediction unit according to an example embodiment.
FIG. 12 is a diagram illustrating a sequence of execution of various partitioning approaches according to an example embodiment.
Fig. 13 is a flow chart illustrating a method of video encoding according to an example embodiment.
Fig. 14 is a flow chart illustrating a method of video encoding according to an example embodiment.
Fig. 15 is a flow chart illustrating a method of video encoding according to an example embodiment.
Fig. 16 is a flow chart illustrating a method of video encoding according to an example embodiment.
Fig. 17 is a flowchart of step S05 in fig. 8 in an exemplary embodiment.
FIG. 18 is a diagram illustrating a method of determining candidate reference frames for a current coding unit according to an example embodiment.
FIG. 19 is a diagram illustrating a method of determining candidate reference frames for a current coding unit according to an example embodiment.
Fig. 20 is a flowchart of step S05 in fig. 8 in an exemplary embodiment.
Fig. 21 illustrates a video encoding method according to an example embodiment.
Fig. 22 is a flowchart of step S104 in fig. 21 in an exemplary embodiment.
Fig. 23 is a block diagram illustrating a video encoding apparatus according to an example embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, and thus, a repetitive description thereof will be omitted.
The described features, structures, or characteristics of the disclosure may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the disclosure.
The drawings are merely schematic illustrations of the present disclosure, in which the same reference numerals denote the same or similar parts, and thus, a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and steps, nor do they necessarily have to be performed in the order described. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
In this specification, the terms "a", "an", "the", "said" and "at least one" are used to indicate the presence of one or more elements/components/etc.; the terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements/components/etc. other than the listed elements/components/etc.; the terms "first," "second," and "third," etc. are used merely as labels, and are not limiting on the number of their objects.
The following detailed description of exemplary embodiments of the disclosure refers to the accompanying drawings.
Fig. 1 shows a schematic diagram of an exemplary system architecture of a video encoding method or a video encoding apparatus to which the embodiments of the present disclosure can be applied.
As shown in fig. 1, the system architecture 100 may include electronic devices 101, 102, 103, a network 104, and a server 105. The network 104 is used to provide a medium for communication links between the electronic devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use the electronic devices 101, 102, 103 to interact with the server 105 over the network 104 to receive or send messages and the like. The electronic devices 101, 102, 103 may be various electronic devices having display screens and supporting web browsing, including but not limited to smartphones, tablets, laptop computers, desktop computers, wearable devices, virtual reality devices, smart home devices, and the like.
The server 105 may be a server that provides various services, for example, a background management server that provides support for applications operated by users on the electronic devices 101, 102, 103. The background management server can analyze and otherwise process received data, such as requests, and feed the processing results back to the electronic devices.
The server 105 may, for example, obtain a current coding unit of a current frame image in the target video; server 105 may, for example, obtain a first reference frame for a first prediction unit adjacent to the current coding unit, the first prediction unit having completed encoding; server 105 may, for example, determine an inter-coding unit for the current coding unit, the inter-coding unit including a second prediction unit, obtain a second reference frame for the second prediction unit, the second prediction unit having completed encoding; server 105 may, for example, obtain a third reference frame for a third prediction unit of the current coding unit in the target partition mode, the third prediction unit having completed encoding; server 105 may determine, for example, according to the first reference frame, the second reference frame, and the third reference frame, a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode; server 105 may, for example, determine a first rate-distortion cost for the current prediction unit relative to the candidate reference frame; the server 105 may encode the current coding unit of the current frame image, for example, according to the first rate-distortion cost.
It should be understood that the numbers of electronic devices, networks, and servers in fig. 1 are merely illustrative. The server 105 may be a single physical server or a cluster of multiple servers, and there may be any number of electronic devices, networks, and servers according to actual needs.
In addition, the technical solution provided by the embodiment of the present disclosure may also be completed through Cloud computing (Cloud computing) in Cloud technology (Cloud technology), which is not limited by the present disclosure.
The cloud technology is a hosting technology for unifying series resources such as hardware, software, network and the like in a wide area network or a local area network to realize the calculation, storage, processing and sharing of data.
The cloud technology is based on the general names of network technology, information technology, integration technology, management platform technology, application technology and the like applied in the cloud computing business model, can form a resource pool, is used as required, and is flexible and convenient. Cloud computing technology will become an important support. Background services of the technical network system require a large amount of computing and storage resources, such as video websites, picture-like websites and more web portals. With the high development and application of the internet industry, each article may have its own identification mark and needs to be transmitted to a background system for logic processing, data in different levels are processed separately, and various industrial data need strong system background support and can only be realized through cloud computing.
Cloud computing refers to a mode of delivery and use of IT (Internet Technology) infrastructure, and refers to obtaining required resources through a network in an on-demand, easily extensible manner; the generalized cloud computing refers to a delivery and use mode of a service, and refers to obtaining a required service in an on-demand and easily-extensible manner through a network. Such services may be IT and software, internet related, or other services. Cloud Computing is a product of development and fusion of traditional computers and Network Technologies, such as Grid Computing (Grid Computing), distributed Computing (distributed Computing), Parallel Computing (Parallel Computing), Utility Computing (Utility Computing), Network Storage (Network Storage Technologies), Virtualization (Virtualization), Load balancing (Load Balance), and the like.
With the diversification of the internet, real-time data streams, and connected devices, and driven by demands such as search services, social networks, mobile commerce, and open collaboration, cloud computing has developed rapidly. Unlike earlier parallel distributed computing, the emergence of cloud computing will conceptually drive revolutionary change in the entire internet model and in enterprise management.
Referring now to FIG. 2, shown is a block diagram of a computer system 200 suitable for implementing the electronic device of an embodiment of the present application. The electronic device shown in fig. 2 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in fig. 2, the computer system 200 includes a Central Processing Unit (CPU) 201 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 202 or a program loaded from a storage section 208 into a Random Access Memory (RAM) 203. In the RAM 203, various programs and data necessary for the operation of the system 200 are also stored. The CPU 201, ROM 202, and RAM 203 are connected to each other via a bus 204. An input/output (I/O) interface 205 is also connected to bus 204.
The following components are connected to the I/O interface 205: an input portion 206 including a keyboard, a mouse, and the like; an output section 207 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN card, a modem, or the like. The communication section 209 performs communication processing via a network such as the internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 210 as necessary, so that a computer program read out therefrom is installed into the storage section 208 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 209 and/or installed from the removable medium 211. The above-described functions defined in the system of the present application are executed when the computer program is executed by the Central Processing Unit (CPU) 201.
It should be noted that the computer readable storage medium shown in the present application can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including but not limited to electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules and/or units described in the embodiments of the present application may be implemented by software or hardware. The described modules and/or units may also be provided in a processor, which may be described as: a processor including a transmitting unit, an obtaining unit, a determining unit, and a first processing unit. The names of these modules and/or units do not, in some cases, constitute a limitation on the modules and/or units themselves.
As another aspect, the present application also provides a computer-readable storage medium, which may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable storage medium carries one or more programs which, when executed by a device, cause the device to perform functions including: acquiring a current coding unit of a current frame image in a target video; acquiring a first reference frame of a first prediction unit adjacent to the current coding unit, wherein the first prediction unit is coded; determining an inter-coding unit of the current coding unit, wherein the inter-coding unit comprises a second prediction unit, a second reference frame of the second prediction unit is obtained, and the second prediction unit is coded; acquiring a third reference frame of a third prediction unit of the current coding unit in a target segmentation mode, wherein the third prediction unit is coded; determining a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current segmentation mode according to the first reference frame, the second reference frame and the third reference frame; determining a first rate-distortion cost of the current prediction unit relative to the candidate reference frame, so as to encode the current coding unit of the current frame image according to the first rate-distortion cost.
In the field of video transmission, a video to be transmitted needs to be encoded in order to realize its transmission. Video coding refers to converting a file in one video format into a file in another video format through a specific compression technique. The main codec standards in video streaming include H.261, H.263, H.264, and HEVC (High Efficiency Video Coding).
Fig. 3 explains video coding by taking the HEVC video coding framework as an example.
First, the abbreviations used in FIG. 3 need to be explained: DB, short for Deblocking filter; SAO, short for Sample Adaptive Offset (adaptive pixel compensation); ME, short for Motion Estimation; MC, short for Motion Compensation.
Video coding can be divided into intra-frame coding and inter-frame coding. In the video coding framework shown in fig. 3, a current frame image may be sent to the encoder, and a predicted value is obtained after intra prediction or inter prediction; the predicted value is subtracted from the input data to obtain a residual; DCT (Discrete Cosine Transform) and quantization are then performed to obtain residual coefficients; and the residual coefficients are sent to the entropy coding module to output a code stream, completing the video coding. Meanwhile, after inverse quantization and inverse transformation of the residual coefficients, the residual value of the reconstructed image is obtained; this residual value is then added to the intra-frame or inter-frame predicted value to obtain a reconstructed image; the reconstructed image enters the reference frame queue after in-loop filtering and serves as a reference image for subsequent frames, so that coding proceeds frame by frame.
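The coding loop just described can be illustrated with scalar arithmetic. This is only a sketch: the real transform is a two-dimensional DCT, entropy coding and filtering are omitted, and `encode_block` and `q_step` are hypothetical names.

```python
# Schematic of the fig. 3 coding loop: residual = input - prediction;
# coefficients come from (a stand-in for) transform + quantization; the
# reconstruction path inverts quantization/transform and re-adds the
# prediction, so the encoder's reference matches what a decoder would see.

def encode_block(block, prediction, q_step):
    residual = [x - p for x, p in zip(block, prediction)]
    coeffs = [round(r / q_step) for r in residual]   # stand-in for DCT + quant
    recon_residual = [c * q_step for c in coeffs]    # inverse quant + transform
    reconstructed = [p + r for p, r in zip(prediction, recon_residual)]
    return coeffs, reconstructed

# With q_step = 2 the toy residuals [2, 4] survive quantization exactly,
# so the reconstructed block equals the input block.
coeffs, recon = encode_block([10, 20], [8, 16], 2)
```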
In the field of video coding, a frame of a picture may be divided into a plurality of Coding Tree Units (CTUs), and one CTU may be divided downward in a quadtree manner. As shown in fig. 4, a CTU may be divided into a plurality of CUs (Coding Units) in a quadtree manner, and each leaf node (each number in fig. 4 represents a leaf node) may be regarded as a coding unit. Rendering the above division process as an image yields the partition effect shown in fig. 5 (each number may represent one coding unit).
In some embodiments, one CTU may be divided into a plurality of coding units according to the quadtree division depth and the minimum Rate Distortion cost (RD cost) corresponding to each coding unit.
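The quadtree division can be sketched as follows. This is a minimal illustration, not the patent's implementation; `should_split` is a hypothetical callback standing in for the RD-cost-based split decision.

```python
# Minimal sketch of CTU quadtree division: each node either stays a leaf
# coding unit or is split into four equal sub-CUs, down to a maximum depth.
# `should_split` is a hypothetical stand-in for the RD-cost split decision.

def split_quadtree(x, y, size, depth, max_depth, should_split):
    """Return the leaf coding units as (x, y, size, depth) tuples."""
    if depth == max_depth or not should_split(x, y, size, depth):
        return [(x, y, size, depth)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += split_quadtree(x + dx, y + dy, half,
                                     depth + 1, max_depth, should_split)
    return leaves

# Split a 64x64 CTU one level deep: four 32x32 CUs at depth 1.
leaves = split_quadtree(0, 0, 64, 0, 3, lambda x, y, s, d: d < 1)
```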
For example, in the inter-coding process, the CTUs may be divided by the following steps.
Step one: traverse all partition modes corresponding to inter-frame coding for the coding unit a of size 64 × 64 at depth 0, obtaining the optimal partition mode and the rate-distortion cost Ra at depth 0.
Step two: perform one CU division on a to obtain four sub-CUs b0, b1, b2, and b3 at coding depth 1, and traverse all inter-frame partition modes for the coding unit b0 to obtain its optimal partition mode and rate-distortion cost Rb0.
Step three: perform a further CU division on b0 to obtain four sub-CUs c0, c1, c2, and c3 at coding depth 2, and traverse all partition modes corresponding to inter-frame coding for the coding unit c0 to obtain its optimal partition mode and rate-distortion cost Rc0.
Step four: perform a further CU division on c0 to obtain four sub-CUs d0, d1, d2, and d3 at coding depth 3, which is the maximum coding depth, so no further CU division is possible. Traverse the partition modes of d0, d1, d2, and d3 in turn to obtain their optimal partition modes and rate-distortion costs Rd0, Rd1, Rd2, and Rd3. Compute the sum of the rate-distortion costs of d0 to d3 and compare it with Rc0; the smaller value is taken as the minimum rate-distortion cost of c0 (denoted Min-Rc0), and the division and partition mode corresponding to it is the optimal one for c0. If Rc0 is less than or equal to the sum of the rate-distortion costs of d0 to d3, the quadtree split of c0 is discarded; if Rc0 is greater than that sum, the quadtree split of c0 is retained.
Step five: following step four, perform division and partition-mode selection for c1, c2, and c3 in turn, obtaining their optimal partition modes and minimum rate-distortion costs Min-Rc1, Min-Rc2, and Min-Rc3. Compute the sum of the minimum rate-distortion costs of the four CUs at the current coding depth and compare it with Rb0 to obtain the smaller rate-distortion cost (denoted Min-Rb0); the division and partition mode corresponding to the smaller cost is the optimal one for b0.
Step six: according to steps two to five, perform division and partition-mode selection for b1, b2, and b3 in turn, obtaining their optimal partition modes and minimum rate-distortion costs Min-Rb1, Min-Rb2, and Min-Rb3. Compute the sum of the minimum rate-distortion costs of the four CUs at the current coding depth and compare it with Ra to obtain the smaller rate-distortion cost (denoted Min-Ra), thereby finding the optimal division and partition mode of the CTU.
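Steps one to six amount to a recursive comparison: at each depth, the parent CU's own cost is compared with the sum of its four children's best costs, and the cheaper option is kept. A minimal sketch, with `rd_cost` a hypothetical stand-in for the partition-mode traversal at each level:

```python
# Recursive CU split decision from steps one to six: keep the quadtree
# split of a CU only when the four children's combined minimum RD cost is
# lower than the parent's own RD cost. `rd_cost(size, depth)` stands in
# for traversing all partition modes of one CU.

def best_rd_cost(size, depth, max_depth, rd_cost):
    parent_cost = rd_cost(size, depth)        # e.g. Ra, Rb0, Rc0, ...
    if depth == max_depth:                    # maximum coding depth reached
        return parent_cost
    children_sum = sum(best_rd_cost(size // 2, depth + 1, max_depth, rd_cost)
                       for _ in range(4))
    return min(parent_cost, children_sum)     # Min-Rc0, Min-Rb0, Min-Ra, ...
```

With a toy cost model where a CU's cost equals its side length, splitting never pays off and the 64 × 64 CTU keeps cost 64; with a model that makes sub-CUs cheap, the split is retained.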
This CTU-based division ensures that regions of the image with larger changes are divided more finely and regions with smaller changes more coarsely, which saves division resources, improves coding efficiency, avoids information loss, and ensures the accuracy of video coding.
In some embodiments, a CU may be further divided into PUs (Prediction Units) in certain partition modes, and the minimum rate-distortion cost of the PUs in each partition mode is obtained, so that the minimum rate-distortion cost of the CU and its optimal partition mode are determined from the minimum rate-distortion costs of the PUs.
In some embodiments, the current CU may be divided into PUs in the seven partition modes shown in fig. 6, but the disclosure is not limited thereto. The seven partition modes can be classified into three categories according to the partition effect: 2N × 2N can be regarded as the global partition mode, 2N × N and N × 2N as symmetric partition modes, and 2N × nU, 2N × nD, nL × 2N, and nR × 2N as asymmetric partition modes. Here 2N × 2N is the size of the current coding unit; U, D, L, and R are positive integers smaller than N, which may be set according to actual requirements; and N is a positive integer.
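The seven partition modes can be illustrated by listing the PU dimensions they produce for a 2N × 2N CU. This sketch assumes the asymmetric offset is N/2 (the 1:3 split used by HEVC's asymmetric motion partitions); the patent leaves U, D, L, and R configurable.

```python
# PU dimensions (width, height) of the seven partition modes for a CU of
# side two_n. The asymmetric offset is assumed to be N/2 (HEVC's 1:3 AMP
# split); in the patent, U, D, L, R are configurable positive integers < N.

def pu_partitions(two_n, mode):
    n = two_n // 2
    off = n // 2  # assumed asymmetric split position
    return {
        "2Nx2N": [(two_n, two_n)],
        "2NxN":  [(two_n, n), (two_n, n)],
        "Nx2N":  [(n, two_n), (n, two_n)],
        "2NxnU": [(two_n, off), (two_n, two_n - off)],
        "2NxnD": [(two_n, two_n - off), (two_n, off)],
        "nLx2N": [(off, two_n), (two_n - off, two_n)],
        "nRx2N": [(two_n - off, two_n), (off, two_n)],
    }[mode]

MODES = ["2Nx2N", "2NxN", "Nx2N", "2NxnU", "2NxnD", "nLx2N", "nRx2N"]
# Across all seven modes a CU yields 1 + 6 * 2 = 13 prediction units.
```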
In the related art, an encoder may configure a plurality of reference frames for a current coding unit, and respectively calculate rate distortion costs of each prediction unit of the current coding unit in each partition mode relative to the plurality of reference frames, so as to further determine a minimum rate distortion cost of the current coding unit, an optimal partition mode corresponding to the minimum rate distortion cost, and a target reference frame corresponding to each PU in the optimal partition mode.
In inter-frame prediction, whether unidirectional or bidirectional prediction is performed, each prediction unit in each of the seven partition modes traverses the plurality of reference frames configured by the encoder to obtain the corresponding rate-distortion costs, and the reference frame corresponding to the minimum rate-distortion cost is then selected as the target reference frame.
As shown in fig. 7, if the encoder configures eight reference frames (N-4 to N+4) for the current coding unit (located in the current frame), and the current coding unit can be divided according to the seven partition modes (into 13 PUs in total), at least 13 × 8 rate-distortion cost calculations are required to determine the minimum rate-distortion cost of the current coding unit and the corresponding optimal coding mode.
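The 13 × 8 figure follows directly from the PU counts of the seven modes:

```python
# Brute-force search size in the related art: one PU for 2Nx2N plus two
# PUs for each of the other six modes gives 13 PUs, each tested against
# all eight configured reference frames.
pus_per_mode = {"2Nx2N": 1, "2NxN": 2, "Nx2N": 2,
                "2NxnU": 2, "2NxnD": 2, "nLx2N": 2, "nRx2N": 2}
total_pus = sum(pus_per_mode.values())   # 13
rd_evaluations = total_pus * 8           # 13 x 8 = 104 RD cost calculations
```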
Obviously, the above method has the following drawbacks.
First, if the reference frame configured by the encoder for the current coding unit does not include the target in the current coding unit, the encoding effect may be poor when the current coding unit is encoded through the reference frame.
Secondly, the number of reference frames configured by the encoder is too many, wherein redundancy may exist, and the encoding efficiency is reduced.
In practical operation, under a bidirectional 4-reference-frame configuration, the reference frame module accounts for about 40% of the computation of the whole video coding process, and this proportion grows as the number of reference frames increases, greatly increasing the computational load.
The embodiment of the disclosure provides a video coding method, which can improve video coding efficiency and improve video coding effect.
Fig. 8 is a flow chart illustrating a method of video encoding according to an example embodiment. The method provided by the embodiment of the present disclosure may be processed by any electronic device with computing processing capability, for example, the server 105 and/or the electronic devices 102 and 103 in the embodiment of fig. 1 described above, and in the following embodiment, the server 105 is taken as an execution subject for example, but the present disclosure is not limited thereto.
Referring to fig. 8, a video encoding method provided by an embodiment of the present disclosure may include the following steps.
In step S01, the current coding unit of the current frame image in the target video is acquired.
In some embodiments, when encoding a target video, a current frame image needs to be obtained from the target video, and the current frame image is divided by a quadtree to obtain a current encoding unit.
It is to be understood that the current coding unit may be a coding unit or a coding block that needs to be coded in any video protocol (including but not limited to HEVC protocol), which is not limited by this disclosure.
In step S02, a first reference frame of a first prediction unit adjacent to the current coding unit is obtained, the first prediction unit having completed coding.
As shown in fig. 9 or fig. 10, E is the current prediction unit of the current coding unit G. The current coding unit may be partitioned in the 2N × 2N mode to obtain the current prediction unit E shown in fig. 9, or in another partition mode to obtain the current prediction unit E shown in fig. 10; that is, the current prediction unit may coincide with the current coding unit or be a part of it (fig. 10 is only an example and does not limit the effect), which is not limited by this disclosure.
As shown in fig. 9, A, B, C, D, and so on may be prediction units that are spatially adjacent (edge-adjacent or corner-adjacent) to the current coding unit G and have already been coded; such an already-coded adjacent prediction unit may serve as the first prediction unit. The edges of A, B, C, and D may be longer or shorter than the edges of the current coding unit G, which is not limited by this disclosure.
The encoded prediction unit may have acquired a target reference frame corresponding to a minimum rate-distortion cost using a video inter-prediction method and completed prediction based on the target reference frame.
Therefore, the first reference frame mentioned in this disclosure may be a target reference frame corresponding to the minimum rate distortion cost of each first prediction unit.
It is understood that the target reference frames of different first prediction units may be the same or different, and the disclosure is not limited thereto.
In step S03, an inter-coding unit of the current coding unit is determined, the inter-coding unit including a second prediction unit; a second reference frame of the second prediction unit is obtained, the second prediction unit having completed coding.
In some embodiments, in the process of inter-coding a video, it is generally required to divide the coding tree unit according to a quadtree, and to calculate the minimum rate-distortion cost of the current coding unit and the sum of the minimum rate-distortion costs of its sub-coding units respectively, in order to determine whether the current coding unit needs to be divided.
In some embodiments, the inter-coding unit of the current coding unit may refer to a parent coding unit of the current coding unit or a child coding unit of the current coding unit in the coding tree unit, and may also refer to a grandparent coding unit of the current coding unit or a grandchild coding unit of the current coding unit, which is not limited by the present disclosure.
In some embodiments, the rate-distortion cost calculation may be performed from parent to child through a quadtree of coding units. That is, in determining the target reference frame of the current coding unit, the target reference frame of the second prediction unit (which may be one or more) corresponding to the parent coding unit of the current coding unit may have been determined. The second prediction unit corresponding to the parent coding unit of the current coding unit may be a prediction unit obtained by 2N × 2N division, but the disclosure does not limit this.
As shown in fig. 9 or fig. 10, F may be a prediction unit corresponding to a parent coding unit of the current coding unit G (a prediction unit obtained by dividing the parent coding unit through a 2N × 2N division mode, but the present disclosure is not limited to the 2N × 2N division mode).
In some embodiments, the rate-distortion cost calculation may be performed from child to parent through a quadtree of coding units. As shown in fig. 11, when determining candidate reference frames for the current prediction unit of the current coding unit G, the target reference frame of the second prediction unit (which may be one or more) corresponding to the sub-coding units a, b, c, d of the current coding unit G may already be determined. The second prediction unit corresponding to the sub coding unit of the current coding unit may be a prediction unit obtained by 2N × 2N division, but the disclosure does not limit this.
In step S04, a third reference frame of a third prediction unit of the current coding unit in the target partition mode is obtained.
In some embodiments, the current prediction unit may be a prediction unit obtained by dividing the current coding unit according to the current division mode.
In some embodiments, when the current coding unit is coded, it may be divided in the order shown in fig. 12, and the rate-distortion cost of the current coding unit in each partition mode is then calculated, so that the partition mode with the smallest rate-distortion cost is taken as the optimal partition mode. In some embodiments, the seven partition modes shown in fig. 12 may be divided into three broad classes according to the partition effect: 2N × 2N may be considered the global partition mode, 2N × N and N × 2N symmetric partition modes, and 2N × nU, 2N × nD, nL × 2N, and nR × 2N asymmetric partition modes.
In some embodiments, the partition modes within the symmetric class (2N × N, N × 2N) may be executed in any order, and likewise the partition modes within the asymmetric class (2N × nU, 2N × nD, nL × 2N, and nR × 2N) may be executed in any order.
In some embodiments, the current partition mode may be any one of the partition modes shown in fig. 12, and the target partition modes may be all partition modes ranked before the current partition mode, for which the minimum rate-distortion cost of each prediction unit of the current coding unit and the corresponding target reference frame have already been determined. For example, if the current partition mode is 2N × N, the target partition modes may include 2N × 2N and N × 2N.
In some embodiments, the third reference frame of the third prediction unit is the target reference frame corresponding to the minimum rate-distortion cost already determined for that prediction unit in the target partition mode.
In step S05, a candidate reference frame corresponding to the current prediction unit of the current coding unit in the current partition mode is determined according to the first reference frame, the second reference frame, and the third reference frame.
In some embodiments, the first reference frame, the second reference frame, and the third reference frame may be used as candidate reference frames for the current coding unit.
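A sketch of step S05 under this reading: the candidate set is the deduplicated union of the target reference frames already decided for the first, second, and third prediction units. The function name and frame indices below are illustrative.

```python
# Step S05 sketch: merge the reference frames of spatial neighbours
# (first), the parent/child CU's PUs (second), and earlier partition
# modes of this CU (third) into one deduplicated, order-preserving
# candidate list for the current prediction unit.

def candidate_reference_frames(first, second, third):
    seen, candidates = set(), []
    for frame in list(first) + list(second) + list(third):
        if frame not in seen:
            seen.add(frame)
            candidates.append(frame)
    return candidates
```

For example, `candidate_reference_frames([0, 1], [1, 2], [0, 3])` yields `[0, 1, 2, 3]`, typically far fewer frames than the encoder's full reference configuration.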
In step S06, a first rate-distortion cost of the current prediction unit with respect to the candidate reference frame is determined, so as to encode the current coding unit of the current frame image according to the first rate-distortion cost.
In some embodiments, the rate-distortion costs (i.e., the first rate-distortion costs) of the current prediction unit with respect to the respective candidate reference frames may be calculated, and the candidate reference frame with the smallest first rate-distortion cost is then used as the target reference frame of the current prediction unit; that smallest first rate-distortion cost is the minimum rate-distortion cost of the current prediction unit.
The minimum rate-distortion cost of each prediction unit of the current coding unit in the current partition mode can be obtained through the above method; the minimum rate-distortion costs of the prediction units are then added up as the minimum rate-distortion cost of the current coding unit in the current partition mode; next, the minimum rate-distortion cost of the current coding unit in each partition mode is obtained, and the partition mode corresponding to the smallest of these is taken as the optimal partition mode of the current coding unit. Through the optimal partition mode of the current coding unit and the target reference frame corresponding to each prediction unit in that mode, the prediction of the current coding unit, the estimation of motion vectors, and the determination of residual coefficients can be completed, realizing the coding of the current coding unit.
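The selection just described can be sketched as follows, with `rd_cost(pu, frame)` a hypothetical stand-in for the encoder's rate-distortion computation and `toy_cost` an illustrative cost model:

```python
# Step S06 sketch: per PU, keep the candidate frame with the smallest
# first rate-distortion cost; per partition mode, sum the per-PU minima;
# the mode with the smallest sum is the optimal partition mode.

def select_target_frame(pu, candidates, rd_cost):
    costs = {frame: rd_cost(pu, frame) for frame in candidates}
    target = min(costs, key=costs.get)
    return target, costs[target]          # (target frame, min RD cost)

def best_partition_mode(mode_to_pus, candidates, rd_cost):
    mode_costs = {mode: sum(select_target_frame(pu, candidates, rd_cost)[1]
                            for pu in pus)
                  for mode, pus in mode_to_pus.items()}
    return min(mode_costs, key=mode_costs.get)

# Illustrative toy cost: distance between a PU's index and a frame index.
def toy_cost(pu, frame):
    return abs(pu - frame)
```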
In the technical solution provided by this embodiment, the candidate reference frames of the current prediction unit are determined from the first reference frame of the first prediction unit adjacent to the current coding unit, the second reference frame of the second prediction unit of the parent or child coding unit, and the third reference frame of the third prediction unit determined for the current coding unit in the target partition mode. On the one hand, the candidate reference frames determined in this way have a high probability of containing the same subject as the current prediction unit, so the target reference frame of the current prediction unit can be determined efficiently, accurately, and adaptively from the candidates, improving the compression rate of video coding; on the other hand, the number of candidate reference frames so determined is small, which greatly saves computing resources and improves video coding efficiency.
Fig. 13 illustrates a video encoding method according to an example embodiment.
In some embodiments, inter-frame coding may include the seven partition modes shown in fig. 12, which may be further divided into three categories according to the partition effect, namely the global partition mode, the symmetric partition modes, and the asymmetric partition modes; within each category the partition modes may be executed in any order, or directly in the order shown in fig. 12, which is not limited by the present disclosure.
In some embodiments, the current partition mode may be the global partition mode (e.g., the 2N × 2N partition mode).
Referring to fig. 13, the above-described video encoding method may include the following steps.
In step S01, the current coding unit of the current frame image in the target video is acquired.
In step S02, a first reference frame of a first prediction unit adjacent to the current coding unit is obtained, the first prediction unit having completed coding.
In step S03, an inter-coding unit of the current coding unit is determined, the inter-coding unit including a second prediction unit, a second reference frame of the second prediction unit is obtained, and the second prediction unit is completely encoded.
In step S051, a candidate reference frame of the current prediction unit of the current coding unit in the global partition mode is generated according to the first reference frame and the second reference frame.
In some embodiments, since the global partition mode is generally executed first, the current coding unit may not yet contain any prediction unit whose target reference frame has been determined before the global partition mode is executed; therefore, candidate reference frames for the current prediction unit may be generated directly from the first reference frame and the second reference frame.
In some embodiments, if the current coding unit has already been divided according to a target partition mode before the global partition mode is executed, and the target reference frame corresponding to a third prediction unit of that target partition mode has been obtained, the target reference frame of the third prediction unit may also be used as a candidate reference frame, depending on the practical requirements of the skilled person; the present disclosure is not limited in this respect.
In step S052, according to the frame type of the current frame image, the front k1 reference frames or the rear k2 reference frames of the current frame image are taken as candidate reference frames of the current prediction unit of the current coding unit in the global partition mode, k1 and k2 both being integers greater than or equal to 1.
In actual operation, the probabilities of the global, symmetric, and asymmetric partition modes being selected are about 50%, 20%, and 10%, respectively. The global partition mode is therefore selected with the highest probability, but the number of candidate reference frames generated from the first reference frame and the second reference frame is relatively small, so the candidate set can be enhanced according to the frame type of the current frame containing the current coding unit.
Frame types here refer to P frames (inter-prediction coded frames), B frames (bidirectional-prediction coded frames), GBP frames (Generalized P frames), and so on in the inter-coding process. During video coding, P frames and GBP frames may serve as reference frames for many other frame images, contributing to many frames, so they are especially important in inter-frame coding. The candidate reference frames of a prediction unit within a P frame or GBP frame therefore need to be enhanced to ensure the prediction unit can find the optimal reference frame.
In some embodiments, if the frame type of the current frame image is a P frame or a GBP frame, the two frames before or the two frames after the current frame image may be added to the candidate reference frames; if the frame type is a B frame, the two frames before or the one frame after the current frame image may be added to the candidate reference frames. The present disclosure is not limited thereto.
The two frames before the current frame image may refer to the two frames temporally adjacent to and preceding the current frame image, or to the last two images in the reference frame queue. During video coding, each time a frame image is encoded, it may be added to the reference frame queue as a reference frame, so that it can serve as a reference frame for other frames.
Assuming the reference frame queue contains frames in the coding order 1, 9, 5, 3, 2, 4, 7, 6, 8, the two frames before the current frame image may refer to the frames numbered 6 and 8, and the two frames after the current frame image may refer to the frames numbered 1 and 9, but the disclosure is not limited thereto.
It is to be understood that the candidate reference frame of the current prediction unit of the current coding unit in the global partition mode may also be generated directly from the first reference frame and the second reference frame without performing enhancement processing thereon, which is not limited by the present disclosure.
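The enhancement rule of step S052 can be illustrated with a minimal Python sketch. The function name `enhance_candidates`, the frame-type labels, and the list representation of the reference frame queue are illustrative assumptions, not part of the disclosed method.

```python
def enhance_candidates(candidates, frame_type, prev_frames, next_frames):
    """Augment a candidate reference-frame list based on the frame type.

    P/GBP frames: add up to two preceding or two following frames.
    B frames:     add up to two preceding or one following frame.
    """
    enhanced = list(candidates)
    if frame_type in ("P", "GBP"):
        extra = prev_frames[-2:] + next_frames[:2]
    elif frame_type == "B":
        extra = prev_frames[-2:] + next_frames[:1]
    else:
        extra = []
    for f in extra:
        if f not in enhanced:  # avoid duplicate candidates
            enhanced.append(f)
    return enhanced
```

For the coding-order queue 1, 9, 5, 3, 2, 4, 7, 6, 8 discussed above, `prev_frames[-2:]` selects the frames numbered 6 and 8 as the two frames before the current frame image.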
In step S06, a first rate-distortion cost of the current prediction unit with respect to the candidate reference frame is determined, so as to encode the current coding unit of the current frame image according to the first rate-distortion cost.
In the technical solution provided by this embodiment, the target reference frames of the first prediction unit and the second prediction unit, which with high probability contain the same target object as the current prediction unit, are used as candidate reference frames of the current prediction unit. The importance of the current partition mode in inter prediction and the importance of the frame type of the current frame image in video coding are fully considered, candidate reference frames that with high probability contain the same target object are determined for the current prediction unit, and video coding is performed based on those candidate reference frames, which can improve both coding speed and coding quality.
Fig. 14 illustrates a video encoding method according to an example embodiment.
In some embodiments, the current partition mode may be any one of the symmetric partition modes shown in fig. 12, and the target partition mode may include the global partition mode.
Referring to fig. 14, the above-described video encoding method may include the following steps.
In step S01, the current coding unit of the current frame image in the target video is acquired.
In step S02, a first reference frame of a first prediction unit adjacent to the current coding unit is obtained, the first prediction unit having completed coding.
In step S03, an inter-coding unit of the current coding unit is determined, the inter-coding unit including a second prediction unit, and a second reference frame of the second prediction unit is obtained, the second prediction unit having completed encoding.
In step S04, a third reference frame of a third prediction unit of the current coding unit in the global partition mode is obtained, the third prediction unit having completed coding.
In some embodiments, the global partition mode (2N × 2N) may already have been executed before the current partition mode, and the target reference frame (i.e., the third reference frame) of the prediction unit corresponding to the 2N × 2N partition mode may also have been determined.
In step S053, the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partition mode are used as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
In step S06, a first rate-distortion cost of the current prediction unit with respect to the candidate reference frame is determined, so as to encode the current coding unit of the current frame image according to the first rate-distortion cost.
In the technical solution provided by this embodiment, the first reference frame, the second reference frame, and the third reference frame of the third prediction unit of the current coding unit in the global partition mode, all of which are highly likely to contain the same target object as the current prediction unit, are used as candidate reference frames of the current prediction unit, and the rate-distortion cost and the target reference frame are determined from these candidates. This can greatly reduce the number of comparisons, save computing resources, and improve the compression rate and the video coding quality.
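As a hedged illustration of steps S053 and S06, the following Python sketch merges the three sources of reference frames into one deduplicated candidate set and selects the candidate with the smallest rate-distortion cost. `select_target_frame` and the `rd_cost` callable are illustrative stand-ins for the encoder's real rate-distortion computation.

```python
def select_target_frame(first_refs, second_refs, third_refs, rd_cost):
    """Merge the three reference-frame sources and pick the lowest-cost one."""
    candidates = []
    for f in (*first_refs, *second_refs, *third_refs):
        if f not in candidates:  # deduplicate while keeping insertion order
            candidates.append(f)
    # the target reference frame is the candidate with minimum RD cost
    return min(candidates, key=rd_cost)
```

The sketch shows why a small candidate set helps: the `min` scan is linear in the number of candidates, so restricting candidates to frames likely to contain the same target object directly reduces the number of cost evaluations.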
Fig. 15 illustrates a video encoding method according to an example embodiment.
In some embodiments, the current partition mode may be a symmetric partition mode, and the target partition mode may include the global partition mode and a predicted partition mode of the symmetric partition modes.
In some embodiments, the symmetric partition modes may include at least two partition modes, and these partition modes may be executed in an arbitrary order, so that when one partition mode is being executed, another may already have been executed. A partition mode executed before the current mode may therefore be a predicted partition mode. It is noted that a predicted partition mode has not necessarily completed prediction; rather, it has completed the partitioning into prediction units and the determination of the target reference frame of each prediction unit.
Referring to fig. 15, the above-described video encoding method may include the following steps.
In step S01, the current coding unit of the current frame image in the target video is acquired.
In step S02, a first reference frame of a first prediction unit adjacent to the current coding unit is obtained, the first prediction unit having completed coding.
In step S03, an inter-coding unit of the current coding unit is determined, the inter-coding unit including a second prediction unit, and a second reference frame of the second prediction unit is obtained, the second prediction unit having completed encoding.
In step S04, a third reference frame of a third prediction unit of the current coding unit in the global partition mode and the predicted partition mode of the symmetric partition modes is obtained, the third prediction unit having completed coding.
In step S054, the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partition mode and the predicted partition mode are used as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
In step S06, a first rate-distortion cost of the current prediction unit with respect to the candidate reference frame is determined, so as to encode the current coding unit of the current frame image according to the first rate-distortion cost.
The technical solution provided by this embodiment considers not only the target reference frame of the prediction unit in the global partition mode, which is executed before the symmetric partition modes, but also the target reference frame of the prediction unit in a symmetric partition mode executed before the current partition mode, thereby ensuring both an adequate search range for the target reference frame and the efficiency of video coding.
Fig. 16 illustrates a video encoding method according to an example embodiment.
In some embodiments, the current partition mode may be an asymmetric partition mode, and the target partition mode may include the global partition mode and all of the symmetric partition modes.
Referring to fig. 16, the above-described video encoding method may include the following steps.
In step S01, the current coding unit of the current frame image in the target video is acquired.
In step S02, a first reference frame of a first prediction unit adjacent to the current coding unit is obtained, the first prediction unit having completed coding.
In step S03, an inter-coding unit of the current coding unit is determined, the inter-coding unit including a second prediction unit, and a second reference frame of the second prediction unit is obtained, the second prediction unit having completed encoding.
In step S04, a third reference frame of a third prediction unit of the current coding unit in the target partition mode is obtained, the third prediction unit having completed coding.
In step S055, the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partition mode and the symmetric partition mode are used as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
In some embodiments, the asymmetric partition modes may include at least four partition modes, and these partition modes may be executed in an arbitrary order, so that when one partition mode is being executed, the others may already have been executed. Accordingly, a partition mode executed before the current mode may be a predicted partition mode among the asymmetric partition modes. It is noted that a predicted partition mode has not necessarily completed prediction; rather, it has completed the partitioning into prediction units and the determination of the target reference frame of each prediction unit.
Therefore, the third reference frame of the third prediction unit in the predicted partition mode in the asymmetric partition mode may also be used as a candidate reference frame of the current prediction unit of the current coding unit in the current partition mode.
In step S06, a first rate-distortion cost of the current prediction unit with respect to the candidate reference frame is determined, so as to encode the current coding unit of the current frame image according to the first rate-distortion cost.
The technical solution provided in this embodiment adaptively determines candidate reference frames with extremely high similarity for the prediction unit corresponding to each of the asymmetric partition modes, which is beneficial to improving the coding speed and the coding quality.
Fig. 17 is a flowchart of step S05 in fig. 8 in an exemplary embodiment.
In some embodiments, the current partition mode may be an asymmetric partition mode shown on the right side of fig. 18 or fig. 19 (fig. 18 takes 2N × nU as an example and fig. 19 takes nL × 2N as an example, but the disclosure is not limited thereto). The current coding unit includes a first current prediction unit in the asymmetric partition mode (e.g., prediction unit 0 in the 2N × nU partition mode in fig. 18 or prediction unit 0 in the nL × 2N partition mode in fig. 19). The target partition mode may include the global partition mode and a symmetric partition mode, and the current coding unit may include a first target prediction unit in the symmetric partition mode (e.g., unit 0 obtained in the 2N × N partition mode in fig. 18 or unit 0 obtained in the N × 2N partition mode in fig. 19).
Referring to fig. 17, the above-mentioned step S05 may include the following steps.
In step S056, a target first reference frame with the smallest rate distortion cost is determined from the candidate reference frames of the prediction unit of the current coding unit in the global partition mode.
In step S057, a target second reference frame with the smallest rate-distortion cost is determined from the candidate reference frames of the first target prediction unit.
In fig. 18, a target second reference frame with the smallest rate distortion cost corresponding to the first target prediction unit (for example, the prediction unit corresponding to 0) of the current coding unit in the 2N × N partition mode may be obtained.
In fig. 19, a target second reference frame with the smallest rate distortion cost corresponding to the first target prediction unit (0 prediction unit) of the current coding unit in the N × 2N partition mode may be obtained.
In step S058, the target first reference frame and the target second reference frame are used as candidate reference frames corresponding to the first current prediction unit of the current coding unit in the asymmetric partition mode.
As shown in fig. 18 or fig. 19, the target first reference frame in the global partition mode may be merged with the target second reference frame in the symmetric partition mode to determine the candidate reference frames of the first current prediction unit. It is to be understood that there should be a spatial correspondence between the first current prediction unit and the first target prediction unit. For example, if the first current prediction unit is on the left side of the current coding unit, the first target prediction unit should also be on the left side of the current coding unit; if the first current prediction unit is on the upper side of the current coding unit, the first target prediction unit should also be on the upper side of the current coding unit.
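Steps S056 to S058 can be sketched as follows. `best_frame` and `asymmetric_pu_candidates` are hypothetical names, and `rd_cost` stands in for the encoder's rate-distortion computation.

```python
def best_frame(candidates, rd_cost):
    """Target reference frame = candidate with minimum rate-distortion cost."""
    return min(candidates, key=rd_cost)

def asymmetric_pu_candidates(global_candidates, symmetric_pu_candidates, rd_cost):
    """Build the candidate set for one PU of an asymmetric partition.

    global_candidates:       candidates of the 2Nx2N (global) prediction unit
    symmetric_pu_candidates: candidates of the spatially corresponding PU of
                             the symmetric partition (e.g. PU 0 of 2NxN)
    """
    target_first = best_frame(global_candidates, rd_cost)         # S056
    target_second = best_frame(symmetric_pu_candidates, rd_cost)  # S057
    # S058: the two targets form the candidate set for the asymmetric PU
    if target_first == target_second:
        return [target_first]
    return [target_first, target_second]
```

The same routine applies to the second current prediction unit by passing the candidates of the second target prediction unit (e.g. PU 1 of 2N × N), which yields the merge of the target first and target third reference frames described later.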
In the technical solution provided by this embodiment, considering that the probability of an asymmetric partition mode being selected is not high among the seven partition modes, in order to save computing resources the candidate reference frames of the first current prediction unit in the asymmetric partition mode can be determined from the prediction unit in the global partition mode and the prediction unit in the symmetric partition mode. This ensures a high similarity between the target object in the candidate reference frames and the target object in the first current prediction unit, reduces the amount of computation, and improves the efficiency of video coding.
Fig. 20 is a flowchart of step S05 in fig. 8 in an exemplary embodiment.
In some embodiments, the current coding unit may further include a second current prediction unit in the asymmetric partition mode (e.g., prediction unit 1 in the 2N × nU partition mode in fig. 18 or prediction unit 1 in the nL × 2N partition mode in fig. 19), and the current coding unit may further include a second target prediction unit in the symmetric partition mode (e.g., unit 1 obtained in the 2N × N partition mode in fig. 18 or unit 1 obtained in the N × 2N partition mode in fig. 19).
Referring to fig. 20, the above-mentioned step S05 may include the following steps.
In step S059, a target third reference frame with the smallest rate-distortion cost is determined from the candidate reference frames of the second target prediction unit.
In fig. 18, a target third reference frame with the smallest rate-distortion cost corresponding to the second target prediction unit (prediction unit 1) of the current coding unit in the 2N × N partition mode may be obtained.
In fig. 19, a target third reference frame with the smallest rate-distortion cost corresponding to the second target prediction unit (prediction unit 1) of the current coding unit in the N × 2N partition mode may be obtained.
In step S0510, the target first reference frame and the target third reference frame are used as candidate reference frames corresponding to the second current prediction unit of the current coding unit in the asymmetric partition mode.
As shown in fig. 18 or fig. 19, the target first reference frame in the global partition mode may be merged with the target third reference frame in the symmetric partition mode to determine the candidate reference frames of the second current prediction unit. It is to be understood that there should be a spatial correspondence between the second current prediction unit and the second target prediction unit. For example, if the second current prediction unit is on the right side of the current coding unit, the second target prediction unit should also be on the right side of the current coding unit; if the second current prediction unit is on the lower side of the current coding unit, the second target prediction unit should also be on the lower side of the current coding unit.
In the technical solution provided by this embodiment, considering that the probability of an asymmetric partition mode being selected is not high among the seven partition modes, in order to save computing resources the candidate reference frames of the second current prediction unit in the asymmetric partition mode can be determined from the prediction unit in the global partition mode and the prediction unit in the symmetric partition mode. This ensures a high similarity between the target object in the candidate reference frames and the target object in the second current prediction unit, reduces the amount of computation, and improves the efficiency of video coding.
Fig. 21 illustrates a video encoding method according to an example embodiment.
Referring to fig. 21, the above-described video encoding method may include the following steps.
In step S101, the minimum rate-distortion cost of each prediction unit of the coding unit in each partition mode is obtained.
In some embodiments, the current coding unit may correspond to multiple partition modes, each partition mode may correspond to at least one prediction unit, and each prediction unit may in turn have multiple candidate reference frames. The minimum rate-distortion cost of each prediction unit can be determined in turn.
In step S102, a minimum rate-distortion cost corresponding to each partition mode is determined according to the minimum rate-distortion cost of each prediction unit.
In some embodiments, the minimum rate-distortion cost corresponding to each prediction unit and the target reference frame corresponding to that minimum rate-distortion cost may first be determined; the sum of the minimum rate-distortion costs of the prediction units in a partition mode is then obtained as the minimum rate-distortion cost of that partition mode.
In step S103, the partition mode with the smallest rate distortion cost in the current coding unit is determined according to the smallest rate distortion cost of each partition mode.
In step S104, the current coding unit is coded according to the partition mode with the minimum rate-distortion cost.
In some embodiments, encoding the current coding unit according to the partition mode with the minimum rate-distortion cost may include: taking the partition mode with the minimum rate-distortion cost as the optimal partition mode of the current coding unit; splicing the target reference frames of the prediction units obtained in the optimal partition mode to form the reference unit of the current coding unit; and determining a motion vector and a residual of the current coding unit relative to the target reference unit, so that coding of the current coding unit can be completed according to the motion vector and the residual.
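The mode-selection loop of steps S101 to S104 can be sketched as follows, assuming per-PU candidate costs are available as dictionaries mapping a reference frame to its rate-distortion cost; `choose_partition_mode` and the data layout are illustrative, not part of the disclosed method.

```python
def choose_partition_mode(pu_costs):
    """pu_costs: {mode: [per-PU dict {ref_frame: cost}]} -> (best_mode, total).

    S101: minimum cost of each PU over its candidate reference frames.
    S102: sum of the PU minima gives the mode's minimum rate-distortion cost.
    S103: the mode with the smallest total is selected.
    """
    mode_cost = {}
    for mode, pus in pu_costs.items():
        mode_cost[mode] = sum(min(candidates.values()) for candidates in pus)
    best = min(mode_cost, key=mode_cost.get)
    return best, mode_cost[best]
```

For example, a 2N × N split whose two PUs cost 3 and 5 (total 8) would be preferred over a 2N × 2N mode whose single PU costs 10.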
Fig. 22 is a flowchart of step S104 in fig. 21 in an exemplary embodiment.
Referring to fig. 22, the above-described step S104 may include the following steps.
In step S1041, a target reference frame corresponding to each prediction unit in the partition mode with the minimum rate-distortion cost is obtained.
In some embodiments, the current coding unit may be divided by the partition mode with the smallest rate-distortion cost to obtain one prediction unit or a plurality of prediction units, which is not limited in this disclosure.
In step S1042, a current reference unit of the current coding unit is generated according to the target reference frame.
In some embodiments, if one prediction unit is obtained by partitioning the current coding unit with the partition mode having the minimum rate-distortion cost, the target reference frame corresponding to that prediction unit is the current reference unit; if at least two prediction units are obtained by partitioning the current coding unit with the partition mode having the minimum rate-distortion cost, the target reference frames corresponding to the at least two prediction units are spliced to obtain the current reference unit of the current coding unit.
In step S1043, a target motion vector and a target residual of the current coding unit are generated according to the current reference unit.
In step S1044, the target motion vector and the target residual are encoded to encode the current coding unit.
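Steps S1041 to S1044 can be illustrated with a simplified sketch in which each block is a flat list of samples; the splicing and the residual computation are stand-ins for the encoder's real prediction pipeline, and all names are illustrative.

```python
def encode_current_cu(current_block, pu_reference_blocks, motion_vector):
    """Simplified stand-in for steps S1042-S1044.

    current_block:       samples of the current coding unit
    pu_reference_blocks: target reference block of each prediction unit
    motion_vector:       target motion vector of the coding unit
    """
    # S1042: one PU -> its reference is the CU reference; several PUs -> splice
    if len(pu_reference_blocks) == 1:
        reference = list(pu_reference_blocks[0])
    else:
        reference = [s for block in pu_reference_blocks for s in block]
    # S1043: target residual = current samples minus the spliced reference
    residual = [c - r for c, r in zip(current_block, reference)]
    # S1044: the bitstream carries the motion vector and the residual
    return {"mv": motion_vector, "residual": residual}
```

A real encoder would transform, quantize, and entropy-code the residual; the dictionary return value merely marks which quantities step S1044 encodes.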
In the technical solution provided by this embodiment, candidate reference frames highly similar to the target object in the current prediction unit are determined through the first prediction unit adjacent to the current coding unit, the second prediction unit of the parent coding unit or the child coding unit of the current coding unit, and the prediction units that have already been predicted in the current coding unit. These candidate reference frames are not only highly similar to the current prediction unit but also small in number, so the coding quality can be guaranteed while the coding efficiency is improved.
Fig. 23 is a block diagram illustrating a video encoding apparatus according to an example embodiment. Referring to fig. 23, a video encoding apparatus 2300 provided by an embodiment of the present disclosure may include: a current encoding unit acquiring module 2301, a first reference frame acquiring module 2302, a second reference frame acquiring module 2303, a third reference frame acquiring module 2304, a candidate reference frame determining module 2305, a rate-distortion cost determining module 2306, and an encoding unit 2307.
The current coding unit acquiring module 2301 may be configured to acquire a current coding unit of a current frame image in a target video. The first reference frame acquiring module 2302 may be configured to acquire a first reference frame of a first prediction unit adjacent to the current coding unit, the first prediction unit having completed encoding. The second reference frame acquiring module 2303 may be configured to determine an inter-coding unit of the current coding unit, the inter-coding unit including a second prediction unit, acquire a second reference frame of the second prediction unit, the second prediction unit having completed encoding. The third reference frame acquiring module 2304 may be configured to acquire a third reference frame of a third prediction unit of the current coding unit in the target partition mode, the third prediction unit having been encoded. The candidate reference frame determining module 2305 may be configured to determine a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode according to the first reference frame, the second reference frame and the third reference frame. The rate-distortion cost determination module 2306 may be configured to determine a first rate-distortion cost of the current prediction unit relative to the candidate reference frame. The encoding unit 2307 may be configured to encode a current encoding unit of the current frame image according to the first rate-distortion cost.
In some embodiments, the current partition mode is the global partition mode.
In some embodiments, the candidate reference frame determination module 2305 may include: a candidate reference frame determining unit and a candidate reference frame enhancing unit in the global segmentation mode.
Wherein the candidate reference frame determination unit in the global partition mode may be configured to generate candidate reference frames of the current prediction unit of the current coding unit in the global partition mode according to the first reference frame and the second reference frame. The candidate reference frame enhancement unit may be configured to take the k1 reference frames before or the k2 reference frames after the current frame image as candidate reference frames of the current prediction unit of the current coding unit in the global partition mode according to the frame type of the current frame image, where k1 and k2 are both integers greater than or equal to 1.
In some embodiments, the current partition mode is a symmetric partition mode, and the target partition mode comprises the global partition mode.
In some embodiments, the candidate reference frame determination module 2305 may include: a first candidate reference frame determination unit in the symmetric partition mode.
Wherein the first candidate reference frame determining unit in the symmetric partition mode may be configured to use the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partition mode as candidate reference frames of a current prediction unit of the current coding unit in the current partition mode.
In some embodiments, the current partition mode is a symmetric partition mode, and the target partition mode comprises the global partition mode and a predicted partition mode of the symmetric partition modes.
In some embodiments, the candidate reference frame determination module 2305 may include: a second candidate reference frame determination unit in the symmetric partition mode.
Wherein the second candidate reference frame determination unit in the symmetric partition mode may be configured to take the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partition mode and the predicted partition mode as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
In some embodiments, the current partition mode is an asymmetric partition mode, and the target partition mode includes the global partition mode and the symmetric partition modes.
In some embodiments, the candidate reference frame determination module 2305 may include: and a first candidate reference frame determining unit in an asymmetric division mode.
Wherein the first candidate reference frame determining unit in the asymmetric partitioning mode may be configured to use the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partitioning mode and the symmetric partitioning mode as candidate reference frames of a current prediction unit of the current coding unit in the current partitioning mode.
In some embodiments, the current partition mode is an asymmetric partition mode, the current coding unit comprises a first current prediction unit in the asymmetric partition mode, the target partition mode comprises the global partition mode and a symmetric partition mode, and the current coding unit comprises a first target prediction unit in the symmetric partition mode.
In some embodiments, the candidate reference frame determination module 2305 may include: the device comprises a target first reference frame determining unit, a target second reference frame determining unit and a second candidate reference frame determining unit under an asymmetric division mode.
Wherein the target first reference frame determination unit may be configured to determine a target first reference frame with a smallest rate-distortion cost from among the candidate reference frames of the prediction unit of the current coding unit in the global partition mode. The target second reference frame determination unit may be configured to determine a target second reference frame having a smallest rate-distortion cost from among the candidate reference frames of the first target prediction unit. The second candidate reference frame determination unit in the asymmetric partitioning mode may be configured to use the target first reference frame and the target second reference frame as candidate reference frames corresponding to the first current prediction unit of the current coding unit in the asymmetric partitioning mode.
In some embodiments, the current coding unit further comprises a second current prediction unit in the asymmetric partition mode, the current coding unit further comprises a second target prediction unit in the symmetric partition mode.
In some embodiments, the candidate reference frame determination module 2305 may include: a third reference frame determination unit and a third candidate reference frame determination unit in an asymmetric partitioning mode.
Wherein the third reference frame determination unit may be configured to determine a target third reference frame with the minimum rate-distortion cost from the candidate reference frames of the second target prediction unit. The third candidate reference frame determination unit in the asymmetric partition mode may be configured to use the target first reference frame and the target third reference frame as candidate reference frames corresponding to the second current prediction unit of the current coding unit in the asymmetric partition mode.
In some embodiments, the video encoding apparatus may include: a prediction unit minimum rate-distortion cost determination module, a per-partition-mode minimum rate-distortion cost determination module, a partition mode determination module with minimum rate-distortion cost, and an encoding module.
Wherein the prediction unit minimum rate-distortion cost determination module may be configured to obtain the minimum rate-distortion cost of each prediction unit of the coding unit in each partition mode. The per-partition-mode minimum rate-distortion cost determination module may be configured to determine the minimum rate-distortion cost corresponding to each partition mode according to the minimum rate-distortion cost of each prediction unit. The partition mode determination module with minimum rate-distortion cost may be configured to determine the partition mode with the minimum rate-distortion cost in the current coding unit according to the minimum rate-distortion cost of each partition mode. The encoding module may be configured to encode the current coding unit according to the partition mode with the minimum rate-distortion cost.
In some embodiments, the encoding module may include: a target reference frame determining unit, a current reference unit determining unit, a target residual determining unit, and a target residual encoding unit.
The target reference frame determination unit may be configured to acquire a target reference frame corresponding to each prediction unit in the partition mode with the smallest rate-distortion cost. The current reference unit determination unit may be configured to generate a current reference unit of the current coding unit from the target reference frame. The target residual determination unit may be configured to generate a target motion vector and a target residual of the current coding unit from the current reference unit. The target residual encoding unit may be configured to encode the target motion vector and the target residual to encode the current encoding unit.
Since each functional module of the video encoding apparatus 2300 of the exemplary embodiment of the present disclosure corresponds to the steps of the exemplary embodiment of the video encoding method described above, it is not repeated herein.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution of the embodiment of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computing device (which may be a personal computer, a server, a mobile terminal, or a smart device, etc.) to execute the method according to the embodiment of the present disclosure, such as one or more of the steps shown in fig. 8.
Furthermore, the above-described figures are merely schematic illustrations of the processes included in methods according to exemplary embodiments of the present disclosure and are not intended to be limiting. It will be readily understood that the processes shown in these figures do not indicate or limit the chronological order in which they are performed. It is also readily understood that these processes may be performed synchronously or asynchronously, for example in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the disclosure is not limited to the details of construction, the arrangements of the drawings, or the manner of implementation set forth herein; on the contrary, it is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (12)

1. A video encoding method, comprising:
acquiring a current coding unit of a current frame image in a target video;
acquiring a first reference frame of a first prediction unit adjacent to the current coding unit, wherein the first prediction unit has been coded;
determining an inter-coding unit of the current coding unit, the inter-coding unit comprising a second prediction unit, and acquiring a second reference frame of the second prediction unit, wherein the second prediction unit has been coded;
acquiring a third reference frame of a third prediction unit of the current coding unit in a target partition mode;
determining a candidate reference frame corresponding to a current prediction unit of the current coding unit in a current partition mode according to the first reference frame, the second reference frame, and the third reference frame;
determining a first rate-distortion cost for the current prediction unit relative to the candidate reference frame;
and coding the current coding unit of the current frame image according to the first rate distortion cost.
2. The method of claim 1, wherein the current partition mode is a global partition mode; determining a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode includes:
generating a candidate reference frame of a current prediction unit of the current coding unit in the global partition mode according to the first reference frame and the second reference frame;
and taking, according to the frame type of the current frame image, the first k1 forward reference frames or the first k2 backward reference frames of the current frame image as candidate reference frames of the current prediction unit of the current coding unit in the global partition mode, wherein k1 and k2 are integers greater than or equal to 1.
3. The method of claim 1, wherein the current partition mode is a symmetric partition mode, and the target partition mode comprises a global partition mode; determining a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode includes:
and taking the first reference frame, the second reference frame, and a third reference frame of a third prediction unit of the current coding unit in the global partition mode as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
4. The method of claim 1, wherein the current partition mode is a symmetric partition mode, and the target partition mode comprises a global partition mode and a predicted partition mode of the symmetric partition mode; determining a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode includes:
and using the first reference frame, the second reference frame and a third reference frame of a third prediction unit of the current coding unit in the global partition mode and the predicted partition mode as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
5. The method of claim 1, wherein the current partition mode is an asymmetric partition mode, and the target partition mode comprises a global partition mode and a symmetric partition mode; determining a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode includes:
and taking the first reference frame, the second reference frame and a third reference frame of a third prediction unit of the current coding unit in the global partition mode and the symmetric partition mode as candidate reference frames of the current prediction unit of the current coding unit in the current partition mode.
6. The method of claim 1, wherein the current partition mode is an asymmetric partition mode, wherein the current coding unit comprises a first current prediction unit in the asymmetric partition mode, wherein the target partition mode comprises a global partition mode and a symmetric partition mode, and wherein the current coding unit comprises a first target prediction unit in the symmetric partition mode; determining a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode includes:
determining a target first reference frame with the minimum rate distortion cost from candidate reference frames of a prediction unit of the current coding unit in the global partition mode;
determining a target second reference frame with the minimum rate distortion cost from the candidate reference frames of the first target prediction unit;
and taking the target first reference frame and the target second reference frame as candidate reference frames corresponding to the first current prediction unit of the current coding unit in the asymmetric partition mode.
7. The method of claim 6, wherein the current coding unit further comprises a second current prediction unit in the asymmetric partition mode, and wherein the current coding unit further comprises a second target prediction unit in the symmetric partition mode; wherein determining the candidate reference frame corresponding to the current prediction unit of the current coding unit in the current partition mode further comprises:
determining a target third reference frame with the minimum rate distortion cost from the candidate reference frames of the second target prediction unit;
and taking the target first reference frame and the target third reference frame as candidate reference frames corresponding to the second current prediction unit of the current coding unit in the asymmetric partition mode.
8. The method of any of claims 1 to 7, further comprising:
acquiring the minimum rate distortion cost of each prediction unit of the current coding unit in each partition mode;
determining the minimum rate distortion cost corresponding to each partition mode according to the minimum rate distortion cost of each prediction unit;
determining the partition mode with the minimum rate distortion cost in the current coding unit according to the minimum rate distortion cost of each partition mode;
and coding the current coding unit according to the partition mode with the minimum rate-distortion cost.
9. The method of claim 8, wherein encoding the current coding unit according to the partition mode with the smallest rate-distortion cost comprises:
acquiring a target reference frame corresponding to each prediction unit of the current coding unit in the partition mode with the minimum rate distortion cost;
generating a current reference unit of the current coding unit according to the target reference frame corresponding to each prediction unit;
generating a target motion vector and a target residual of the current coding unit according to the current reference unit;
and coding the current coding unit according to the target motion vector and the target residual.
10. A video encoding apparatus, comprising:
the current coding unit acquisition module is configured to acquire a current coding unit of a current frame image in a target video;
a first reference frame obtaining module configured to obtain a first reference frame of a first prediction unit adjacent to the current coding unit, the first prediction unit having completed coding;
a second reference frame obtaining module configured to determine an inter-coding unit of the current coding unit, the inter-coding unit including a second prediction unit, obtain a second reference frame of the second prediction unit, the second prediction unit having completed coding;
a third reference frame obtaining module configured to obtain a third reference frame of a third prediction unit of the current coding unit in the target partition mode, wherein the third prediction unit is coded;
a candidate reference frame determining module configured to determine, according to the first reference frame, the second reference frame, and the third reference frame, a candidate reference frame corresponding to a current prediction unit of a current coding unit in a current partition mode;
a rate-distortion cost determination module configured to determine a first rate-distortion cost of the current prediction unit relative to the candidate reference frame;
and the coding module is configured to code the current coding unit of the current frame image according to the first rate distortion cost.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
12. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-9.
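The flow of claim 1 can be illustrated with a small sketch (not part of the claims; all names and the toy cost table are illustrative assumptions): candidate reference frames for the current prediction unit are gathered from the already-coded neighboring prediction unit, the inter-coding unit's prediction unit, and a prediction unit of the same coding unit in an already-evaluated target partition mode, and the rate-distortion cost is then evaluated only over this reduced candidate set.

```python
def candidate_reference_frames(first_refs, second_refs, third_refs):
    """Union of the three reference-frame sources, deduplicated, order-stable."""
    seen, candidates = set(), []
    for ref in [*first_refs, *second_refs, *third_refs]:
        if ref not in seen:
            seen.add(ref)
            candidates.append(ref)
    return candidates

def best_reference_frame(candidates, rd_cost):
    # Evaluate the rate-distortion cost only over the candidate set,
    # rather than over every available reference frame.
    return min(candidates, key=rd_cost)

cands = candidate_reference_frames(
    first_refs=["ref0"],           # from the coded neighboring prediction unit
    second_refs=["ref1", "ref0"],  # from the inter-coding unit's prediction unit
    third_refs=["ref2"],           # from the target-partition-mode prediction unit
)
costs = {"ref0": 90.0, "ref1": 70.0, "ref2": 80.0}
best = best_reference_frame(cands, costs.get)
```

Restricting the search to such a candidate set is what allows the method to skip the full reference-frame sweep that an exhaustive encoder would perform.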
CN202010336379.9A 2020-04-26 2020-04-26 Video encoding method, video encoding device, electronic device, and computer-readable storage medium Active CN111263151B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010336379.9A CN111263151B (en) 2020-04-26 2020-04-26 Video encoding method, video encoding device, electronic device, and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN111263151A true CN111263151A (en) 2020-06-09
CN111263151B CN111263151B (en) 2020-08-25

Family

ID=70951677

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010336379.9A Active CN111263151B (en) 2020-04-26 2020-04-26 Video encoding method, video encoding device, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN111263151B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112312131A (en) * 2020-12-31 2021-02-02 腾讯科技(深圳)有限公司 Inter-frame prediction method, device, equipment and computer readable storage medium
CN112633122A (en) * 2020-12-17 2021-04-09 厦门大学 Front-end mileage calculation method and system of monocular VIO system
WO2023045666A1 (en) * 2021-09-26 2023-03-30 腾讯科技(深圳)有限公司 Method and apparatus for selecting reference frame, device, and medium

Citations (14)

Publication number Priority date Publication date Assignee Title
US20130148721A1 (en) * 2011-12-07 2013-06-13 Cisco Technology, Inc. Reference Frame Management for Screen Content Video Coding Using Hash or Checksum Functions
US20130315311A1 (en) * 2010-08-18 2013-11-28 Sk Telecom Co., Ltd. Image encoding/decoding device and method, and reference picture indexing device and method
CN103813166A (en) * 2014-01-28 2014-05-21 浙江大学 Low-complexity method for selecting HEVC coding multiple reference frames
EP2802146A1 (en) * 2011-12-16 2014-11-12 JVC Kenwood Corporation Dynamic image encoding device, dynamic image encoding method, dynamic image encoding program, dynamic image decoding device, dynamic image decoding method, and dynamic image decoding program
CN104602019A (en) * 2014-12-31 2015-05-06 乐视网信息技术(北京)股份有限公司 Video coding method and device
CN108353166A (en) * 2015-11-19 2018-07-31 韩国电子通信研究院 Method and apparatus for encoding/decoding image
CN108989799A (en) * 2017-06-02 2018-12-11 阿里巴巴集团控股有限公司 A kind of selection method, device and the electronic equipment of coding unit reference frame
CN109672894A (en) * 2017-10-13 2019-04-23 腾讯科技(深圳)有限公司 A kind of inter-frame prediction method, device and storage medium
CN109688407A (en) * 2017-10-18 2019-04-26 北京金山云网络技术有限公司 Reference block selection method, device, electronic equipment and the storage medium of coding unit
CN109905702A (en) * 2017-12-11 2019-06-18 腾讯科技(深圳)有限公司 The method, apparatus and storage medium that reference information determines in a kind of Video coding
CN110149512A (en) * 2018-09-14 2019-08-20 腾讯科技(深圳)有限公司 Inter-prediction accelerated method, control device, electronic device, computer storage medium and equipment
CN110198440A (en) * 2018-03-29 2019-09-03 腾讯科技(深圳)有限公司 Encode the determination of predictive information and the method, apparatus of Video coding
CN110572645A (en) * 2018-06-05 2019-12-13 北京字节跳动网络技术有限公司 Asymmetric weighted bidirectional prediction Merge
CN110839155A (en) * 2018-08-17 2020-02-25 北京金山云网络技术有限公司 Method and device for motion estimation, electronic equipment and computer-readable storage medium


Non-Patent Citations (2)

Title
SANG-HYO PARK ET AL.: "Low complexity reference frame selection in QTBT structure for JVET future video coding", 《2018 INTERNATIONAL WORKSHOP ON ADVANCED IMAGE TECHNOLOGY (IWAIT)》 *
金鹏: "HEVC视频编码快速算法研究", 《中国优秀硕士学位论文全文数据库(信息科技辑)》 *

Cited By (5)

Publication number Priority date Publication date Assignee Title
CN112633122A (en) * 2020-12-17 2021-04-09 厦门大学 Front-end mileage calculation method and system of monocular VIO system
CN112633122B (en) * 2020-12-17 2024-01-23 厦门大学 Front-end mileage calculation method and system of monocular VIO system
CN112312131A (en) * 2020-12-31 2021-02-02 腾讯科技(深圳)有限公司 Inter-frame prediction method, device, equipment and computer readable storage medium
CN112312131B (en) * 2020-12-31 2021-04-06 腾讯科技(深圳)有限公司 Inter-frame prediction method, device, equipment and computer readable storage medium
WO2023045666A1 (en) * 2021-09-26 2023-03-30 腾讯科技(深圳)有限公司 Method and apparatus for selecting reference frame, device, and medium

Also Published As

Publication number Publication date
CN111263151B (en) 2020-08-25

Similar Documents

Publication Publication Date Title
CN111263151B (en) Video encoding method, video encoding device, electronic device, and computer-readable storage medium
Emmons et al. Cracking open the dnn black-box: Video analytics with dnns across the camera-cloud boundary
US20170374379A1 (en) Picture prediction method and related apparatus
CN111741298B (en) Video coding method and device, electronic equipment and readable storage medium
CN111757106B (en) Method and apparatus for coding a current block in a video stream using multi-level compound prediction
US10674159B2 (en) Effective intra encoding for screen data
US10681374B2 (en) Diversified motion using multiple global motion models
CN111757118B (en) Video transcoding processing method, device, equipment and medium
US20230023369A1 (en) Video processing method, video processing apparatus, smart device, and storage medium
US20220191477A1 (en) Coding mode selection method and apparatus, and electronic device and computer-readable medium
WO2022143215A1 (en) Inter-frame prediction method and apparatus, electronic device, computer-readable storage medium, and computer program product
Huang et al. A fast intra coding algorithm for HEVC by jointly utilizing naive Bayesian and SVM
Joy et al. Deep Learning Based Video Compression Techniques with Future Research Issues
US11582453B2 (en) Multi-model selection for neural network based tools in video coding
US20220124311A1 (en) Neural network based coefficient sign prediction
KR20210140821A (en) L-type compartmentalized tree
CN111885378B (en) Multimedia data encoding method, apparatus, device and medium
CN115702565A (en) Improved cross component intra prediction mode
CN102948147A (en) Video rate control based on transform-coefficients histogram
CN114731400A (en) Independent constraint directional enhancement filter
Kuang et al. Fast HEVC to SCC transcoding based on decision trees
US20220224924A1 (en) Hierarchical structure for neural network based tools in video coding
CN110519594A (en) Method for video coding and device
CN112492305B (en) Data processing method and device and computer readable storage medium
CN116980616A (en) Mode decision scheduling method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40023589

Country of ref document: HK