CN111901591A - Method, device, server and storage medium for determining coding mode

Info

Publication number: CN111901591A
Application number: CN202010739370.2A
Authority: CN
Inventors: 麻莉雅
Original assignee: You Peninsula Beijing Information Technology Co ltd
Current assignee: You Peninsula Beijing Information Technology Co ltd
Priority date: 2020-07-28
Filing date: 2020-07-28
Publication date: 2020-11-06
Anticipated expiration: 2040-07-28
Also published as: CN111901591B

Abstract

The embodiments of the invention disclose a method, an apparatus, a server and a storage medium for determining a coding mode. The method comprises: determining an adaptive prediction mode and a depth division indication of a target coding unit at each coding depth, based on reference coding parameters of each mapping coding unit of the target coding unit in a corresponding reference video frame at that coding depth; and determining, from the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication, a target coding mode that optimizes the coding cost of the target coding unit. The technical solution greatly reduces the coding overhead of video coding: the reference coding parameters of the reference video frame are used to guide the coding of the target coding unit at other code rates, which reduces the coding complexity and improves the coding efficiency of the target coding unit while maintaining its coding quality.

Description

Method, device, server and storage medium for determining coding mode

Technical Field

The embodiments of the invention relate to the technical field of video transcoding, and in particular to a method, an apparatus, a server and a storage medium for determining a coding mode.

Background

With the rapid development of internet technology and growing user demand for high-definition video, the volume of video data exchanged as multimedia resources keeps increasing. Video compression coding is therefore commonly used to remove redundant information from video data, so that the data can be transmitted quickly over the internet and stored offline. To suit the bandwidth and device conditions of different viewers, the server transcodes the source video stream at different resolutions and different code rates and distributes the transcoded video to the corresponding viewers for download. The existing transcoding process has three parts: decoding the source stream, scaling to the resolution specified for transcoding, and re-encoding. At a given transcoding resolution, higher coding complexity yields better coding quality but takes longer, and long coding times cause playback to stall for the viewer. The coding speed therefore needs to be increased, so as to reduce the time consumed by video transcoding, while still ensuring high-quality coding.

At present, fast transcoding of a source video stream at different code rates is generally achieved with High Efficiency Video Coding (HEVC). In HEVC, each coding block in a bidirectionally predicted frame or a forward predicted frame has multiple candidate predictive coding modes: INTRA_2Nx2N, INTRA_NxN and Pulse Code Modulation (PCM) under intra-predictive coding; INTER_2Nx2N, INTER_2NxN, INTER_Nx2N, INTER_NxN, INTER_2NxnU, INTER_2NxnD, INTER_nLx2N and INTER_nRx2N, determined according to the block partition, under inter-predictive coding; and the Merge mode and the skip mode, also under inter-predictive coding.

Therefore, when the same source video stream is transcoded into multiple outputs, for each coding unit at each recursion depth the encoder must traverse every predictive coding mode available at that depth, calculating the rate-distortion cost of the coding unit in each mode and the rate-distortion cost of each transform unit into which the coding unit is divided in that mode, and then select the mode with the minimum cost as the optimal coding mode for the coding unit. Because the rate-distortion cost of every coding unit must be calculated in every predictive coding mode at every recursion depth, the computational burden is heavy and the coding overhead of video transcoding increases.

Disclosure of Invention

The embodiments of the invention provide a method, an apparatus, a server and a storage medium for determining a coding mode, which reduce the computational overhead of video coding while keeping the coding efficient.

In a first aspect, an embodiment of the present invention provides a method for determining a coding mode, where the method includes:

determining an adaptive prediction mode and a depth division indication of a target coding unit at each coding depth based on reference coding parameters of each mapping coding unit of the target coding unit in a corresponding reference video frame at each coding depth;

and determining, from the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication, a target coding mode that optimizes the coding cost of the target coding unit.

In a second aspect, an embodiment of the present invention provides an apparatus for determining an encoding mode, where the apparatus includes:

a coding adaptation module, configured to determine an adaptive prediction mode and a depth division indication of a target coding unit at each coding depth, based on reference coding parameters of each mapping coding unit of the target coding unit in a corresponding reference video frame at that coding depth;

and a coding mode determining module, configured to determine, from the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication, a target coding mode that optimizes the coding cost of the target coding unit.

In a third aspect, an embodiment of the present invention provides a server, where the server includes:

one or more processors;

storage means for storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors implement the method for determining the encoding mode according to any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for determining an encoding mode according to any embodiment of the present invention.

In the method, apparatus, server and storage medium for determining a coding mode provided by the embodiments of the invention, video transcoding at multiple code rates exploits the fact that a reference video frame has already been encoded at one code rate, so transcoding at the other code rates can refer to how that reference video frame was encoded. For the target coding unit at each coding depth, the mapping coding units in the corresponding reference video frame are located, and their reference coding parameters are used to determine the adaptive prediction modes and the depth division indication of the target coding unit at that coding depth. The depth division indication indicates whether the target coding unit needs to be divided further at the coding depth, and hence whether a coding partition mode exists for the target coding unit at that depth. The target coding mode that optimizes the coding cost of the target coding unit is then determined from the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication. The influence of non-adaptive prediction modes and unnecessary divisions is not considered; that is, the coding cost of the target coding unit is not calculated for non-adaptive prediction modes or unnecessary coding partition modes, which greatly reduces the coding overhead of video coding. The reference coding parameters of the reference video frame thus guide the coding of the target coding unit at other code rates, reducing its coding complexity and improving its coding efficiency while maintaining its coding quality.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:

Fig. 1A is a flowchart of a method for determining a coding mode according to a first embodiment of the present invention;

Fig. 1B is a schematic diagram of a process of determining a coding mode according to the first embodiment of the present invention;

Fig. 2A is a flowchart of a method for determining a coding mode according to a second embodiment of the present invention;

Fig. 2B is a schematic diagram of a process of determining a target coding mode of a target coding unit in the method according to the second embodiment of the present invention;

Fig. 3A is a flowchart of a method for determining a coding mode according to a third embodiment of the present invention;

Fig. 3B is a schematic diagram of the principle of determining the adaptive prediction mode and the depth division indication of a target coding unit at a coding depth in the method according to the third embodiment of the present invention;

Fig. 4 is a schematic structural diagram of an apparatus for determining a coding mode according to a fourth embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a server according to a fifth embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.

Embodiment one

Fig. 1A is a flowchart of a method for determining a coding mode according to a first embodiment of the present invention. This embodiment is applicable to any scenario in which a video must be coded at multiple code rates. The method may be performed by the apparatus for determining a coding mode provided by the embodiments of the invention; the apparatus may be implemented in software and/or hardware and is integrated in the server that executes the method, which may be a back-end server participating in video data interaction.

Specifically, referring to Fig. 1A, the method may include the following steps:

S110: Determine an adaptive prediction mode and a depth division indication of the target coding unit at each coding depth, based on the reference coding parameters of each mapping coding unit of the target coding unit in the corresponding reference video frame at that coding depth.

In particular, video compression coding is usually used on the internet so that video data can be transmitted quickly and stored offline. To suit the bandwidth and device conditions of different viewers, the server needs to transcode each video uploaded to it by a user (such as a streamer) at different resolutions and different code rates, and then distribute the transcoded versions to the corresponding viewers for download and playback. To make multi-rate transcoding fast, the server scales each video frame of the uploaded video according to each code rate to be transcoded, converting the uploaded video into multiple source videos, each of which corresponds to exactly one code rate to be transcoded. Transcoding at a given code rate then consists of directly encoding the source video that corresponds to that code rate.

However, encoding each of the multiple source videos independently incurs a large computational overhead, even though they all show the same video content at different resolutions. To improve coding efficiency under multi-rate transcoding, in this embodiment one source video is selected from the multiple source videos according to the transcoding requirements and is encoded once in advance with an existing video compression coding technique. Each video frame of this encoded video, referred to as the reference video, then serves as the reference video frame for the corresponding video frame of each source video that has not yet been encoded. The coding modes used in the reference video frames ensure high coding quality for that source video, and the remaining source videos present the same video content as the reference video at different resolutions. This embodiment therefore uses the coding parameters adopted by the reference video frames to guide the encoding of the corresponding video frames in the other source videos.

It should be noted that every video frame in the multiple source videos is obtained by scaling a frame of the same uploaded video according to the corresponding code rate to be transcoded. Consequently, the resolution of each video frame in a source video that has not yet been encoded is in a fixed ratio to the resolution of the corresponding reference video frame, that is, the frame of the reference video with the same video content.

Meanwhile, video coding currently uses the coding unit (CU) as its basic unit, and a CU has one of four sizes: 64 × 64, 32 × 32, 16 × 16 or 8 × 8. During coding, a quadtree recursion is attempted starting from the largest CU and continuing down to the smallest CU, and at each size the more efficient of the CU before and after the recursion is selected. The recursion level of a CU defines its coding depth, and CUs at different coding depths have different sizes: a 64 × 64 CU has coding depth 0, the coding depth increases by 1 with each recursion, and an 8 × 8 CU has coding depth 3.
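The relation between coding depth and CU size described above can be sketched as follows. This is only an illustrative sketch: the function and constant names are hypothetical, and the 64 × 64 maximum and depth limit of 3 are taken from the sizes listed in the text.

```python
# Assumption: largest CU is 64x64 (depth 0) and smallest is 8x8 (depth 3),
# as listed in the text; each quadtree recursion halves the side length.
MAX_CU_SIZE = 64
MAX_DEPTH = 3


def cu_size_at_depth(depth: int) -> int:
    """Return the side length in pixels of a CU at the given coding depth."""
    if not 0 <= depth <= MAX_DEPTH:
        raise ValueError(f"depth must be in [0, {MAX_DEPTH}], got {depth}")
    return MAX_CU_SIZE >> depth


def depth_of_cu_size(size: int) -> int:
    """Return the coding depth of a CU with the given side length."""
    for depth in range(MAX_DEPTH + 1):
        if cu_size_at_depth(depth) == size:
            return depth
    raise ValueError(f"{size} is not a valid CU size")


sizes = [cu_size_at_depth(d) for d in range(MAX_DEPTH + 1)]
```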

In this embodiment, when the current video frame of one of the source videos is encoded, the target coding units of that frame are first determined for each coding depth, each having the size that corresponds to the depth; that is, the current video frame is divided according to the size corresponding to each coding depth to obtain the target coding units at that depth. The target coding units are then encoded one by one, which completes the encoding of the current video frame. Because the source videos were produced with different scaling ratios, each target coding unit at a given coding depth corresponds to different mapping coding units in the corresponding reference video frame of the reference video. The reference video frame in this embodiment may be the video frame of the reference video that has the same video content as the frame containing the target coding unit; it is selected from the reference video by examining the video content of the frame containing the target coding unit.

As shown in Fig. 1B, each target coding unit at each coding depth may be mapped into the corresponding reference video frame according to the resolution ratio between the video frame containing the target coding unit and that reference video frame, which determines the mapping coding units of the target coding unit. The reference coding parameters of each mapping coding unit are then determined from the coding information of the reference video frame. These parameters may indicate the optimal prediction unit (PU) prediction mode adopted by the mapping coding unit when high coding quality is ensured, the optimal reference frame used when the coding cost is predicted, the optimal motion vector, and so on. Because the video content of the mapping coding units is approximately the same as that of the target coding unit, their reference coding parameters are also applicable, to a certain extent, to coding the target coding unit and can ensure its coding quality.

Therefore, the adaptive prediction modes and the depth division indication of the target coding unit at the coding depth can be determined by referring to how well the reference coding parameters of each mapping coding unit suit the different prediction modes and to how each mapping coding unit was recursively divided during coding. An adaptive prediction mode is a prediction mode that the target coding unit may adopt when its coding cost is predicted at the coding depth, and the depth division indication indicates whether the target coding unit needs to be recursively divided further at that depth. Coding costs are calculated only for the modes required by the adaptive prediction modes and the depth division indication; the coding cost of the target coding unit need not be calculated for every predictive coding mode at every recursion depth, which greatly reduces the computational overhead of coding.
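The mapping by resolution ratio might look like the sketch below. Everything here is an assumption made for illustration: the `RefCU` record, the function names, and the rule of treating every reference-frame CU that overlaps the scaled rectangle as a mapping coding unit. Collecting the PU modes of the mapped CUs into a set is likewise only a placeholder; the text does not state at this point how the adaptive prediction modes are actually derived.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RefCU:
    """A coded CU in the reference frame (hypothetical record layout)."""
    x: int
    y: int
    size: int
    pu_mode: str  # optimal PU prediction mode chosen for this CU


def map_rect(x: int, y: int, size: int,
             target_res: Tuple[int, int], ref_res: Tuple[int, int]):
    """Scale a target CU rectangle into reference-frame coordinates."""
    sx = ref_res[0] / target_res[0]
    sy = ref_res[1] / target_res[1]
    return x * sx, y * sy, (x + size) * sx, (y + size) * sy


def find_mapping_cus(x: int, y: int, size: int,
                     target_res: Tuple[int, int], ref_res: Tuple[int, int],
                     ref_cus: List[RefCU]) -> List[RefCU]:
    """Return the reference CUs that overlap the scaled target rectangle."""
    left, top, right, bottom = map_rect(x, y, size, target_res, ref_res)
    return [cu for cu in ref_cus
            if cu.x < right and cu.x + cu.size > left
            and cu.y < bottom and cu.y + cu.size > top]


# Toy data: a 960x540 target frame guided by a 1920x1080 reference frame.
ref_cus = [
    RefCU(0, 0, 32, "INTER_2Nx2N"),
    RefCU(32, 0, 32, "INTER_2Nx2N"),
    RefCU(0, 32, 32, "INTER_2NxN"),
    RefCU(32, 32, 32, "INTER_2Nx2N"),
    RefCU(64, 0, 64, "INTRA_2Nx2N"),
]
mapped = find_mapping_cus(0, 0, 32, (960, 540), (1920, 1080), ref_cus)
# Placeholder rule (an assumption): candidate modes are the modes of the mapped CUs.
candidate_modes = {cu.pu_mode for cu in mapped}
```

In this toy case the 32 × 32 target CU at the origin scales to a 64 × 64 region of the reference frame, so the four 32 × 32 reference CUs inside that region are returned and the 64 × 64 CU to its right is not.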

It should be noted that each coding unit has multiple prediction modes: INTRA_2Nx2N, INTRA_NxN, PCM and so on under intra prediction; INTER_2Nx2N, INTER_2NxN, INTER_Nx2N, INTER_NxN, INTER_2NxnU, INTER_2NxnD, INTER_nLx2N, INTER_nRx2N and so on under inter prediction; and the Merge mode and the skip mode, which are special inter-prediction modes. In this embodiment, the coding cost is calculated only for the adaptive prediction modes selected with reference to the reference coding parameters of the mapping coding units in the corresponding reference video frame.

Meanwhile, because the quadtree recursion in video coding proceeds from the largest coding unit down to the smallest, the target coding units at each coding depth are obtained as follows. If the coding depth is the initial coding depth, that is, 0, the target coding units are the coding units obtained by dividing the current video frame into 64 × 64 blocks. If the coding depth is a non-initial coding depth, that is, any other depth in the quadtree recursion, the target coding units may include the division sub-units obtained by dividing a target coding unit at the preceding coding depth according to the depth division indication at that preceding depth. For example, suppose there are two 64 × 64 target coding units at coding depth 0 and, according to the depth division indication, one of them is divided into four 32 × 32 division sub-units; these four sub-units are then target coding units at coding depth 1. Furthermore, CU sizes range from 64 × 64 down to 8 × 8. A 64 × 64 CU need not be divided, in which case the whole 64 × 64 block is predicted directly. An 8 × 8 CU is the smallest CU and cannot be divided further, but different prediction modes are still available when it is predicted at the current coding depth; for example, in the NxN mode the 8 × 8 CU is partitioned into prediction blocks of size 4 × 4.
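The division of one target coding unit into four sub-units at the next coding depth can be sketched as below. The tuple layout `(x, y, size)` and the function name are hypothetical; the 8 × 8 minimum comes from the text.

```python
from typing import List, Tuple

CU = Tuple[int, int, int]  # (x, y, size), a hypothetical layout

MIN_CU_SIZE = 8  # smallest CU size named in the text


def split_cu(x: int, y: int, size: int) -> List[CU]:
    """Divide a CU into its four quadtree sub-units at the next coding depth."""
    if size <= MIN_CU_SIZE:
        raise ValueError("an 8x8 CU is the smallest CU and cannot be divided")
    half = size // 2
    return [
        (x, y, half),                # top-left
        (x + half, y, half),         # top-right
        (x, y + half, half),         # bottom-left
        (x + half, y + half, half),  # bottom-right
    ]


# One 64x64 target CU at depth 0 becomes four 32x32 target CUs at depth 1.
sub_units = split_cu(0, 0, 64)
```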

S120: Determine, from the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication, a target coding mode that optimizes the coding cost of the target coding unit.

Specifically, after the adaptive prediction modes and the depth division indication of the target coding unit at a coding depth have been determined, the coding cost incurred when the undivided target coding unit is coded in each adaptive prediction mode at that depth is calculated first. The influence of non-adaptive prediction modes is not considered; that is, the coding cost of the target coding unit in non-adaptive prediction modes is not calculated. Meanwhile, the depth division indication is used to judge whether the target coding unit needs to be divided to the next coding depth. If division is needed, the coding partition mode indicated by the depth division indication is determined, the coding cost of each divided unit under that coding partition mode is calculated, and the sum of these costs is taken as the coding cost of the target coding unit in the coding partition mode. If the target coding unit does not need to be divided, the coding costs of the divided units need not be calculated: the influence of unnecessary divisions is not considered, that is, the coding cost of the target coding unit in an unnecessary coding partition mode is not calculated, which greatly reduces the coding overhead of video coding. The target coding mode that optimizes the coding cost of the target coding unit is then determined from its coding costs in the adaptive prediction modes and in the coding partition mode indicated by the depth division indication. In this way, the reference coding parameters of the mapping coding units in the corresponding reference video frame guide the coding of each target coding unit, reducing its coding complexity and improving its coding efficiency while maintaining coding quality.
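As a minimal sketch of the selection in S120, assuming that "optimal" means the minimum coding cost and that the cost of each mode is supplied by some external function, the choice at one coding depth could be written as follows. The names `select_target_mode`, `mode_cost` and the label `"SPLIT"` are hypothetical, and the cost values are invented toy numbers.

```python
from typing import Callable, Iterable, Optional, Tuple


def select_target_mode(adaptive_modes: Iterable[str],
                       mode_cost: Callable[[str], float],
                       split_cost: Optional[float] = None) -> Tuple[str, float]:
    """Pick the cheapest candidate among the adaptive modes and, if present, the split.

    split_cost is None when the depth division indication says that no division
    is needed, so the partition mode is never evaluated.
    """
    candidates = {mode: mode_cost(mode) for mode in adaptive_modes}
    if split_cost is not None:
        candidates["SPLIT"] = split_cost
    if not candidates:
        raise ValueError("no candidate coding mode to evaluate")
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


# Toy costs (assumed values): only the adaptive modes are ever evaluated.
toy_costs = {"INTER_2Nx2N": 120.0, "MERGE": 95.0}
choice_no_split = select_target_mode(toy_costs, toy_costs.__getitem__)
choice_with_split = select_target_mode(toy_costs, toy_costs.__getitem__, split_cost=80.0)
```

Modes outside `adaptive_modes` never reach `mode_cost`, which is where the saving described above comes from.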

It should be noted that the depth division indication at a coding depth may state definitively that the target coding unit must be divided to the next coding depth, which means the target coding unit will certainly not be coded in any adaptive prediction mode at the current depth. In that case, this embodiment does not calculate the coding cost of the target coding unit in each adaptive prediction mode at the current depth; the target coding mode is determined directly to be the coding partition mode indicated by the depth division indication, that is, the target coding unit is divided into four smaller coding units, which further reduces the coding overhead of video coding.

Meanwhile, calculating the coding cost in a given prediction mode usually requires referring to corresponding search frames. In this embodiment, the candidate set of search frames that the target coding unit refers to when predicting the coding cost of each prediction mode consists of the optimal reference frames of its mapping coding units. The search frames need not be screened one by one from the corresponding source video, which simplifies the calculation of the coding cost and improves coding efficiency.
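Building the search frame candidate set might look like the sketch below. It assumes that each mapping coding unit reports its optimal reference frame as an integer frame index, which is a hypothetical representation; duplicates are dropped while the order of first appearance is kept.

```python
from typing import Iterable, List


def search_frame_candidates(optimal_ref_frames: Iterable[int]) -> List[int]:
    """Collect the distinct optimal reference frames of the mapping CUs.

    Input: one frame index per mapping CU (assumed representation).
    Output: the candidate set, deduplicated, in order of first appearance.
    """
    return list(dict.fromkeys(optimal_ref_frames))


# Five mapping CUs that used only three distinct reference frames.
candidates = search_frame_candidates([3, 1, 3, 2, 1])
```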

In the technical solution provided by this embodiment, video transcoding at multiple code rates exploits the fact that a reference video frame has already been encoded at one code rate, so transcoding at the other code rates can refer to how that reference video frame was encoded. For the target coding unit at each coding depth, the mapping coding units in the corresponding reference video frame are located, and their reference coding parameters are used to determine the adaptive prediction modes and the depth division indication of the target coding unit at that coding depth. The depth division indication indicates whether the target coding unit needs to be divided further at the coding depth, and hence whether a coding partition mode exists for the target coding unit at that depth. The target coding mode that optimizes the coding cost of the target coding unit is then determined from the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication. The influence of non-adaptive prediction modes and unnecessary divisions is not considered; that is, the coding cost of the target coding unit is not calculated for non-adaptive prediction modes or unnecessary coding partition modes, which greatly reduces the coding overhead of video coding. The reference coding parameters of the reference video frame thus guide the coding of the target coding unit at other code rates, reducing its coding complexity and improving its coding efficiency while maintaining its coding quality.

Embodiment two

Fig. 2A is a flowchart of a method for determining a coding mode according to a second embodiment of the present invention, and Fig. 2B is a schematic diagram of a process of determining a target coding mode of a target coding unit in the method according to the second embodiment. This embodiment is an optimization based on the foregoing embodiment. Specifically, as shown in Fig. 2B, this embodiment explains in detail how to calculate the coding cost incurred when each target coding unit is coded, at the coding depth, in each adaptive prediction mode or in the coding partition mode indicated by the depth division indication.

Optionally, as shown in Fig. 2A, this embodiment may include the following steps:

S210: Determine the adaptive prediction mode and the depth division indication of the target coding unit at each coding depth, based on the reference coding parameters of each mapping coding unit of the target coding unit in the corresponding reference video frame at that coding depth.

S220: Predict the first coding cost of the target coding unit in each adaptive prediction mode.

Optionally, after the adaptive prediction modes of the target coding unit at the coding depth have been determined from the reference coding parameters of its mapping coding units in the corresponding reference video frame, the target coding unit is first trial-coded in each of those adaptive prediction modes, for example INTRA_2Nx2N or INTRA_NxN under intra prediction, or INTER_2Nx2N, INTER_2NxN or INTER_Nx2N under inter prediction. The first coding cost incurred by the target coding unit in each adaptive prediction mode is thereby calculated one by one, so that the target coding mode that optimizes the coding cost can subsequently be selected from the adaptive prediction modes at the coding depth, which ensures the coding efficiency of the target coding unit.

It should be noted that, to further speed up the determination of the target coding mode, after the depth division indication of the target coding unit at the coding depth has been determined, this embodiment also judges whether the coding cost of the target coding unit in each adaptive prediction mode at that depth needs to be calculated at all. The depth division indication may state definitively that the target coding unit must be divided to the next coding depth for coding, which means the target coding unit will certainly not be coded in any adaptive prediction mode at the current depth. In that case the first coding cost need not be calculated, that is, S220 is omitted, and the target coding mode is determined directly to be the coding partition mode indicated by the depth division indication: the target coding unit is divided directly into four smaller coding units, which further reduces the coding overhead of video coding.

S230: Determine a target coding mode that optimizes the coding cost of the target coding unit, based on at least one of the first coding cost and the second coding cost in the coding partition mode indicated by the depth division indication.

Optionally, once the depth division indication of the target coding unit at the coding depth has been determined from the reference coding parameters of its mapping coding units in the corresponding reference video frame, it is known whether the target coding unit needs to be divided to the next coding depth. There are three division cases: 1) the target coding unit must be divided; 2) the target coding unit may or may not be divided; 3) the target coding unit must not be divided. In each case, it is judged whether the coding modes currently suitable for the target coding unit include the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication. If the depth division indication requires that the target coding unit be divided, the coding costs of the divided units are calculated and their sum is taken as the second coding cost of the target coding unit in the coding partition mode indicated by the depth division indication. By analysing the currently suitable coding modes, at least one of the first coding costs in the adaptive prediction modes and the second coding cost in the coding partition mode is selected as the reference information for judging whether the coding cost is optimal. According to the selected coding costs, the target coding mode that optimizes the coding cost of the target coding unit is determined from the adaptive prediction modes at the coding depth and the coding partition mode indicated by the depth division indication.
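The three division cases can be illustrated with a recursive sketch, assuming that the optimal mode is the one with the minimum cost and that the indication, the adaptive modes and the per-mode cost are supplied by callbacks. All names (`SplitIndication`, `best_mode`, the toy callbacks) and all cost values are hypothetical, and the adaptive mode list is assumed to be non-empty whenever it is consulted.

```python
from enum import Enum
from typing import Callable, List, Tuple

CU = Tuple[int, int, int]  # (x, y, size), a hypothetical layout


class SplitIndication(Enum):
    MUST_SPLIT = 1      # case 1: only the second coding cost is computed
    MAY_SPLIT = 2       # case 2: first and second coding costs are compared
    MUST_NOT_SPLIT = 3  # case 3: only the first coding costs are computed


def split_cu(cu: CU) -> List[CU]:
    x, y, size = cu
    half = size // 2
    return [(x, y, half), (x + half, y, half),
            (x, y + half, half), (x + half, y + half, half)]


def best_mode(cu: CU,
              indication_of: Callable[[CU], SplitIndication],
              modes_of: Callable[[CU], List[str]],
              mode_cost: Callable[[CU, str], float],
              min_size: int = 8) -> Tuple[str, float]:
    """Return (mode, cost) with the lowest cost among the permitted candidates."""
    indication = indication_of(cu)
    if cu[2] <= min_size:
        indication = SplitIndication.MUST_NOT_SPLIT  # smallest CU cannot be divided
    candidates: List[Tuple[str, float]] = []
    if indication is not SplitIndication.MUST_SPLIT:
        # First coding costs: one per adaptive prediction mode.
        candidates += [(m, mode_cost(cu, m)) for m in modes_of(cu)]
    if indication is not SplitIndication.MUST_NOT_SPLIT:
        # Second coding cost: sum over the four sub-units, each handled as a
        # target coding unit at the next coding depth.
        second = sum(best_mode(sub, indication_of, modes_of, mode_cost, min_size)[1]
                     for sub in split_cu(cu))
        candidates.append(("SPLIT", second))
    return min(candidates, key=lambda c: c[1])


# Toy callbacks with assumed values, for illustration only.
def toy_indication(cu: CU) -> SplitIndication:
    return {64: SplitIndication.MUST_SPLIT,
            32: SplitIndication.MAY_SPLIT}.get(cu[2], SplitIndication.MUST_NOT_SPLIT)


def toy_modes(cu: CU) -> List[str]:
    return ["INTER_2Nx2N", "MERGE"]


def toy_cost(cu: CU, mode: str) -> float:
    factor = 1.0 if mode == "MERGE" else 1.5
    return cu[2] * cu[2] * factor + 10.0


result_64 = best_mode((0, 0, 64), toy_indication, toy_modes, toy_cost)
result_32 = best_mode((0, 0, 32), toy_indication, toy_modes, toy_cost)
```

With these toy values the 64 × 64 CU is never trial-coded at depth 0, each 32 × 32 CU compares its own modes against the sum over its 16 × 16 sub-units, and each 16 × 16 CU evaluates only its adaptive modes.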

For example, the three partitioning cases that may arise when the target coding unit is coded are described as follows:

In case 1), the depth partition indication at the coded depth requires that the target coding unit be partitioned to the next coded depth adjacent to the coded depth for coding, which means that the coding partition mode oriented by the depth partition indication is non-empty. The target coding unit therefore needs to be partitioned into partition subunits at the next coded depth adjacent to the coded depth, and the second coding cost of the combined partition subunits is predicted. Specifically, when the coding partition mode oriented by the depth partition indication is non-empty, the target coding unit is directly partitioned into 4 partition subunits at the next coded depth, the coding cost generated when coding each partition subunit is calculated, and the sum of these costs is taken as the second coding cost of the target coding unit under the coding partition mode. It should be noted that, to calculate the coding cost of each partition subunit, the partition subunit may be treated as a target coding unit at the next coded depth, and the coding cost calculation steps described above for the target coding unit at the coded depth are applied to it.

In case 2), the depth partition indication of the target coding unit at the coded depth allows the target coding unit either to be partitioned to the next coded depth adjacent to the coded depth for coding, or to be coded directly at the coded depth without partitioning. The coding partition mode oriented by the depth partition indication is again non-empty, so the target coding unit also needs to be partitioned into partition subunits at the next coded depth adjacent to the coded depth, and the second coding cost of the combined partition subunits is predicted.

In both case 1) and case 2), the coding partition mode oriented by the depth partition indication is non-empty, but the two cases differ in whether the target coding unit may be coded with an adaptive prediction mode at the coded depth. The target coding mode that optimizes the coding cost of the target coding unit can therefore be determined at least according to the second coding cost of the target coding unit under the coding partition mode oriented by the depth partition indication. In case 1), the depth partition indication requires that the target coding unit be partitioned to the next coded depth adjacent to the coded depth for coding, so the target coding unit is not coded with any adaptive prediction mode at the coded depth, and the target coding mode is determined only according to the second coding cost. In case 2), the target coding unit may be coded either with one of the adaptive prediction modes at the coded depth or with the coding partition mode oriented by the depth partition indication, so the target coding mode is determined jointly according to the first coding cost of the target coding unit under each adaptive prediction mode and the second coding cost of the target coding unit under the coding partition mode oriented by the depth partition indication.

In case 3), the depth partition indication of the target coding unit at the coded depth does not allow the target coding unit to be partitioned to the next coded depth adjacent to the coded depth for coding, and only the adaptive prediction modes at the coded depth can be used for coding. The coding partition mode oriented by the depth partition indication is therefore empty, and the target coding mode that optimizes the coding cost of the target coding unit is determined only according to the first coding cost of the target coding unit under each adaptive prediction mode.
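The three cases above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `SplitIndication`, `choose_target_mode` and the label `"SPLIT"` are hypothetical, and the sketch assumes that the optimal coding cost is the minimum cost among the candidates that the depth partition indication leaves available.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class SplitIndication(Enum):
    """Hypothetical encoding of the three partitioning cases."""
    MUST_SPLIT = 1      # case 1): partition mode non-empty, no prediction at this depth
    MAY_SPLIT = 2       # case 2): partition mode non-empty, prediction also allowed
    MUST_NOT_SPLIT = 3  # case 3): partition mode empty


SPLIT = "SPLIT"  # hypothetical label for the coding partition mode


def choose_target_mode(
    indication: SplitIndication,
    first_costs: Dict[str, float],
    second_cost: Optional[float],
) -> Tuple[str, float]:
    """Pick the candidate with the lowest cost (assumed meaning of 'optimal').

    first_costs: cost of each adaptive prediction mode at the current depth.
    second_cost: summed cost of the 4 partition subunits, if it was computed.
    """
    candidates: Dict[str, float] = {}
    if indication is not SplitIndication.MUST_SPLIT:
        # Cases 2) and 3): the first coding costs take part in the decision.
        candidates.update(first_costs)
    if indication is not SplitIndication.MUST_NOT_SPLIT:
        # Cases 1) and 2): the second coding cost takes part in the decision.
        if second_cost is None:
            raise ValueError("second coding cost is required when splitting is possible")
        candidates[SPLIT] = second_cost
    if not candidates:
        raise ValueError("no candidate coding mode available")
    best = min(candidates, key=candidates.get)
    return best, candidates[best]
```

In case 1) the first coding costs are ignored even if supplied, which mirrors the statement that the target coding unit is not coded at the current coded depth.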

It should be noted that, in this embodiment, S220 and S230 respectively calculate the coding cost of the target coding unit in each adaptive prediction mode and the depth partition indication-oriented coding partition mode at the coding depth, and there is no specific precedence order, so that the execution order of S220 and S230 is not limited, and may be executed successively or simultaneously.

S240, sequentially integrating the target coding modes of the target coding units at each coded depth to obtain the overall coding mode at the optimal coding cost.

Optionally, after the target coding mode of each target coding unit at each coded depth is determined, each target coding mode optimizes the coding cost of its corresponding target coding unit. For each pair of adjacent coded depths, the partition subunits obtained by partitioning the target coding unit at the previous coded depth have the same size as the target coding units at the next coded depth. The target coding modes of the target coding units at each coded depth can therefore be integrated in sequence with the partition subunits produced by the target coding unit at the previous coded depth, optimizing the coding cost at each step, so as to obtain the overall coding mode at the optimal coding cost.
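One possible way to picture this integration is a recursive, bottom-up comparison over the quadtree of coding units, sketched below in Python. The function `integrate`, the path-based identification of coding units, and the two callbacks are assumptions made for illustration only; the sketch also assumes that a lower cost is better and that partitioning stops at a maximum depth.

```python
import math
from typing import Callable, Dict, List, Tuple, Union

Path = Tuple[int, ...]          # hypothetical CU identifier: subunit indices from the root
Plan = Union[str, List["Plan"]]  # a mode name, or four sub-plans when the CU is split


def integrate(
    path: Path,
    indication: Callable[[Path], str],
    mode_costs: Callable[[Path], Dict[str, float]],
    max_depth: int,
) -> Tuple[float, Plan]:
    """Return (cost, plan) for the coding unit identified by `path`.

    indication(path) returns "must_split", "may_split" or "must_not_split".
    mode_costs(path) returns the first coding cost of each adaptive prediction mode.
    """
    # Assumption: a coding unit at the maximum depth can never be split further.
    ind = indication(path) if len(path) < max_depth else "must_not_split"
    best_cost, best_plan = math.inf, None

    if ind != "must_split":
        # Cheapest adaptive prediction mode at this depth.
        costs = mode_costs(path)
        mode = min(costs, key=costs.get)
        best_cost, best_plan = costs[mode], mode

    if ind != "must_not_split":
        # Each of the 4 subunits is treated as a target coding unit one depth down.
        subs = [integrate(path + (i,), indication, mode_costs, max_depth) for i in range(4)]
        split_cost = sum(cost for cost, _ in subs)
        if split_cost < best_cost:
            best_cost, best_plan = split_cost, [plan for _, plan in subs]

    return best_cost, best_plan
```

The returned plan is either a single mode name or a nested list of four sub-plans, which plays the role of the overall coding mode in this sketch.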

The technical solution provided by this embodiment searches the corresponding reference video frame for each mapping coding unit of the target coding unit at each coded depth and, by referring to the reference coding parameters of each mapping coding unit, determines the adaptive prediction mode and the depth partition indication of the target coding unit at the coded depth. According to the adaptive prediction mode and the depth partition indication, the coding cost calculation steps for non-adaptive prediction modes and unnecessary partitions are accurately skipped, which greatly reduces the coding overhead in the video coding process. The reference coding parameters of the reference video frame are thus used to guide the coding of the target coding unit at other code rates, which reduces the coding complexity of the target coding unit and improves its coding efficiency while ensuring its coding quality.

Example three

Fig. 3A is a flowchart of a method for determining a coding mode according to a third embodiment of the present invention, and fig. 3B is a schematic diagram of the principle of determining an adaptive prediction mode and a depth partition indication of a target coding unit at a coded depth according to the third embodiment of the present invention. The present embodiment is optimized on the basis of the above embodiments. Specifically, as shown in fig. 3A, the present embodiment explains in detail the process of determining the adaptive prediction mode and the depth partition indication of the target coding unit at each coded depth.

Optionally, as shown in fig. 3A, the present embodiment may include the following steps:

S310, calculating the coding reference depth and the prediction adaptation reference item of the target coding unit based on the reference coding parameters of each mapping coding unit of the target coding unit in the corresponding reference video frame.

Optionally, the reference coding parameters of each mapping coding unit of the target coding unit in the corresponding reference video frame may include the current coded depth of the mapping coding unit, its optimal prediction mode, and the optimal reference frame and optimal motion vector used in that optimal prediction mode. Because these reference coding parameters are also applicable to the optimal coding of the target coding unit, the average of the current coded depths adopted by the mapping coding units may be used as the coding reference depth of the target coding unit. However, the mapping coding units may differ in size, so the mapping area that each mapping coding unit occupies within the target coding unit also differs. As shown in fig. 3B, in this embodiment the mapping area ratio of each mapping coding unit within the target coding unit is therefore used as the corresponding weight, and a weighted average of the current coded depths adopted by the mapping coding units is calculated to obtain the coding reference depth of the target coding unit, denoted pred_avgDepth. The depth partition indication of the target coding unit at the coded depth is subsequently determined according to this coding reference depth.
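The area-weighted average described above can be sketched as follows. The function name `coding_reference_depth` and the input layout (one pair of coded depth and mapped area per mapping coding unit) are assumptions for illustration; the areas are normalized inside the function so that they act as mapping area ratios.

```python
from typing import Sequence, Tuple


def coding_reference_depth(mapped_units: Sequence[Tuple[int, float]]) -> float:
    """Area-weighted average of the coded depths of the mapping coding units.

    mapped_units: (coded depth, area of the mapping coding unit that falls
    inside the target coding unit) for each mapping coding unit.
    """
    total_area = sum(area for _, area in mapped_units)
    if total_area <= 0:
        raise ValueError("mapping coding units must cover a positive area")
    # Each weight is the unit's share of the total mapped area.
    return sum(depth * area for depth, area in mapped_units) / total_area


# Half of the target CU maps to a depth-1 unit, a quarter each to depth 2 and depth 3.
pred_avg_depth = coding_reference_depth([(1, 0.5), (2, 0.25), (3, 0.25)])
print(pred_avg_depth)  # 1.75
```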

Meanwhile, by analyzing in advance the relevant parameters under which each existing prediction mode can be skipped during coding without predicting its coding cost, a corresponding adaptation condition can be set for each existing prediction mode, and the prediction adaptation reference item of the target coding unit is used to analyze whether each existing prediction mode satisfies its adaptation condition. The prediction adaptation reference item in this embodiment may include the skip macroblock mode ratio (that is, the skip mode ratio) over the mapping coding units of the target coding unit in the corresponding reference video frame, and the motion reference vector of each prediction region under a specific inter prediction mode. The specific inter prediction mode may be limited to the two modes INTER_2NxN and INTER_Nx2N, under which the target coding unit is divided at the coded depth into an upper and a lower region, or a left and a right region, respectively, for coding. The adaptive prediction mode of the target coding unit at the coded depth is subsequently determined by judging whether each prediction adaptation reference item satisfies the corresponding adaptation condition.

S320, determining the depth partition indication of the target coding unit at the coded depth according to the coding reference depth.

Optionally, in order that the target coding unit is coded, as far as possible, in the same coding mode as the overall coding mode of the combined mapping coding units, and thereby to improve the coding efficiency of the target coding unit, this embodiment may analyze whether the target coding unit needs to be further partitioned to the next coded depth adjacent to the coded depth by judging the difference between the current coded depth of the target coding unit and the coding reference depth, which is determined from the coded depths actually adopted by the mapping coding units. This ensures similarity between the coding mode finally adopted by the target coding unit and the coding modes of the mapping coding units, so that the reference coding parameters of the mapping coding units are fully utilized and the partitioning accuracy when coding the target coding unit is improved.

Exemplarily, let depth denote the coded depth at which the target coding unit is currently located. If depth < pred_avgDepth - 2, the current coded depth of the target coding unit is much shallower than the coding reference depth, and partitioning is necessary. Since coding is not performed at this coded depth, it is not necessary to calculate the coding cost of the target coding unit under each prediction mode at this coded depth; the target coding unit is directly partitioned into 4 partition subunits at the next coded depth adjacent to the coded depth, and each of the 4 partition subunits is then treated as a target coding unit at the next coded depth to calculate the corresponding coding cost. If depth > pred_avgDepth + 2, the current coded depth of the target coding unit is much deeper than the coding reference depth, and the target coding unit must not be partitioned further, so only the coding cost of the target coding unit under each prediction mode at the coded depth needs to be calculated. If pred_avgDepth - 2 < depth < pred_avgDepth + 2, the current coded depth of the target coding unit differs little from the coding reference depth, and the target coding unit may or may not be partitioned, so both the coding cost of the target coding unit under each prediction mode at the coded depth and the coding cost of the 4 partition subunits at the next coded depth need to be calculated.
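The threshold comparison in this example can be written as a small Python sketch. The function name `depth_partition_indication`, the string labels and the `margin` parameter are hypothetical. The text uses strict inequalities throughout and does not say what happens when depth equals pred_avgDepth - 2 or pred_avgDepth + 2 exactly; the sketch assumes that these boundary values fall into the "may split" case.

```python
def depth_partition_indication(depth: int, pred_avg_depth: float, margin: float = 2.0) -> str:
    """Map the current coded depth to one of the three partitioning cases.

    margin defaults to 2, the offset used in the example above.
    """
    if depth < pred_avg_depth - margin:
        # Much shallower than the reference depth: partitioning is required.
        return "must_split"
    if depth > pred_avg_depth + margin:
        # Much deeper than the reference depth: partitioning is not allowed.
        return "must_not_split"
    # Close to the reference depth (boundary values included by assumption).
    return "may_split"


print(depth_partition_indication(0, 3.0))   # must_split
print(depth_partition_indication(3, 0.5))   # must_not_split
print(depth_partition_indication(1, 1.75))  # may_split
```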

S330, determining an adaptive prediction mode of the target coding unit at the coded depth according to the prediction adaptation reference item.

Optionally, after the reference coding parameters of each mapping coding unit are fully utilized to calculate the prediction adaptation reference item of the target coding unit, whether each prediction adaptation reference item enables each existing prediction mode of the target coding unit at the coding depth to meet the corresponding adaptation condition may be judged, and then the existing prediction mode when each prediction adaptation reference item meets the corresponding adaptation condition at the coding depth may be screened out as the adaptation prediction mode of the target coding unit at the coding depth in this embodiment, and subsequently, only the coding cost of the target coding unit at each adaptation prediction mode needs to be calculated.

For example, the skip macroblock (skip) mode ratio in the prediction adaptation reference item indicates the proportion of the mapping coding units that are coded in skip mode. The motion reference vectors of the prediction regions under a specific inter prediction mode indicate, through the distance between the motion reference vectors of the prediction regions, whether the target coding unit is suitable for coding with the prediction region partition of that specific inter prediction mode. When the motion reference vector of each prediction region under the specific inter prediction mode is calculated, the proportion of the prediction region's area occupied by each mapping coding unit may be used as the motion weight of that mapping coding unit, and a weighted average of the optimal motion vectors adopted by the mapping coding units is calculated. That is, after the optimal motion vector adopted by each mapping coding unit is scaled accordingly, the weighted average of the motion vectors is taken as the motion reference vector of each prediction region of the target coding unit under the specific inter prediction mode.
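A minimal Python sketch of this weighted average is given below. The function name `motion_reference_vector` and its input layout are assumptions. The text does not define the scaling operation; the sketch assumes a single uniform factor `scale` (for instance, a resolution ratio between the target video and the reference video) applied to every motion vector.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float]


def motion_reference_vector(
    units: Sequence[Tuple[Vec, float]],
    scale: float = 1.0,
) -> Vec:
    """Area-weighted average of the optimal motion vectors inside one prediction region.

    units: (optimal motion vector, area of the mapping coding unit that falls
    inside the prediction region) for each mapping coding unit.
    scale: assumed uniform scaling factor applied to the motion vectors.
    """
    total_area = sum(area for _, area in units)
    if total_area <= 0:
        raise ValueError("prediction region must be covered by a positive area")
    mv_x = sum(mv[0] * area for mv, area in units) / total_area * scale
    mv_y = sum(mv[1] * area for mv, area in units) / total_area * scale
    return (mv_x, mv_y)


# Three quarters of the region use (4, 0), one quarter uses (0, 8); vectors are doubled.
print(motion_reference_vector([((4.0, 0.0), 3.0), ((0.0, 8.0), 1.0)], scale=2.0))  # (6.0, 4.0)
```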

At this time, in this embodiment, determining an adaptive prediction mode of the target coding unit at the coded depth according to the prediction adaptation reference item may specifically include: if the skipped macroblock mode occupation ratio is a preset full occupation ratio, determining that the adaptive prediction mode of the target coding unit at the coding depth is the skipped macroblock mode; and if the skipped macroblock mode occupation ratio is a non-preset full occupation ratio value and exceeds a preset occupation ratio upper limit, determining that the target coding unit excludes the asymmetric partition mode in the inter-frame prediction mode in the adaptive prediction mode at the coding depth.

Specifically, if the skip mode ratio equals the preset full ratio, every mapping coding unit is coded in skip mode, and the target coding unit must also be coded in skip mode at the coded depth, so the adaptive prediction mode of the target coding unit at the coded depth can be directly determined to be the skip macroblock mode. If the skip macroblock mode ratio is not the preset full ratio but exceeds the preset ratio upper limit, most of the mapping coding units are coded in skip mode. Because the content coded in skip mode differs greatly from the content coded in the asymmetric partition modes of the inter prediction mode, it can be determined in this case that the adaptive prediction modes of the target coding unit at the coded depth do not include the asymmetric partition modes of the inter prediction mode, and the coding cost of the target coding unit under the asymmetric partition modes does not need to be calculated subsequently.

In addition, under the specific inter prediction modes INTER_2NxN and INTER_Nx2N, the target coding unit is divided into 2 prediction regions (upper and lower, or left and right) for coding. According to the optimal motion vector adopted by each mapping coding unit, the motion reference vector of each prediction region of the target coding unit under the INTER_2NxN mode and under the INTER_Nx2N mode is calculated respectively. For the INTER_2NxN mode, the distance between the motion reference vectors of the upper and lower prediction regions is calculated; if the distance is greater than a certain threshold, then between INTER_2NxN and INTER_Nx2N the target coding unit is very likely to select the INTER_2NxN mode for coding, so it is determined that the adaptive prediction modes of the target coding unit at the coded depth do not include the INTER_Nx2N mode. Similarly, if the distance between the motion reference vectors of the left and right prediction regions under the INTER_Nx2N mode is greater than a certain threshold, the target coding unit is very likely to select the INTER_Nx2N mode for coding, so it is determined that the adaptive prediction modes of the target coding unit at the coded depth do not include the INTER_2NxN mode. Otherwise, the adaptive prediction modes of the target coding unit at the coded depth may include both INTER_2NxN and INTER_Nx2N.
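The screening rules above can be sketched in Python as follows. Several points are assumptions rather than statements of the embodiment: the candidate list and the asymmetric partition mode names (borrowed from HEVC-style naming), the default values of `skip_upper_limit` and `mv_threshold`, the use of Euclidean distance, and the handling of the case where both distances exceed the threshold, which the text does not address (the sketch keeps both modes then).

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]

# Hypothetical mode names, for illustration only.
AMP_MODES = ["INTER_2NxnU", "INTER_2NxnD", "INTER_nLx2N", "INTER_nRx2N"]
CANDIDATES = ["SKIP", "INTER_2Nx2N", "INTER_2NxN", "INTER_Nx2N", *AMP_MODES, "INTRA"]


def adapted_prediction_modes(
    skip_ratio: float,
    mv_2nxn: Tuple[Vec, Vec],   # motion reference vectors of the upper and lower regions
    mv_nx2n: Tuple[Vec, Vec],   # motion reference vectors of the left and right regions
    skip_upper_limit: float = 0.8,  # assumed value of the preset ratio upper limit
    mv_threshold: float = 4.0,      # assumed value of the distance threshold
) -> List[str]:
    """Return the prediction modes whose coding cost still has to be computed."""
    if skip_ratio >= 1.0:
        # Every mapping coding unit used skip mode (preset full ratio assumed to be 1.0).
        return ["SKIP"]

    excluded = set()
    if skip_ratio > skip_upper_limit:
        # Mostly skip-coded content: drop the asymmetric partition modes.
        excluded.update(AMP_MODES)

    far_2nxn = math.dist(*mv_2nxn) > mv_threshold
    far_nx2n = math.dist(*mv_nx2n) > mv_threshold
    if far_2nxn and not far_nx2n:
        excluded.add("INTER_Nx2N")   # upper/lower motion differs: 2NxN is the likely choice
    elif far_nx2n and not far_2nxn:
        excluded.add("INTER_2NxN")   # left/right motion differs: Nx2N is the likely choice

    return [mode for mode in CANDIDATES if mode not in excluded]
```

Only the modes returned by this function would then have their first coding cost calculated.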

It should be noted that, in this embodiment, when the depth partition indication of the target coding unit at the coded depth determines that the target coding unit must be partitioned to the next coded depth adjacent to the coded depth, the target coding unit is not coded at the coded depth, and the coding cost of the target coding unit under each prediction mode at the coded depth is not calculated. Therefore, to further improve the coding efficiency of the target coding unit, S330 may be omitted in this case, and the adaptive prediction mode of the target coding unit at the coded depth is not determined.

S340, determining a target coding mode that optimizes the coding cost of the target coding unit from the adaptive prediction modes and the depth partition indication-oriented coding partition modes at the coding depth.

The technical solution provided by this embodiment makes full use of the reference coding parameters of each mapping coding unit of the target coding unit in the corresponding reference video frame at each coded depth to calculate the coding reference depth and the prediction adaptation reference item of the target coding unit, and then determines the adaptive prediction mode and the depth partition indication of the target coding unit at the coded depth by referring to the coding reference depth and the prediction adaptation reference item, which improves the accuracy of the adaptive prediction mode and the depth partition indication. According to the adaptive prediction mode and the depth partition indication, the coding cost calculation steps for non-adaptive prediction modes and unnecessary partitions are accurately skipped, which greatly reduces the coding overhead in the video coding process. The reference coding parameters of the reference video frame are thus used to guide the coding of the target coding unit at other code rates, which reduces the coding complexity of the target coding unit and improves its coding efficiency while ensuring its coding quality.

Example four

Fig. 4 is a schematic structural diagram of a device for determining an encoding mode according to a fourth embodiment of the present invention. Specifically, as shown in fig. 4, the device may include:

a coding adaptation module 410, configured to determine, based on reference coding parameters of each mapping coding unit of a target coding unit at each coding depth in a corresponding reference video frame, an adaptation prediction mode and a depth partitioning indication of the target coding unit at the coding depth;

and an encoding mode determining module 420, configured to determine, from each of the adaptive prediction modes at the coded depth and the depth partition indication-oriented encoding partition mode, a target encoding mode that optimizes the encoding cost of the target coding unit.

The device for determining the coding mode provided by this embodiment is applicable to the method for determining the coding mode provided by any of the above embodiments, and has corresponding functions and advantages.

Example five

Fig. 5 is a schematic structural diagram of a server according to a fifth embodiment of the present invention, and as shown in fig. 5, the server includes a processor 50, a storage device 51, and a communication device 52; the number of the processors 50 in the server may be one or more, and one processor 50 is taken as an example in fig. 5; the processor 50, the storage device 51 and the communication device 52 in the server may be connected by a bus or other means, and the bus connection is taken as an example in fig. 5.

The server provided by this embodiment may be configured to execute the method for determining the encoding mode provided by any of the above embodiments, and has corresponding functions and advantages.

Example six

An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, can implement the method for determining the encoding mode in any of the above embodiments. The method specifically comprises the following steps:

Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the determination method of the encoding mode provided by any embodiment of the present invention.

From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It should be noted that, in the embodiment of the apparatus for determining an encoding mode, the units and modules included in the apparatus are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for determining a coding mode, comprising:

2. The method of claim 1, wherein the determining a target coding mode that optimizes the coding cost of the target coding unit from among the adaptive prediction modes at the coded depth and the depth partition indication-oriented coding partition modes comprises:

predicting a first coding cost of the target coding unit in each of the adaptive prediction modes;

and determining a target coding mode which enables the coding cost of the target coding unit to reach the optimal value based on at least one of the first coding cost and a second coding cost under the coding partitioning mode oriented by the depth partitioning indication.

3. The method according to claim 2, wherein the determining a target coding mode that optimizes the coding cost of the target coding unit based on at least one of the first coding cost and a second coding cost in a coding partition mode for which the depth partition indication is oriented comprises:

if the depth partition indicates that the oriented coding partition mode is empty, determining a target coding mode which enables the coding cost of the target coding unit to reach the optimal value according to the first coding cost;

if the depth partitioning indicates that the oriented coding partitioning mode is not empty, partitioning the target coding unit into partitioning sub-units under the next coding depth adjacent to the coding depth, and predicting a second coding cost after the partitioning sub-units are combined;

and determining the target coding mode which enables the coding cost of the target coding unit to reach the optimal value at least according to the second coding cost.

4. The method of claim 1, wherein the determining the adaptive prediction mode and the depth partition indication for the target coding unit at each coded depth based on the reference coding parameters of the respective mapping coding units of the target coding unit at the corresponding reference video frame at the coded depth comprises:

calculating a coding reference depth and a prediction adaptation reference item of the target coding unit based on reference coding parameters of each mapping coding unit of the target coding unit in a corresponding reference video frame;

determining a depth division indication of the target coding unit under the coding depth according to the coding reference depth;

and determining an adaptive prediction mode of the target coding unit at the coding depth according to the prediction adaptive reference item.

5. The method of claim 4, wherein the prediction adaptation reference entries comprise skip macroblock mode ratios of the target coding unit in each mapped coding unit in the corresponding reference video frame and motion reference vectors of each prediction region in a specific inter prediction mode.

6. The method of claim 5, wherein determining the adaptive prediction mode of the target coding unit at the coded depth according to the prediction adaptation reference term comprises:

if the skipped macroblock mode occupation ratio is a preset full occupation ratio value, determining that the adaptive prediction mode of the target coding unit under the coding depth is the skipped macroblock mode;

and if the skipped macroblock mode occupation ratio is a non-preset full occupation ratio value and exceeds a preset occupation ratio upper limit, determining that the target coding unit excludes the asymmetric partition mode in the inter-frame prediction mode from the adaptive prediction mode under the coding depth.

7. The method according to any of claims 1-6, wherein the search frame candidate set referred to by the target coding unit when predicting the corresponding coding cost using each of the adapted prediction modes consists of the best reference frame of each of the mapped coding units.

8. The method according to any of claims 1-6, wherein for the target coding unit at each coded depth, if the coded depth is a non-initial coded depth, the target coding unit at the coded depth comprises partition subunits obtained by partitioning the target coding unit at a last coded depth adjacent to the coded depth according to the depth partitioning indication at the last coded depth.

9. The method according to any of claims 1-6, further comprising, after determining a target coding mode that optimizes the coding cost of the target coding unit:

and sequentially integrating the target coding modes of the target coding units under each coding depth to obtain the overall coding mode under the optimal coding cost.

10. The method according to any one of claims 1 to 6, wherein the reference video in which the reference video frame is located is a single source video at a lowest resolution among the multiple source videos.

11. An apparatus for determining a coding mode, comprising:

12. A server, characterized in that the server comprises:

one or more processors;

storage means for storing one or more programs;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for determining an encoding mode as recited in any one of claims 1-10.

13. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method for determining an encoding mode according to any one of claims 1 to 10.