CN109068142B - 360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium - Google Patents

360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium Download PDF

Info

Publication number
CN109068142B
Authority
CN
China
Prior art keywords
current
mode
calculating
jumping
texture complexity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811059097.8A
Other languages
Chinese (zh)
Other versions
CN109068142A (en)
Inventor
张萌萌
董晓莎
刘志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China University of Technology
Original Assignee
North China University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China University of Technology filed Critical North China University of Technology
Priority to CN201811059097.8A priority Critical patent/CN109068142B/en
Publication of CN109068142A publication Critical patent/CN109068142A/en
Application granted granted Critical
Publication of CN109068142B publication Critical patent/CN109068142B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/593: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
    • H04N 19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103: Selection of coding mode or of prediction mode
    • H04N 19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/146: Data rate or code amount at the encoder output
    • H04N 19/147: Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N 19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N 19/172: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a picture, frame or field

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

360-degree video has a very high resolution (typically 4K to 8K), resulting in a large increase in encoding time. In order to reduce the encoding complexity, a 360-degree video intra-frame prediction rapid decision device based on texture features is designed. On the one hand, the proposed decision method determines whether to terminate coding unit partitioning as early as possible based on texture complexity; on the other hand, the proposed method reduces the number of candidate modes in the mode decision based on the texture directionality. The two methods may be used simultaneously or separately.

Description

360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium
Technical Field
The present invention relates to the field of image processing, and more particularly to texture feature-based intra prediction CU partitioning and mode selection for processing 360 degree video in High Efficiency Video Coding (HEVC).
Background
The High Efficiency Video Coding (HEVC) standard was developed by the Joint Collaborative Team on Video Coding (JCT-VC) and is the latest video coding standard [1]. Compared with previous video coding standards, HEVC greatly improves coding efficiency, especially for high-resolution video. The HEVC standard was finalized in January 2013, and HEVC research has since shifted toward extending its range of applications, and coding for virtual reality video has been proposed.
With the growing commercial interest in Virtual Reality (VR) in recent years, ITU-T VCEG and ISO/IEC MPEG jointly established the Joint Video Exploration Team (JVET) for future video coding research and proposed a VR 360-degree video coding framework [2]. 360-degree video is typically captured by a multi-camera array, such as a GoPro Omni camera rig. The images from the multiple cameras are stitched together to form a spherical projection covering 360 degrees horizontally and 180 degrees vertically.
Due to the spherical nature of VR 360 video, conventional video coding methods are difficult to apply directly. JVET therefore proposed 11 different spherical video projection formats to address the coding problem. The 360-degree video is projected onto a two-dimensional plane and converted into a two-dimensional projection format [3], such as equirectangular projection (ERP), octahedron projection (OHP), truncated square pyramid projection (TSP), rotated sphere projection (RSP), cubemap projection (CMP), segmented sphere projection (SSP) and so on. After projection, the 360-degree video becomes a conventional video and is encoded in the usual way. Since HEVC was designed for conventional video, HEVC-based coding does not encode 360-degree video efficiently. To improve the performance of 360-degree video coding, many studies have proposed new techniques, ranging from the projection framework to rate-distortion optimization, to improve the quality and efficiency of 360-degree video coding [4]-[7].
In document [4], two adaptive coding algorithms suitable for omnidirectional video (OV) are proposed in order to reduce the bit rate of OV after compression. Document [5] proposes a real-time 360-degree video stitching framework for rendering different levels of detail of the entire scene. Document [6] proposes a motion estimation algorithm that improves the motion prediction accuracy of 360-degree video. Taking distortion in the spherical domain into account, document [7] derives the optimal rate-distortion relationship in the spherical domain and obtains an optimal solution that can save up to 11.5% of the bit rate.
The projection formats of 360-degree video exhibit image textures different from those of conventional video. In the present invention, a fast intra prediction method based on the texture features of the video image is adopted so that it can be applied to 360-degree video in the ERP format. We calculate the texture complexity during CU partitioning for the CU size decision, and add a texture directionality decision to the mode selection process. The decision is divided into two parts. First, the complexity is classified against thresholds on the image texture complexity to determine whether the current CU is skipped or further divided. Second, the candidate prediction modes are further reduced based on the texture directionality.
In the description of the present invention, the following documents are cited, which are incorporated herein as part of the disclosure of the present invention.
[1] G. J. Sullivan, J. R. Ohm, W. J. Han and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[2] Yuwen He, Xiaoyu Xiu, Yan Ye et al., "360Lib Software Manual," Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Aug. 2017.
[3] Yao Lu, Jisheng Li, Ziyu Wen, Xianyu Meng, "AHG8: Padding method for Segmented Sphere Projection," Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 7th Meeting: Torino, IT, 13-21 July 2017.
[4] Minhao Tang, Yu Zhang, Jiangtao Wen, Shiqiang Yang, "Optimized video coding for omnidirectional videos," IEEE International Conference on Multimedia and Expo (ICME), 2017.
[5] Wei-Tse Lee, Hsin-I Chen, Ming-Shiuan Chen et al., "High-resolution 360 Video Foveated Stitching for Real-time VR," Computer Graphics Forum, 2017, 36(7): 115-123.
[6] Nayoung Kim and Je-Won Kang, "Bi-directional deformable block-based motion estimation for frame rate up-conversion of 360-degree videos," Electronics Letters, 2017, 53(17): 1192-1194.
[7] Yiming Li, Jizheng Xu, Zhenzhong Chen, "Spherical domain rate-distortion optimization for 360-degree video coding," IEEE International Conference on Multimedia and Expo (ICME), 2017.
Disclosure of Invention
In order to solve the above technical problem, the present application provides a fast coding-unit partitioning and mode-decision method based on texture complexity for 360-degree video intra prediction in HEVC, comprising the following steps (an illustrative code sketch is given after the step list):
(1) initializing the current Coding Unit (CU), calculating the vertical texture complexity VMAD of the CU image, and comparing it against the vertical-texture thresholds Vα and Vβ: if the value is smaller than Vα, the current CU is not divided, the cost of not dividing it is calculated, and the process jumps to step (5); if the value is larger than Vβ, the current CU is divided and the process jumps to step (4); if the value lies between Vα and Vβ, continuing to step (2);
(2) calculating the horizontal texture complexity HMAD of the current CU and comparing it against the horizontal-texture thresholds Hα and Hβ: if the value is smaller than Hα, the current CU is not divided, the cost of not dividing it is calculated, and the process jumps to step (5); if the value is larger than Hβ, the current CU is divided and the process jumps to step (4);
(3) dividing the current CU, calculating the division cost, and proceeding to step (4);
(4) judging whether the current CU has reached the minimum size, and if not, jumping back to step (1);
(5) if the current CU has reached the minimum division size, or the preceding decision was not to divide it, selecting the optimal division mode of the current CU according to the calculated costs;
(6) judging the texture directionality for the prediction mode of the current CU by comparing the vertical and horizontal texture complexities: if the vertical texture complexity is larger than the horizontal texture complexity, the prediction direction is vertical and the process jumps to step (7); if the vertical texture complexity is smaller, the prediction direction is horizontal and the process jumps to step (8);
(7) calculating the rate-distortion costs of the representative vertical modes, taking the mode C2 with the minimum rate-distortion cost, and jumping to step (9);
(8) calculating the rate-distortion costs of the representative horizontal modes, taking the mode C2 with the minimum rate-distortion cost, and jumping to step (9);
(9) calculating the rate-distortion costs of the 2 modes on each side of the mode C2 obtained in step (7) or (8); the two modes with the minimum rate-distortion cost are the optimal candidate modes.
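Purely as an illustration of the control flow in steps (1) to (5), the following is a minimal C++ sketch. It is a sketch under stated assumptions: the Cu structure, the helper names and the three callbacks are illustrative and do not correspond to the HM reference-software API, and Va, Vb, Ha, Hb stand for the thresholds Vα, Vβ, Hα and Hβ.

```cpp
// Minimal sketch of the CU-partition decision flow in steps (1)-(5).
// All names and callbacks are illustrative assumptions, not the HM reference-software API.
#include <algorithm>
#include <cstdint>
#include <functional>

struct Cu {
    const uint8_t* pix; // top-left luma sample of the CU
    int stride;         // picture row stride
    int size;           // CU width == height (e.g. 64, 32, 16, 8)
};

// One of the four quadrants of a CU (index 0..3, row-major).
static Cu subCu(const Cu& cu, int idx) {
    const int half = cu.size / 2;
    return Cu{cu.pix + (idx >> 1) * half * cu.stride + (idx & 1) * half, cu.stride, half};
}

using Measure = std::function<double(const Cu&)>; // meanVMAD, meanHMAD or an RD cost

// Returns the best cost for this CU, recursing into sub-CUs only when needed.
double decideCu(const Cu& cu, double Va, double Vb, double Ha, double Hb, int minSize,
                const Measure& meanVMAD, const Measure& meanHMAD, const Measure& rdCostNoSplit) {
    if (cu.size <= minSize) return rdCostNoSplit(cu);        // step (4): minimum size reached
    bool forceSplit = false;
    const double v = meanVMAD(cu);                           // step (1)
    if (v < Va) return rdCostNoSplit(cu);                    // early termination of splitting
    if (v > Vb) forceSplit = true;                           // early split
    if (!forceSplit) {
        const double h = meanHMAD(cu);                       // step (2)
        if (h < Ha) return rdCostNoSplit(cu);
        if (h > Hb) forceSplit = true;
    }
    double splitCost = 0.0;                                  // step (3): cost of the 4 sub-CUs
    for (int k = 0; k < 4; ++k)
        splitCost += decideCu(subCu(cu, k), Va, Vb, Ha, Hb, minSize,
                              meanVMAD, meanHMAD, rdCostNoSplit);
    if (forceSplit) return splitCost;                        // current-level RD check skipped
    return std::min(rdCostNoSplit(cu), splitCost);           // step (5): usual RD comparison
}
```

The recursion mirrors the quadtree: early termination returns the unsplit cost without recursing, an early split skips the current-level RD check, and the in-between case falls back to the standard comparison.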
In one embodiment, step (1) further comprises:
all pixel points of each column of the current CU are traversed vertically, the vertical texture complexity VMAD of each column is calculated, the average vertical texture complexity meanVMAD of the current CU is obtained, and meanVMAD is compared with the two vertical-texture thresholds Vα and Vβ.
In one embodiment, step (2) further comprises:
all pixel points of each row of the current CU are traversed horizontally, the horizontal texture complexity HMAD of each row is calculated, the average horizontal texture complexity meanHMAD of the current CU is obtained, and meanHMAD is compared with the two horizontal-texture thresholds Hα and Hβ.
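A compact C++ reading of the column-wise and row-wise traversal described in the two embodiments above is sketched below. It assumes that the per-column (per-row) complexity is the mean absolute deviation of the samples from the column (row) mean; the patent's exact formulas are given as image equations later in the description, so this is an illustrative interpretation rather than a verbatim implementation.

```cpp
// Illustrative interpretation of meanVMAD/meanHMAD for an N x N CU
// (MAD taken here as mean absolute deviation from the column or row mean).
#include <cmath>
#include <cstdint>

// Average per-column MAD; pix points at the top-left sample, stride is the picture row stride.
double meanVMAD(const uint8_t* pix, int stride, int n) {
    double total = 0.0;
    for (int j = 0; j < n; ++j) {
        double mean = 0.0;
        for (int i = 0; i < n; ++i) mean += pix[i * stride + j];
        mean /= n;
        double mad = 0.0;
        for (int i = 0; i < n; ++i) mad += std::fabs(pix[i * stride + j] - mean);
        total += mad / n;
    }
    return total / n;
}

// Row analogue: average per-row MAD (meanHMAD).
double meanHMAD(const uint8_t* pix, int stride, int n) {
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        double mean = 0.0;
        for (int j = 0; j < n; ++j) mean += pix[i * stride + j];
        mean /= n;
        double mad = 0.0;
        for (int j = 0; j < n; ++j) mad += std::fabs(pix[i * stride + j] - mean);
        total += mad / n;
    }
    return total / n;
}
```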
In one embodiment, step (6) further comprises:
the values of the vertical and horizontal texture complexity are taken as the average texture complexities calculated in steps (1) and (2), and the two values are compared, wherein the mode corresponding to the lower complexity is the better mode.
In one embodiment, step (7) further comprises:
the representative modes among the vertical modes are 18, 22, 26, 30 and 34; the rate-distortion costs of these 5 modes are calculated, the Planar and DC modes are also evaluated, and the mode with the minimum rate-distortion cost is selected.
In one embodiment, step (8) further comprises:
the representative modes among the horizontal modes are 2, 6, 10, 14 and 18; the rate-distortion costs of these 5 modes are calculated, the Planar and DC modes are also evaluated, and the mode with the minimum rate-distortion cost is selected.
In one embodiment, step (9) further comprises:
if the Planar and DC modes have the minimum rate-distortion costs, these two modes are the optimal candidate modes. Otherwise, the rate-distortion costs of the 4 modes adjacent to the mode C2 with the smallest rate-distortion cost from step (7) or (8), namely C2-2, C2-1, C2+1 and C2+2, are calculated, and the two modes with the smallest rate-distortion cost are the optimal candidate modes.
In one embodiment, the two thresholds of each type used to decide the current CU partition are set by statistics: the partition results of the first video frame of the 360-degree video, the image texture complexity and related data are sampled and analysed to obtain the optimal thresholds for CU partitioning.
According to yet another aspect of the present invention, an encoder for performing HEVC is presented, which is operable to enable fast decision-making for texture feature based 360 degree video intra prediction in HEVC, comprising:
logic (1): initializing a current Coding Unit (CU), calculating the image vertical texture complexity VMAD of the CU, judging the CU by using threshold values (V alpha and V beta) related to the vertical texture complexity, if the calculated value is smaller than V alpha, the current CU is not divided, calculating the cost that the current CU is not divided, and jumping to the step (5), if the calculated value is larger than V beta, the current CU is divided, jumping to the step (4), and if the calculated value is between V alpha and V beta, continuing to the step (2);
logic (2): calculating the result of the current CU horizontal texture complexity HMAD and judging the CU by using threshold values (H alpha and H beta) related to the horizontal texture complexity, if the calculated value is smaller than H alpha, the current CU is not divided, calculating the cost of the current CU which is not divided and skipping to the step (5), and if the calculated value is larger than H beta, the current CU is divided and skipping to the step (4);
logic (3): dividing the current CU, calculating the division cost, and proceeding to the step (4);
logic (4): judging whether the size of the current CU is the minimum size or not, and if not, jumping back to the step (1);
logic (5): and if the size of the current CU is the minimum division size or the current CU is not divided according to the previous judgment result, selecting the optimal division mode of the current CU according to the calculated cost.
Logic (6): judging the texture directionality of the prediction mode of the current CU, comparing the values of the vertical texture complexity and the horizontal texture complexity, if the vertical texture complexity is larger than the horizontal texture complexity, the prediction mode is a vertical mode, and jumping to the step (7), and if the vertical texture complexity is smaller than the horizontal texture complexity, the prediction mode is a horizontal mode, and jumping to the step (8);
logic (7): calculating the rate distortion cost of the representative mode in the vertical mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (9);
logic (8): calculating the rate distortion cost of the representative mode in the horizontal mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (9);
logic (9): and (5) calculating the rate distortion cost of each of the left and right 2 modes of the mode C2 with the minimum rate distortion cost in the step (7) or (8), wherein the two modes with the minimum rate distortion cost are the optimal candidate modes.
According to yet another aspect of the present invention, a hardware video codec for performing HEVC is presented, which is operable to enable fast decision-making for texture feature based 360 degree video intra prediction in HEVC, comprising:
circuit block (1): initializing a current Coding Unit (CU), calculating the image vertical texture complexity VMAD of the CU, judging the CU by using threshold values (V alpha and V beta) related to the vertical texture complexity, if the calculated value is smaller than V alpha, the current CU is not divided, calculating the cost that the current CU is not divided, and jumping to the step (5), if the calculated value is larger than V beta, the current CU is divided, jumping to the step (4), and if the calculated value is between V alpha and V beta, continuing to the step (2);
circuit block (2): calculating the result of the current CU horizontal texture complexity HMAD and judging the CU by using threshold values (H alpha and H beta) related to the horizontal texture complexity, if the calculated value is smaller than H alpha, the current CU is not divided, calculating the cost of the current CU which is not divided and jumping to the step (5), and if the calculated value is larger than H beta, the current CU is divided and jumping to the step (4);
circuit block (3): dividing the current CU, calculating the division cost, and proceeding to the step (4);
circuit block (4): judging whether the size of the current CU is the minimum size or not, and if not, jumping back to the step (1);
circuit block (5): and if the size of the current CU is the minimum division size or the current CU is not divided according to the previous judgment result, selecting the optimal division mode of the current CU according to the calculated cost.
Circuit block (6): judging the texture directionality of the prediction mode of the current CU, comparing the values of the vertical texture complexity and the horizontal texture complexity, if the vertical texture complexity is larger than the horizontal texture complexity, the prediction mode is a vertical mode, and jumping to the step (7), and if the vertical texture complexity is smaller than the horizontal texture complexity, the prediction mode is a horizontal mode, and jumping to the step (8);
circuit block (7): calculating the rate distortion cost of the representative mode in the vertical mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (9);
circuit block (8): calculating the rate distortion cost of the representative mode in the horizontal mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (9);
circuit block (9): and (4) calculating the rate distortion cost of each 2 modes at the left and right of the mode C2 with the minimum rate distortion cost in the step (7) or (8), wherein the two modes with the minimum rate distortion cost are the optimal candidate modes.
According to still another aspect of the present invention, an apparatus for fast decision making based on texture feature 360 degree video intra prediction in HEVC is provided, including:
unit (1): initializing a current Coding Unit (CU), calculating the image vertical texture complexity VMAD of the CU, judging the CU by using threshold values (V alpha and V beta) related to the vertical texture complexity, if the calculated value is smaller than V alpha, the current CU is not divided, calculating the cost that the current CU is not divided, and jumping to the step (5), if the calculated value is larger than V beta, the current CU is divided, jumping to the step (4), and if the calculated value is between V alpha and V beta, continuing to the step (2);
unit (2): calculating the result of the current CU horizontal texture complexity HMAD and judging the CU by using threshold values (H alpha and H beta) related to the horizontal texture complexity, if the calculated value is smaller than H alpha, the current CU is not divided, calculating the cost of the current CU which is not divided and jumping to the step (5), and if the calculated value is larger than H beta, the current CU is divided and jumping to the step (4);
unit (3): dividing the current CU, calculating the division cost, and proceeding to the step (4);
unit (4): judging whether the size of the current CU is the minimum size or not, and if not, jumping back to the step (1);
unit (5): and if the size of the current CU is the minimum division size or the current CU is not divided according to the previous judgment result, selecting the optimal division mode of the current CU according to the calculated cost.
Unit (6): judging the texture directionality of the prediction mode of the current CU, comparing the values of the vertical texture complexity and the horizontal texture complexity, if the vertical texture complexity is larger than the horizontal texture complexity, the prediction mode is a vertical mode, and jumping to the step (7), and if the vertical texture complexity is smaller than the horizontal texture complexity, the prediction mode is a horizontal mode, and jumping to the step (8);
unit (7): calculating the rate distortion cost of the representative mode in the vertical mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (9);
unit (8): calculating the rate distortion cost of the representative mode in the horizontal mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (9);
unit (9): and (4) calculating the rate distortion cost of each 2 modes at the left and right of the mode C2 with the minimum rate distortion cost in the step (7) or (8), wherein the two modes with the minimum rate distortion cost are the optimal candidate modes.
According to yet another aspect of the invention, a computer-readable medium is provided having computer-readable instructions stored thereon which, when executed, implement the respective method as described above.
According to yet another aspect of the invention, there is provided an apparatus comprising:
an input unit for receiving an original video frame to be subjected to HEVC coding;
a memory for storing the received original video frame and the encoded video frame;
one or more processors configured to implement the respective methods as described above.
Drawings
Fig. 1 illustrates one embodiment of an encoder block diagram of HEVC;
FIG. 2 illustrates an exemplary CU and CTU partitioning;
fig. 3 shows a 360 degree HEVC-based video encoder framework;
fig. 4 shows an expansion method of the ERP projection format.
Fig. 5 shows a high level flow diagram of the inventive coding scheme for CU partitioning.
Fig. 6 shows a high-level flow chart of the inventive mode-selection coding scheme.
Fig. 7 shows the division of vertical and horizontal modes of the 35 candidate prediction modes.
Fig. 8 shows the CU partition comparison on the first frame of the "DrivingInCity" and "ChairliftRide" sequences between HEVC incorporating the present invention and the standard mode;
FIG. 9 shows a codec for implementing the method of the invention, according to an embodiment of the invention;
figure 10 shows an apparatus for practicing the invention, according to one embodiment of the invention.
Detailed Description
Various aspects are now described with reference to the drawings. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more aspects. It may be evident, however, that such aspect(s) may be practiced without these specific details.
As used in this application, the terms "component," "module," "system," and the like are intended to refer to a computer-related entity, such as but not limited to hardware, firmware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to: a process running on a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component can be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets, e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the internet with other systems by way of the signal.
Fig. 1 shows a general block diagram of a video encoder implementing High Efficiency Video Coding (HEVC). The encoder architecture of HEVC is substantially the same as that of H.264; the standardization effort mainly focused on further researching and improving the algorithms used in each module, especially for high-resolution video sequences, with the goal of reducing the bit rate to 50% of that of the H.264 standard at the same video quality (PSNR).
Since the encoder architecture of HEVC is substantially the same as that of H.264, the overall architecture of fig. 1 is not described in detail in this application, so as not to obscure the present invention.
In HEVC, the input video is first partitioned into blocks called Coding Tree Units (CTUs). As will be understood by those skilled in the art, the CTU corresponds to the macroblock concept of earlier standards. A Coding Unit (CU) is a square block of pixels associated with one prediction mode (intra, inter or skip). The partitioning of CTUs into CUs and prediction units is shown in fig. 2.
The HEVC-based 360-degree video coding framework mainly comprises projection format conversion before encoding, the encoder/decoder, and format conversion after decoding. The encoding process is shown in fig. 3.
In 360-degree video coding, the spherical video is projected into a two-dimensional rectangular format. In the ERP projection format, which is the most popular format in 360-degree video coding, content located near the two poles is greatly stretched, which changes the texture characteristics of those areas. We performed a comprehensive statistical analysis of 360-degree video sequences in ERP format. As shown in fig. 3, the blocks in the upper and lower portions of these sequences are typically the largest and the texture there is typically uniform, so these portions can be encoded with large blocks. The content in the middle of these sequences is relatively complex and requires small blocks. Fig. 4 shows the expansion method of the ERP projection format.
A. Summary of the solution
In HEVC, the encoder decides whether to use one large CU or to split it into smaller CUs according to the rate-distortion (RD) criterion. The quadtree-based CU partitioning structure can flexibly adapt to the various texture structures of an image. However, finding the optimal CU partition requires a great deal of computation, because the encoder needs to check the RD cost of every CU size to find the optimal partition, and most of the encoding time is spent on this large number of RD checks. If the CU partitioning could be known in advance, a great deal of coding time could be saved.
In intra prediction, each CU has 35 candidate prediction modes. For each LCU, Rough Mode Decision (RMD) selects 3 candidate modes for 64×64, 32×32 and 16×16 blocks and 8 candidate modes for 8×8 and 4×4 blocks. These candidates are combined with the most probable modes (MPMs) of the neighboring blocks to form the best-mode candidate list of the PU. For each mode in the candidate list, the optimal mode, the Best Mode (BM), is selected by Rate-Distortion Optimization (RDO). Optimal mode selection follows a top-down quadtree procedure: transform and quantization are performed in a top-down traversal to obtain the mode with the minimum RD cost. After the current PU and its four sub-PUs complete prediction, the CU is encoded through a bottom-up quadtree procedure: the RD cost of the current CU is compared with the sum of the RD costs of its four sub-CUs, and the alternative with the smaller RD cost is kept as the locally optimal partition. More specifically, after the four sub-CUs complete prediction, they are compared with the current CU to determine the next-level optimal partition, and this is performed bottom-up. For example, after the intra prediction of the four 4×4 blocks of the current 8×8 block is completed, their total RD cost is compared with that of the 8×8 block to decide whether the 8×8 block is divided, which gives the locally optimal partition. The 16×16 block is then compared with its four 8×8 blocks, and so on, until the 64×64 block is compared with its four 32×32 blocks. Proceeding bottom-up from the maximum depth, the better alternative is saved at each comparison, until depth 0 is reached and the optimal partitioning of the LCU is obtained.
According to the invention, early CU partitioning acts, after RMD, on the top-down RD-cost calculations: partitioning in advance saves the time spent traversing large blocks. Early termination of CU partitioning acts on the bottom-up RD-cost comparisons used to obtain the optimal CU: terminating the partitioning early saves the time spent on small blocks. In other words, early partitioning skips the prediction of the current block, while early termination stops the CU from being divided further. If a CU is very likely to be partitioned, all RD-cost calculations of the current intra prediction are skipped and the CU is partitioned in advance. Conversely, if a CU can terminate partitioning early, it is not divided.
In the RMD stage, the present invention improves coding efficiency by classifying the 35 candidate modes and reducing their number, selecting 2 optimal candidate modes for each block size (64×64, 32×32, 16×16, 8×8 and 4×4).
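For reference, the baseline flow just described can be outlined in a few lines of C++; the cost functions below are placeholders and the helper names are assumptions, with only the candidate counts (8 for 4×4 and 8×8 PUs, 3 for larger PUs) taken from the description above.

```cpp
// Simplified outline of the baseline intra mode decision: RMD keeps 3 or 8 rough
// candidates, the neighbours' MPMs are merged in, and RDO picks the best mode.
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Placeholder rough (e.g. SATD-style) and full RDO costs for a given mode and PU size.
static double roughCost(int mode, int puSize) { return std::abs(mode - 26) + 0.01 * puSize; }
static double fullRdCost(int mode, int puSize) { return roughCost(mode, puSize); }

static std::vector<int> rmdCandidates(int puSize) {
    const std::size_t keep = (puSize <= 8) ? 8 : 3;   // 8 for 4x4/8x8, 3 for 16x16/32x32/64x64
    std::vector<std::pair<double, int>> scored;
    for (int m = 0; m < 35; ++m) scored.push_back({roughCost(m, puSize), m});
    std::sort(scored.begin(), scored.end());
    std::vector<int> out;
    for (std::size_t i = 0; i < keep; ++i) out.push_back(scored[i].second);
    return out;
}

int bestIntraMode(int puSize, const std::vector<int>& neighbourMpms) {
    std::vector<int> cand = rmdCandidates(puSize);
    for (int m : neighbourMpms)                        // merge MPMs without duplicates
        if (std::find(cand.begin(), cand.end(), m) == cand.end()) cand.push_back(m);
    int best = cand.front();
    double bestCost = fullRdCost(best, puSize);
    for (int m : cand) {                               // full RDO over the candidate list
        const double c = fullRdCost(m, puSize);
        if (c < bestCost) { bestCost = c; best = m; }
    }
    return best;
}
```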
B. Fast decision scheme of the present application
In order to reduce the computational complexity, a 360-degree video intra-frame prediction fast decision method based on texture features is provided.
The basic concept of the method is as follows: the texture features are utilized to improve the efficiency of 360-degree video intra-frame prediction coding in the ERP format. In one aspect, the texture complexity is pre-computed to determine the depth of the predicted CU block in the current CU partitioning process, based on the complexity of the texture features. If the texture complexity of the current block is high, we should divide it into small blocks to balance compression and image quality, and directly skip the prediction of the block and directly make the prediction of the sub-block portion of the current block (i.e. divide in advance), thus saving the time for predicting the large block. If the complexity of the texture of the current block is low, we skip the calculation of RDO and decide directly that no further partitioning is needed, we do not recurse any more, i.e. terminate the partitioning, saving time to predict the sub-blocks of the current block (i.e. terminate the partitioning early). Fig. 5 shows a high level flow diagram of the inventive coding scheme for CU partitioning.
The calculation formula for the vertical and horizontal texture complexity is as follows:
[Equation image GSB0000198710170000111 in the original publication]
the calculation formula for the average vertical and horizontal texture complexity is as follows:
[Equation images GSB0000198710170000112 and GSB0000198710170000113 in the original publication]
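The two formulas above appear only as image equations in the original publication. Based on the per-column and per-row descriptions of VMAD and HMAD in the embodiments, one plausible formulation (an assumption, not a transcription of the images) is:

```latex
% Assumed MAD-style texture-complexity measures for an N x N CU with luma samples p(i,j)
\mathrm{VMAD}_j = \frac{1}{N}\sum_{i=1}^{N}\bigl|p(i,j)-\bar{p}^{\mathrm{col}}_j\bigr|,
\qquad
\mathrm{HMAD}_i = \frac{1}{N}\sum_{j=1}^{N}\bigl|p(i,j)-\bar{p}^{\mathrm{row}}_i\bigr|

\mathrm{meanVMAD} = \frac{1}{N}\sum_{j=1}^{N}\mathrm{VMAD}_j,
\qquad
\mathrm{meanHMAD} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{HMAD}_i
```

where p(i,j) is the luma sample in row i and column j of the N×N CU, and the barred terms are the mean of column j and of row i, respectively.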
Taking the horizontal texture complexity as an example:
1) When meanHMAD < Hα, i.e. the texture of the CU block is simple, the further RD-cost comparison between the CU block and its four sub-blocks is skipped, and the current depth is directly taken as the depth of the optimal CU.
2) When meanHMAD > Hβ, i.e. the texture of the CU block is complex, the further RD-cost comparison between the CU block and its four sub-blocks is skipped, the current depth is directly judged to be non-optimal, and further partitioning is required.
3) When Hα < meanHMAD < Hβ, the RD cost is calculated as usual to decide whether to continue partitioning.
On the other hand, the directionality of the texture features is used to determine the direction of the candidate modes in the mode decision process. In the prediction-mode decision algorithm, new decisions are added to the RMD and MPM process, before RDO, based on the vertical and horizontal texture directionality calculated in the previous step. Fig. 7 shows the division of the 35 candidate prediction modes into vertical and horizontal modes. The angular candidate modes are first reduced from 33 to 17, and the 17 modes are then divided into 5 groups to further determine which group is likely to contain the best mode. Finally, the Planar and DC modes are added, the two candidate modes with the smallest RD costs are determined as the optimal candidates, and they are added to the MPM list. In contrast, the standard 8/3-candidate selection out of 35 modes must still traverse all 35 modes to determine the optimal mode. The present invention halves the candidate modes directly by determining the texture directionality and then further screens the remaining half to determine the best mode. Fig. 6 shows a high-level flow chart of the inventive mode-selection coding scheme.
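The mode-reduction flow of steps (6) to (9) can be sketched as follows; the rdCost callback is a placeholder, the handling of the case where Planar or DC wins is one possible reading of the description, and only the HEVC mode numbering (0 = Planar, 1 = DC, 2-34 = angular) and the representative-mode lists are taken from the text.

```cpp
// Sketch of the texture-directed candidate-mode reduction (steps (6)-(9)).
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

std::vector<int> reducedCandidates(double meanVMAD, double meanHMAD,
                                   const std::function<double(int)>& rdCost) {
    // Step (6): choose the directional group from the texture complexities.
    const bool vertical = meanVMAD > meanHMAD;
    const std::vector<int> reps = vertical ? std::vector<int>{18, 22, 26, 30, 34}   // step (7)
                                           : std::vector<int>{2, 6, 10, 14, 18};    // step (8)
    // Evaluate the representative modes plus Planar (0) and DC (1).
    std::vector<std::pair<double, int>> scored;
    for (int m : reps) scored.push_back({rdCost(m), m});
    scored.push_back({rdCost(0), 0});
    scored.push_back({rdCost(1), 1});
    std::sort(scored.begin(), scored.end());
    const int c2 = scored.front().second;            // best evaluated mode, "C2"
    if (c2 == 0 || c2 == 1)                          // Planar/DC already best: keep the two best
        return {scored[0].second, scored[1].second};
    // Step (9): refine around C2 with its four neighbouring angular modes.
    std::vector<std::pair<double, int>> refine = {{scored.front().first, c2}};
    for (int d : {-2, -1, 1, 2}) {
        const int m = c2 + d;
        if (m >= 2 && m <= 34) refine.push_back({rdCost(m), m});
    }
    std::sort(refine.begin(), refine.end());
    return {refine[0].second, refine[1].second};     // the two optimal candidate modes
}
```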
Another aspect of this decision is to set the thresholds for early partitioning and early termination of partitioning by statistics.
Another aspect of this decision is that the threshold boundary values vary with the quantization parameter (QP) and the CU depth.
Another aspect of this decision is that texture directionality is compared using average horizontal and vertical texture complexity computed in CU partitioning.
Fig. 8 shows the CU partition comparison on the first frame of the "DrivingInCity" and "ChairliftRide" sequences between HEVC incorporating the present invention and the standard mode. The partitioning of a CU can be divided into several most likely cases according to the texture features of the current frame, and the process is accelerated by judging the image complexity. The present invention provides a simple way to perform fast partitioning by judging the texture complexity of the current CU from the texture feature values calculated for it.
An HEVC-based video encoder for performing the above method is shown in fig. 9. It comprises nine logic blocks (1)-(9) in one-to-one correspondence with the steps of the method described above. In one embodiment, the video encoder may be an HEVC-based hardware video codec (e.g., a separate chip or a region of circuit logic), in which the nine logic blocks (1)-(9) correspond to nine circuit blocks with the same functionality. In another embodiment, the video encoder may be a software program executing on a processor, in which the nine logic blocks (1)-(9) correspond to program modules and/or units with the same functionality.
FIG. 10 shows an apparatus for practicing the invention according to one embodiment of the invention, comprising: an input unit for receiving an original video frame to be subjected to HEVC coding; a memory for storing the received original video frame and the encoded video frame; and one or more processors configured to implement the respective methods described above.
In summary, the present invention provides a fast decision method for 360-degree video intra prediction based on texture features in High Efficiency Video Coding (HEVC). More specifically, the present invention proposes improved CU partition prediction and mode selection prediction decisions in intra prediction.
Fig. 5 and 6 show fast decision-making methods for texture feature complexity based 360-degree video intra prediction in HEVC according to one embodiment of the invention.
(1) Initializing the current Coding Unit (CU), calculating the vertical texture complexity VMAD of the CU image, and comparing it against the vertical-texture thresholds Vα and Vβ: if the value is smaller than Vα, the current CU is not divided, the cost of not dividing it is calculated, and the process jumps to step (5); if the value is larger than Vβ, the current CU is divided and the process jumps to step (4); if the value lies between Vα and Vβ, continuing to step (2);
(2) Calculating the horizontal texture complexity HMAD of the current CU and comparing it against the horizontal-texture thresholds Hα and Hβ: if the value is smaller than Hα, the current CU is not divided, the cost of not dividing it is calculated, and the process jumps to step (5); if the value is larger than Hβ, the current CU is divided and the process jumps to step (4);
(3) Dividing the current CU, calculating the division cost, and proceeding to step (4);
(4) Judging whether the current CU has reached the minimum size, and if not, jumping back to step (1);
(5) If the current CU has reached the minimum division size, or the preceding decision was not to divide it, selecting the optimal division mode of the current CU according to the calculated costs;
(6) Judging the texture directionality for the prediction mode of the current CU by comparing the vertical and horizontal texture complexities: if the vertical texture complexity is larger than the horizontal texture complexity, the prediction direction is vertical and the process jumps to step (7); if the vertical texture complexity is smaller, the prediction direction is horizontal and the process jumps to step (8);
(7) Calculating the rate-distortion costs of the representative vertical modes, taking the mode C2 with the minimum rate-distortion cost, and jumping to step (9);
(8) Calculating the rate-distortion costs of the representative horizontal modes, taking the mode C2 with the minimum rate-distortion cost, and jumping to step (9);
(9) Calculating the rate-distortion costs of the 2 modes on each side of the mode C2 obtained in step (7) or (8); the two modes with the minimum rate-distortion cost are the optimal candidate modes.
In step (1), a CU with the largest size, for example, 64 × 64 in HEVC, is initially selected as the current CU.
On the other hand, the invention also provides a device for quickly deciding the 360-degree video intra-frame prediction based on the texture features in High Efficiency Video Coding (HEVC).
The invention also provides a video codec for realizing the method or the device.
The invention also proposes a corresponding device claim.
The invention also proposes a computer program product comprising instructions which, when executed by a processor, perform the above-mentioned method.
It should be understood by those skilled in the art that although the present invention is described with respect to HEVC, it may be applied to any video coding technique after HEVC that employs CU-based block partitioning.
The disclosed methods may be implemented in software, hardware, firmware, etc.
When implemented in hardware, the video encoder may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. Additionally, at least one processor may comprise one or more modules operable to perform one or more of the steps and/or operations described above.
When the video encoder is implemented in hardware circuitry, such as an ASIC, FPGA, or the like, it may include various circuit blocks configured to perform various functions. Those skilled in the art can design and implement these circuits in various ways to achieve the various functions disclosed herein, depending on various constraints imposed on the overall system.
While the foregoing disclosure discusses illustrative aspects and/or embodiments, it should be noted that many changes and modifications could be made herein without departing from the scope of the described aspects and/or embodiments as defined by the appended claims. Furthermore, although elements of the described aspects and/or embodiments may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated. Additionally, all or a portion of any aspect and/or embodiment may be utilized with all or a portion of any other aspect and/or embodiment, unless stated to the contrary.

Claims (11)

1. A method for fast decision making of texture feature based 360-degree video intra prediction in HEVC comprises the following steps:
(1) initializing a current coding unit CU, calculating the image vertical texture complexity VMAD of the CU, judging the CU by using threshold values V alpha and V beta related to the vertical texture complexity, if the calculated value is smaller than V alpha, the current CU is not divided, calculating the cost that the current CU is not divided, and jumping to the step (4), if the calculated value is larger than V beta, the current CU is divided, jumping to the step (3), and if the calculated value is between V alpha and V beta, continuing to the step (2);
(2) calculating the result of the current CU horizontal texture complexity HMAD and judging the CU by using thresholds H alpha and H beta related to the horizontal texture complexity, if the calculated value is smaller than H alpha, the current CU is not divided, calculating the cost of the current CU not divided and jumping to the step (4), if the calculated value is larger than H beta, the current CU is divided and jumping to the step (3), if the calculated value is between H alpha and H beta, calculating the division cost to determine whether to continue further division, and if the current CU does not need further division, calculating the cost of the current CU not divided and jumping to the step (4);
(3) judging whether the size of the current CU is the minimum size or not, and if not, jumping back to the step (1);
(4) if the size of the current CU is the minimum division size or the previous judgment result is that the current CU is not divided, selecting the optimal division mode of the current CU according to the calculated cost;
(5) judging the texture directionality of the prediction mode of the current CU, comparing the values of the vertical texture complexity and the horizontal texture complexity, if the vertical texture complexity is larger than the horizontal texture complexity, the prediction mode is a vertical mode, and jumping to the step (6), and if the vertical texture complexity is smaller than the horizontal texture complexity, the prediction mode is a horizontal mode, and jumping to the step (7);
(6) calculating the rate distortion cost of the representative mode in the vertical mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (8);
(7) calculating the rate distortion cost of the representative mode in the horizontal modes, taking the mode C2 with the minimum rate distortion cost, and jumping to the step (8);
(8) and (4) calculating the rate distortion cost of the adjacent 4 modes of the mode C2 with the minimum rate distortion cost in the step (6) or (7), wherein the two modes with the minimum rate distortion cost are the optimal candidate modes.
2. The method of claim 1, step (1) further comprising:
and vertically traversing all pixel points of each column of the current CU, calculating the vertical texture complexity VMAD of each column, solving the average vertical texture complexity meanVMAD of the current CU, and comparing with thresholds V alpha and V beta related to the vertical texture complexity.
3. The method of claim 1, step (2) further comprising:
and traversing all pixel points of each line of the current CU horizontally, calculating the horizontal texture complexity HMAD of each line, solving the average horizontal texture complexity meanHMAD of the current CU, and comparing with thresholds H alpha and H beta related to the horizontal texture complexity.
4. The method of claim 1, step (5) further comprising:
and (3) comparing the average texture complexity obtained by calculating in the step (1) and the step (2) by using the values of the vertical texture complexity and the horizontal texture complexity, wherein the mode with lower complexity is a better mode.
5. The method of claim 1, step (6) further comprising:
the representative modes in the vertical modes are 18, 22, 26, 30 and 34, the rate distortion cost of the 5 modes is calculated, the Planar and the DC mode are calculated, and the mode with the minimum rate distortion cost is selected.
6. The method of claim 1, step (7) further comprising:
the representative modes in the horizontal modes are 2, 6, 10, 14 and 18, the rate distortion cost of the 5 modes is calculated, the Planar and the DC mode are calculated, and the mode with the minimum rate distortion cost is selected.
7. The method of claim 1, step (8) further comprising:
if the rate-distortion cost of the Planar and the DC mode is minimum, the two modes are optimal candidate modes; otherwise, the rate-distortion costs of the 4 modes adjacent to the mode C2 with the smallest rate-distortion cost in step (6) or (7), namely C2-2, C2-1, C2+1 and C2+2, are calculated, and the optimal candidate modes are the two modes whose rate-distortion cost is the smallest.
8. The method as claimed in claim 1, wherein two thresholds related to the complexity of the vertical texture and the complexity of the horizontal texture for determining the current CU partition are set by statistics, and the partition result of the first frame video image of the 360-degree video, the complexity of the image texture and related data are counted in a sampling manner to obtain the optimal threshold for the CU partition.
9. A hardware video codec for performing HEVC operable to enable texture feature based 360 degree video intra prediction fast decision in HEVC, comprising:
a circuit block 1: initializing a current coding unit CU, calculating the image vertical texture complexity VMAD of the CU, judging the CU by using thresholds V alpha and V beta related to the vertical texture complexity, if the calculated value is smaller than V alpha, the current CU is not divided, calculating the cost that the current CU is not divided, and jumping to a circuit block 4, if the calculated value is larger than V beta, the current CU is divided, jumping to a circuit block 3, and if the calculated value is between V alpha and V beta, continuing to the circuit block 2;
the circuit block 2: calculating the result of the current CU horizontal texture complexity HMAD and judging the CU by using thresholds H alpha and H beta related to the horizontal texture complexity, if the calculated value is smaller than H alpha, the current CU is not divided, calculating the cost of the current CU not divided and jumping to a circuit block 4, if the calculated value is larger than H beta, the current CU is divided and jumping to a circuit block 3, if the calculated value is between H alpha and H beta, calculating the division cost to determine whether to continue further division, and if the current CU does not need further division, calculating the cost of the current CU not divided and jumping to the circuit block 4;
a circuit block 3: judging whether the size of the current CU is the minimum size or not, and if not, skipping back to the circuit block 1;
the circuit block 4: if the size of the current CU is the minimum division size or the previous judgment result is that the current CU is not divided, selecting the optimal division mode of the current CU according to the calculated cost;
the circuit block 5: judging texture directionality of a prediction mode of the current CU, comparing values of vertical texture complexity and horizontal texture complexity, if the vertical texture complexity is larger than the horizontal texture complexity, the prediction mode is a vertical mode and jumps to a circuit block 6, and if the vertical texture complexity is smaller than the horizontal texture complexity, the prediction mode is a horizontal mode and jumps to a circuit block 7;
the circuit block 6: calculating the rate distortion cost of the representative mode in the vertical mode, selecting the mode C2 with the minimum rate distortion cost, and jumping to the circuit block 8;
the circuit block 7: calculating the rate distortion cost of the representative mode in the horizontal mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the circuit block 8;
the circuit block 8: rate-distortion costs of the neighboring 4 modes of the mode C2 in which the rate-distortion cost is the smallest in the circuit block 6 or 7 are calculated, and the two modes in which the rate-distortion cost is the smallest are the optimal candidate modes.
10. A texture feature based 360-degree video intra prediction fast decision device in HEVC, comprising:
unit 1: initializing a current coding unit CU, calculating the image vertical texture complexity VMAD of the CU, judging the CU by using threshold values V alpha and V beta related to the vertical texture complexity, if the calculated value is smaller than V alpha, the current CU is not divided, calculating the cost that the current CU is not divided, and jumping to a unit 4, if the calculated value is larger than V beta, the current CU is divided, jumping to a unit 3, and if the calculated value is between V alpha and V beta, continuing to a unit 2;
unit 2: calculating the result of the current CU horizontal texture complexity HMAD and judging the CU by using thresholds H alpha and H beta related to the horizontal texture complexity, if the calculated value is smaller than H alpha, the current CU is not divided, calculating the cost of the current CU not divided and jumping to a unit 4, if the calculated value is larger than H beta, the current CU is divided and jumping to a unit 3, if the calculated value is between H alpha and H beta, calculating the division cost to determine whether to continue further division, and if the current CU does not need further division, calculating the cost of the current CU not divided and jumping to the unit 4;
unit 3: judging whether the size of the current CU is the minimum size or not, and if not, skipping back to the unit 1;
unit 4: if the size of the current CU is the minimum partition size or the current CU is not partitioned according to the previous judgment result, selecting the optimal partition mode of the current CU according to the calculated cost;
unit 5: judging texture directionality of a prediction mode of the current CU, comparing values of vertical texture complexity and horizontal texture complexity, if the vertical texture complexity is larger than the horizontal texture complexity, the prediction mode is a vertical mode, and jumping to a unit 6, and if the vertical texture complexity is smaller than the horizontal texture complexity, the prediction mode is a horizontal mode, and jumping to a unit 7;
a unit 6: calculating the rate distortion cost of the representative mode in the vertical mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the unit 8;
a unit 7: calculating the rate distortion cost of the representative mode in the horizontal mode, taking the mode C2 with the minimum rate distortion cost, and jumping to the unit 8;
unit 8: rate-distortion costs of the neighboring 4 modes of the mode C2 with the minimum rate-distortion cost in the calculation unit 6 or 7 are calculated, and the two modes with the minimum rate-distortion costs are the optimal candidate modes.
11. A computer readable medium containing program code which, when executed by a processor, performs the method of any of claims 1-8.
CN201811059097.8A 2018-09-06 2018-09-06 360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium Active CN109068142B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811059097.8A CN109068142B (en) 2018-09-06 2018-09-06 360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811059097.8A CN109068142B (en) 2018-09-06 2018-09-06 360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium

Publications (2)

Publication Number Publication Date
CN109068142A CN109068142A (en) 2018-12-21
CN109068142B true CN109068142B (en) 2022-08-16

Family

ID=64760312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811059097.8A Active CN109068142B (en) 2018-09-06 2018-09-06 360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium

Country Status (1)

Country Link
CN (1) CN109068142B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109714584A (en) * 2019-01-11 2019-05-03 杭州电子科技大学 3D-HEVC depth map encoding unit high-speed decision method based on deep learning
US11012710B2 (en) * 2019-03-06 2021-05-18 Tencent America LLC Techniques for intra prediction for 360 image and video coding
CN110730343B (en) * 2019-09-20 2021-12-07 中山大学 Method, system and storage medium for dividing multifunctional video coding frames
CN110958454B (en) * 2019-12-17 2020-12-01 湖南长城银河科技有限公司 Intra-frame prediction method, system and computer readable storage medium
CN111683245B (en) * 2020-06-23 2022-07-22 北京工业职业技术学院 Texture similarity based CU partition decision
CN111918058B (en) * 2020-07-02 2022-10-28 北京大学深圳研究生院 Hardware-friendly intra prediction mode fast determination method, device and storage medium
CN113242429B (en) * 2021-05-11 2023-12-05 杭州网易智企科技有限公司 Video coding mode decision method, device, equipment and storage medium
CN113596444B (en) * 2021-05-19 2022-07-19 西安邮电大学 360-degree video intra-frame prediction mode decision method
CN114040211B (en) * 2021-10-27 2024-10-11 中山大学 AVS 3-based intra-frame prediction fast decision method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102801976A (en) * 2012-08-03 2012-11-28 山东省科学院情报研究所 Inter-frame module selecting method based on three-dimensional wavelet video code
CN103957421A (en) * 2014-04-14 2014-07-30 上海大学 HEVC coding size rapid determining method based on texture complexity
CN106961606A (en) * 2017-01-26 2017-07-18 浙江工业大学 The HEVC intra-frame encoding mode systems of selection of feature are divided based on texture
CN108322747A (en) * 2018-01-05 2018-07-24 中国软件与技术服务股份有限公司 A kind of coding unit Partitioning optimization method towards ultra high-definition video

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100952340B1 (en) * 2008-01-24 2010-04-09 에스케이 텔레콤주식회사 Method and Apparatus for Determing Encoding Mode by Using Temporal and Spartial Complexity
CN101964906B (en) * 2009-07-22 2012-07-04 北京工业大学 Rapid intra-frame prediction method and device based on texture characteristics
US8644383B2 (en) * 2011-03-10 2014-02-04 Microsoft Corporation Mean absolute difference prediction for video encoding rate control
US8693551B2 (en) * 2011-11-16 2014-04-08 Vanguard Software Solutions, Inc. Optimal angular intra prediction for block-based video coding
CN105847838B (en) * 2016-05-13 2018-09-14 南京信息工程大学 A kind of HEVC intra-frame prediction methods

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102801976A (en) * 2012-08-03 2012-11-28 山东省科学院情报研究所 Inter-frame module selecting method based on three-dimensional wavelet video code
CN103957421A (en) * 2014-04-14 2014-07-30 上海大学 HEVC coding size rapid determining method based on texture complexity
CN106961606A (en) * 2017-01-26 2017-07-18 浙江工业大学 The HEVC intra-frame encoding mode systems of selection of feature are divided based on texture
CN108322747A (en) * 2018-01-05 2018-07-24 中国软件与技术服务股份有限公司 A kind of coding unit Partitioning optimization method towards ultra high-definition video

Also Published As

Publication number Publication date
CN109068142A (en) 2018-12-21

Similar Documents

Publication Publication Date Title
CN109068142B (en) 360-degree video intra-frame prediction rapid decision-making method, device, coder-decoder and medium
US20220321875A1 (en) Method and device for encoding and decoding intra-frame prediction
JP5261376B2 (en) Image coding apparatus and image decoding apparatus
TWI442777B (en) Method for spatial error concealment
US20150172687A1 (en) Multiple-candidate motion estimation with advanced spatial filtering of differential motion vectors
US20100166073A1 (en) Multiple-Candidate Motion Estimation With Advanced Spatial Filtering of Differential Motion Vectors
CN110870316B (en) Method and apparatus for low complexity bi-directional intra prediction in video encoding and decoding
Zhang et al. Low complexity HEVC INTRA coding for high-quality mobile video communication
US20060039476A1 (en) Methods for efficient implementation of skip/direct modes in digital video compression algorithms
CN113994701A (en) Method and apparatus for motion field storage in video coding
CN109246430B (en) Virtual reality 360-degree video fast intra prediction and CU partition advance decision
CN113940077A (en) Virtual boundary signaling method and apparatus for video encoding/decoding
CN115361550B (en) Improved overlapped block motion compensation for inter prediction
Fang et al. Fast intra prediction algorithm and design for high efficiency video coding
CN110913232A (en) Selection method and device of TU division mode and readable storage medium
CN109889838B (en) HEVC (high efficiency video coding) rapid coding method based on ROI (region of interest)
WO2021031225A1 (en) Motion vector derivation method and apparatus, and electronic device
CN1263309C (en) Motion vector prediction method used for video coding
Ma et al. A fast background model based surveillance video coding in HEVC
CN110855973B (en) Video intra-frame fast algorithm based on regional directional dispersion sum
CN112954365A (en) HEVC interframe motion estimation pixel search improvement method
Liu et al. An Adaptive CU Split Method for VVC Intra Encoding
CN111683245A (en) Texture similarity based CU partition decision
WO2019229682A1 (en) Application of interweaved prediction
CN112106372B (en) Method and apparatus for hybrid intra prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant