CN112804523B

CN112804523B - Parallel intra-frame mode selection system and method

Info

Publication number: CN112804523B
Application number: CN202011629844.4A
Authority: CN
Inventors: 向国庆; 刘凯阳; 文湘鄂; 范晓东; 宋磊
Original assignee: Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Current assignee: Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Priority date: 2020-12-30
Filing date: 2020-12-30
Publication date: 2023-01-03
Anticipated expiration: 2040-12-30
Also published as: CN112804523A

Abstract

The application discloses a parallel intra mode selection system and method. The system comprises: a storage module, which divides pixel data into prediction units of four different sizes, inputs the prediction units into a first parallel module and a second parallel module, and sends gradients to the first parallel module; the first parallel module, which includes a gradient calculation unit and a plurality of PU calculation units, wherein the gradient calculation unit calculates gradients of the prediction units of three different sizes and sends the gradients to the storage module, and the PU calculation units determine the optimal intra-frame prediction modes corresponding to the prediction units of the three sizes according to the gradients and send them to a comparison output module; the second parallel module, which determines the optimal intra-frame prediction mode of the prediction unit of one size according to a rough mode decision, a prediction mode bit number and a mode decision, and outputs it to the comparison output module; and the comparison output module, which compares the received optimal intra-frame prediction modes and outputs one optimal intra-frame prediction mode. Parallel computation accelerates the calculation, and calculating the optimal intra mode according to the prediction mode bit number reduces data dependency.

Description

Parallel intra-frame mode selection system and method

Technical Field

The present application relates to the field of video coding technologies, and in particular, to a parallel intra mode selection system and method.

Background

Intra-frame coding is an important component of video coding and was introduced to reduce spatial redundancy. To make the predicted values more accurate and the prediction residuals smaller, intra-frame prediction in current mainstream standards uses multiple prediction modes. For example, in the AVS2 video coding standard, every Prediction Unit (PU) must traverse 33 intra-frame prediction modes in a Rough Mode Decision (RMD) process to obtain 9 better candidate modes for the subsequent Rate Distortion Optimization (RDO) process, and the resulting rate-distortion costs are compared to select the optimal intra-frame prediction mode. Although increasing the number of modes remarkably improves intra-frame coding performance, it also sharply increases coding complexity, which hinders real-time hardware implementation.

In view of the foregoing, it is desirable to provide a parallel intra mode selection system and method that can improve the computation speed and achieve real-time encoding.

Disclosure of Invention

To solve the above problems, the present application proposes a parallel intra mode selection system and method.

In a first aspect, the present application provides a parallel intra mode selection system, comprising: a storage module, a first parallel module, a second parallel module and a comparison output module;

the storage module is used for storing pixel data and gradients, dividing the pixel data into four prediction units with different sizes, inputting the prediction units into the first parallel module and the second parallel module, and sending the gradients to the first parallel module;

the first parallel module comprising: a gradient calculation unit and a plurality of PU calculation units;

the gradient calculation unit is used for calculating gradients of the prediction units with three different sizes and sending the gradients to the storage module;

the PU calculation units are all used for determining an optimal intra-frame prediction mode corresponding to the prediction units with three sizes according to the gradient and sending the optimal intra-frame prediction mode to the comparison output module;

the second parallel module is used for determining the optimal intra-frame prediction mode of the prediction unit of one size according to the rough mode decision, the prediction mode bit number and the mode decision, and outputting the optimal intra-frame prediction mode to the comparison output module;

the comparison output module is used for comparing the received multiple optimal intra-frame prediction modes and outputting one optimal intra-frame prediction mode.

Preferably, the storage module includes: a pixel storage unit and a gradient storage unit;

the pixel storage unit is used for dividing the pixel data into four prediction units with different sizes, and the four prediction units with different sizes comprise: a first size prediction unit, a second size prediction unit, a third size prediction unit, and a fourth size prediction unit; sending the first size prediction unit, the second size prediction unit and the third size prediction unit to the gradient calculation unit, and sending the fourth size prediction unit to the second parallel module;

the gradient storage unit is configured to store the gradient sent by the gradient calculation unit and send the gradient to the plurality of PU calculation units in the first parallel module.

Preferably, the second parallel module comprises: a first mode decision unit and a second mode decision unit;

the first mode decision unit is configured to determine multiple intra-frame prediction modes of a fourth size prediction unit according to a coarse mode decision and a prediction mode bit number, and send the multiple intra-frame prediction modes to the second mode decision unit;

and the second mode decision unit is used for determining an optimal intra-frame prediction mode corresponding to the fourth size prediction unit from a plurality of intra-frame prediction modes according to mode decision and outputting the optimal intra-frame prediction mode to the comparison output module.

Preferably, the gradient calculation unit is specifically configured to calculate a first horizontal gradient and a first vertical gradient corresponding to the first size prediction unit, a second horizontal gradient and a second vertical gradient corresponding to the second size prediction unit, and a third horizontal gradient and a third vertical gradient corresponding to the third size prediction unit, and send the calculated gradients to the gradient storage unit.

Preferably, the plurality of PU calculation units comprises: a first PU calculation unit, a second PU calculation unit and a third PU calculation unit;

the first PU calculation unit is configured to calculate a first gradient angle corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient, determine two intra-frame prediction modes corresponding to the first size prediction unit according to the first gradient angle, determine an optimal intra-frame prediction mode corresponding to the first size prediction unit from the two intra-frame prediction modes corresponding to the first size prediction unit and multiple non-angle intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;

the second PU calculation unit is configured to calculate a second gradient angle corresponding to the second size prediction unit according to the second horizontal gradient and the second vertical gradient, determine two intra-frame prediction modes corresponding to the second size prediction unit according to the second gradient angle, determine an optimal intra-frame prediction mode corresponding to the second size prediction unit from the two intra-frame prediction modes corresponding to the second size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;

the third PU calculation unit is configured to calculate a third gradient angle corresponding to the third size prediction unit according to the third horizontal gradient and the third vertical gradient, determine two intra-frame prediction modes corresponding to the third size prediction unit according to the third gradient angle, determine an optimal intra-frame prediction mode corresponding to the third size prediction unit from the two intra-frame prediction modes corresponding to the third size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module.

Preferably, the pixel storage unit includes: 16 RAMs, each 32 bits wide with a depth of 64.

Preferably, the gradient storage unit includes: a 32-bit RAM of depth 496.

Preferably, the prediction mode bit number comprises a plurality of bit values, the bit values corresponding to different intra prediction modes.

Preferably, the gradient calculation unit calculates the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient, and the third vertical gradient using a Sobel operator.

In a second aspect, the present application provides a parallel intra mode selection method, including:

the storage module divides the stored pixel data into four prediction units with different sizes, and inputs the prediction units into the first parallel module and the second parallel module for parallel calculation;

the second parallel module determines the optimal intra-frame prediction mode of the prediction unit of one size according to the rough mode decision, the prediction mode bit number and the mode decision, and outputs the optimal intra-frame prediction mode to the comparison output module;

the gradient calculation unit of the first parallel module calculates gradients of the prediction units with three different sizes and sends the gradients to the storage module;

the storage module sends the gradients to the plurality of PU calculation units of the first parallel module;

the PU calculation units of the first parallel module determine an optimal intra-frame prediction mode corresponding to the prediction units with three sizes according to the gradient and send the optimal intra-frame prediction mode to the comparison output module;

the comparison output module compares the received multiple optimal intra-frame prediction modes and outputs one optimal intra-frame prediction mode.

The application has the advantages that: the two parallel modules are used for simultaneously calculating, the first parallel module is used for calculating gradients and corresponding optimal intra-frame modes of prediction units with various sizes, and the second parallel module is used for calculating the optimal intra-frame mode of the prediction unit with one size, so that the calculation speed can be increased; and the second parallel module calculates the optimal intra-frame mode according to the bit number of the prediction mode, so that the data dependency can be reduced, the calculation speed is further increased, and the real-time coding is realized.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating preferred embodiments and are not to be construed as limiting the application. And like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

FIG. 1 is a schematic block diagram of a parallel intra mode selection system provided herein;

FIG. 2 is a schematic diagram of a parallel intra mode selection system provided herein;

FIG. 3 is a schematic diagram of a pixel storage unit of a parallel intra mode selection system provided herein;

FIG. 4 is a schematic diagram illustrating an update flow of a mode storage unit of a parallel intra mode selection system according to the present application;

FIG. 5 is a flowchart illustrating a conventional intra mode prediction;

FIG. 6 is a flow diagram illustrating a second parallel module of a parallel intra mode selection system provided herein;

FIG. 7 (a) is a schematic diagram of a horizontal convolution kernel of a Sobel operator for a parallel intra mode selection system provided herein;

FIG. 7 (b) is a schematic diagram of a vertical convolution kernel of a Sobel operator for a parallel intra mode selection system provided herein;

FIG. 8 is a flow diagram of a PU compute unit of a parallel intra mode selection system provided herein;

FIG. 9 is a schematic diagram illustrating steps of a parallel intra mode selection method according to the present application.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

In a first aspect, according to an embodiment of the present application, a parallel intra mode selection system is provided, as shown in fig. 1, including: a storage module 300, a first parallel module 100, a second parallel module 200 and a comparison output module 400.

The storage module is used for storing the pixel data and the gradients, dividing the pixel data into prediction units of four different sizes, inputting the prediction units into the first parallel module and the second parallel module, and sending the gradients to the first parallel module.

The first parallel module comprises: a gradient calculation unit 110 and a plurality of PU calculation units (121 to 12n).

The gradient calculation unit is used for calculating gradients of the prediction units of three different sizes and sending the gradients to the storage module. The PU calculation units are all used for determining the optimal intra-frame prediction modes corresponding to the prediction units of the three sizes according to the gradients and sending the optimal intra-frame prediction modes to the comparison output module.

The second parallel module is used for determining the optimal intra-frame prediction mode of the prediction unit of one size according to the rough mode decision, the prediction mode bit number and the mode decision, and outputting the optimal intra-frame prediction mode to the comparison output module.

The comparison output module is used for comparing the received multiple optimal intra-frame prediction modes and outputting one optimal intra-frame prediction mode.

As shown in fig. 2, the storage module includes: a pixel storage unit 310 and a gradient storage unit 320.

The pixel storage unit is used for dividing the pixel data into prediction units of four different sizes, namely a first size prediction unit, a second size prediction unit, a third size prediction unit, and a fourth size prediction unit; it sends the first size prediction unit, the second size prediction unit and the third size prediction unit to the gradient calculation unit, and sends the fourth size prediction unit to the second parallel module.

The first size prediction unit is an 8 × 8 prediction unit, the second size prediction unit is a 16 × 16 prediction unit, the third size prediction unit is a 32 × 32 prediction unit, and the fourth size prediction unit is a 4 × 4 prediction unit.

The gradient storage unit is used for storing the gradients sent by the gradient calculation unit and sending the gradients to the plurality of PU calculation units in the first parallel module.

The second parallel module comprises: a first mode decision unit and a second mode decision unit.

The first mode decision unit is used for determining a plurality of intra-frame prediction modes of the fourth size prediction unit according to the rough mode decision and the prediction mode bit number, and sending these intra-frame prediction modes to the second mode decision unit.

The second mode decision unit is used for determining an optimal intra-frame prediction mode corresponding to the fourth size prediction unit from the plurality of intra-frame prediction modes according to the mode decision, and outputting the optimal intra-frame prediction mode to the comparison output module.

The gradient calculation unit is specifically configured to calculate a first horizontal gradient and a first vertical gradient corresponding to the first size prediction unit, a second horizontal gradient and a second vertical gradient corresponding to the second size prediction unit, and a third horizontal gradient and a third vertical gradient corresponding to the third size prediction unit, and send the gradients to the gradient storage unit.

As shown in fig. 2, a plurality of PU computational units, comprising: a first PU calculation unit 121, a second PU calculation unit 122, and a third PU calculation unit 123.

The first PU calculation unit is used for calculating a first gradient angle corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient, determining a first texture direction angle according to the first gradient angle, and determining two intra-frame prediction modes corresponding to the first size prediction unit according to the first texture direction angle. The two intra-frame prediction modes corresponding to the first size prediction unit and the left and upper current optimal prediction modes corresponding to the first size prediction unit form a coarse mode decision candidate list of the first size prediction unit, and one current optimal prediction mode corresponding to the first size prediction unit is determined from this candidate list using a coarse mode decision. An optimal intra-frame prediction mode corresponding to the first size prediction unit is then determined from the current optimal prediction mode corresponding to the first size prediction unit and a plurality of non-angular intra-frame prediction modes, and is sent to the comparison output module. Here, the left and upper current optimal prediction modes corresponding to the first size prediction unit are the current optimal prediction mode of the prediction unit to the left of the first size prediction unit currently being calculated and the current optimal prediction mode of the prediction unit above it.

The second PU calculation unit is used for calculating a second gradient angle corresponding to the second size prediction unit according to the second horizontal gradient and the second vertical gradient, determining a second texture direction angle according to the second gradient angle, and determining two intra-frame prediction modes corresponding to the second size prediction unit according to the second texture direction angle. The two intra-frame prediction modes corresponding to the second size prediction unit and the left and upper current optimal prediction modes corresponding to the second size prediction unit form a coarse mode decision candidate list of the second size prediction unit, and one current optimal prediction mode corresponding to the second size prediction unit is determined from this candidate list using a coarse mode decision. An optimal intra-frame prediction mode corresponding to the second size prediction unit is then determined from the current optimal prediction mode corresponding to the second size prediction unit and a plurality of non-angular intra-frame prediction modes, and is sent to the comparison output module. Here, the left and upper current optimal prediction modes corresponding to the second size prediction unit are the current optimal prediction mode of the prediction unit to the left of the second size prediction unit currently being calculated and the current optimal prediction mode of the prediction unit above it.

The third PU calculation unit is used for calculating a third gradient angle corresponding to the third size prediction unit according to the third horizontal gradient and the third vertical gradient, determining a third texture direction angle according to the third gradient angle, and determining two intra-frame prediction modes corresponding to the third size prediction unit according to the third texture direction angle. The two intra-frame prediction modes corresponding to the third size prediction unit and the left and upper current optimal prediction modes corresponding to the third size prediction unit form a coarse mode decision candidate list of the third size prediction unit, and one current optimal prediction mode corresponding to the third size prediction unit is determined from this candidate list using a coarse mode decision. An optimal intra-frame prediction mode corresponding to the third size prediction unit is then determined from the current optimal prediction mode corresponding to the third size prediction unit and a plurality of non-angular intra-frame prediction modes, and is sent to the comparison output module. Here, the left and upper current optimal prediction modes corresponding to the third size prediction unit are the current optimal prediction mode of the prediction unit to the left of the third size prediction unit currently being calculated and the current optimal prediction mode of the prediction unit above it.

For the first size prediction unit, the second size prediction unit and the third size prediction unit currently undergoing calculation, the current optimal prediction modes of the left and upper prediction units thereof have used coarse mode decision, are calculated from the coarse mode decision candidate list and are stored in the mode storage unit, and thus can be directly obtained from the mode storage unit.

The gradient calculation unit, the first PU calculation unit, the second PU calculation unit and the third PU calculation unit in the first parallel module sequentially and serially calculate gradients of prediction units with three different sizes, an optimal intra-frame prediction mode of the first size prediction unit, an optimal intra-frame prediction mode of the second size prediction unit and an optimal intra-frame prediction mode of the third size prediction unit.

The plurality of non-angular intra prediction modes include: the DC mode, the Plane mode and the Bilinear mode.

The pixel storage unit includes: 16 RAMs, each 32 bits wide with a depth of 64.

The gradient storage unit includes: a 32-bit RAM of depth 496.

The storage module further comprises a mode storage unit, which stores the current optimal prediction mode corresponding to each size of prediction unit, obtained when the gradient calculation for the prediction units of each size is carried out. The mode storage unit includes 28 RAMs, each 8 bits wide.

The prediction mode bit number comprises a plurality of bit values, the bit values corresponding to different intra prediction modes.

The gradient calculation unit calculates the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient, and the third vertical gradient using a Sobel operator.

The following examples are provided to further illustrate the present application.

To meet the requirement of 1920x1080@60fps at a clock frequency of 300 MHz, the target time of the Coding Tree Unit (CTU) level pipeline stage is 9800 clock cycles. To avoid repeated calculation, the gradient calculation unit first calculates the gradient information of all pixels in a Largest Coding Unit (LCU) when the first parallel module starts. By default, the original pixels of an LCU (4096 = 64 × 64) would be stored in a single 8-bit RAM with a depth of 4096. To meet the clock constraint, the pixels of the 64 × 64 LCU are instead stored in a pixel storage unit comprising 16 RAMs. As shown in fig. 3, storing the pixels in 16 RAMs, each 32 bits wide with a depth of 64, allows 64 pixels to be accessed in one clock cycle, which significantly reduces access time.

The gradient calculation unit performs convolution calculation on the first size prediction unit, the second size prediction unit and the third size prediction unit using a Sobel operator to obtain the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient and the third vertical gradient. This process is executed with 16-way parallel computation and a pipelined structure. The results of the convolution calculations are stored in a 32-bit RAM of depth 496. Then, all gradient directions are calculated.

For an 8 × 8 prediction unit, 16 RAMs in the mode storage unit are used to store the current optimal prediction modes of its left block (left) and top block (top). After each 8 × 8 prediction unit is traversed, the RAM corresponding to that 8 × 8 prediction unit in the mode storage unit is updated according to the flow shown in fig. 4. The current optimal prediction modes of the left and top blocks of the 16 × 16 prediction unit are stored using 8 RAMs in the mode storage unit, and those of the 32 × 32 prediction unit are stored using 4 RAMs in the mode storage unit. The update flows of the 16 × 16 prediction unit and the 32 × 32 prediction unit are similar to fig. 4. With this circuit design, the throughput requirement of 1920x1080@60fps at 300 MHz is met.
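The cycle budget and memory bandwidth quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the disclosed circuit: the function names are hypothetical, and it assumes that the frame is padded up to whole 64 × 64 CTUs and that pixels are 8 bits wide, as stated for the default RAM.

```python
import math

def ctu_cycle_budget(width, height, fps, clock_hz, ctu_size=64):
    """Return (CTUs per frame, clock cycles available per CTU).

    Assumption: partial CTUs at the frame border count as whole CTUs.
    """
    ctus_per_frame = math.ceil(width / ctu_size) * math.ceil(height / ctu_size)
    cycles_per_frame = clock_hz // fps
    return ctus_per_frame, cycles_per_frame // ctus_per_frame

def pixels_per_cycle(num_rams, ram_width_bits, pixel_bits=8):
    """Pixels readable in one clock if every RAM delivers one word per cycle."""
    return num_rams * ram_width_bits // pixel_bits

ctus, budget = ctu_cycle_budget(1920, 1080, 60, 300_000_000)
print(ctus, budget)              # CTUs per frame and cycle budget per CTU
print(pixels_per_cycle(16, 32))  # pixels per clock from the 16 RAMs
```

Under these assumptions the budget comes out slightly above the 9800-cycle target stated in the text, and 16 RAMs of 32 bits each deliver the 64 pixels per clock cycle mentioned above.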

A serial architecture can be used in an algorithmic implementation, where it does not affect coding efficiency; however, to realize a real-time ASIC chip with a pipelined and parallel design, data dependencies must first be eliminated. In intra mode selection, the reconstructed pixels and the optimal mode selection are the main sources of data dependency. As shown in fig. 5, the rough mode decision of the current coding tree unit is based on pixels reconstructed using the Most Probable Mode (MPM), and the Most Probable Mode in turn depends on the optimal mode that the previous coding tree unit obtains through a series of processes such as prediction, transform, quantization, inverse transform, inverse entropy coding, and mode decision. Therefore, the serial coarse mode decision is not suitable for hardware design and needs to be adjusted to achieve real-time coding performance. In the embodiment of the application, the reconstructed pixels are replaced by the original pixels, so that the influence on the coding performance is reduced. In the $J_{HAD}$ calculation for the coarse mode decision, the values shown in Table 1 are used directly as the prediction mode bit number $R_{mode}$, thereby removing the data dependency.

TABLE 1

Intra prediction mode	$R_{mode}$
0-1	1
2-3	2
4-7	3
8-15	4
16-31	5
32	6

As shown in Table 1, the present application uses 33 intra modes in total, numbered 0 to 32, and the prediction mode bit number $R_{mode}$ corresponding to each intra mode takes the value listed in Table 1.
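Because Table 1 is a fixed mapping, the prediction mode bit number can be obtained without consulting any neighbouring block. A minimal lookup sketch follows; the names `R_MODE_TABLE` and `r_mode` are hypothetical and chosen only for illustration.

```python
# (first mode, last mode, R_mode) rows copied from Table 1.
R_MODE_TABLE = [
    (0, 1, 1),
    (2, 3, 2),
    (4, 7, 3),
    (8, 15, 4),
    (16, 31, 5),
    (32, 32, 6),
]

def r_mode(mode: int) -> int:
    """Return the fixed bit estimate for an intra prediction mode (0..32)."""
    for first, last, bits in R_MODE_TABLE:
        if first <= mode <= last:
            return bits
    raise ValueError(f"intra mode {mode} is outside the range 0..32")

print([r_mode(m) for m in (0, 1, 2, 7, 8, 31, 32)])
```

The rows of Table 1 coincide with the binary length of the mode index (with a minimum of 1), so a hardware implementation could plausibly derive the value from the position of the leading one bit instead of a table; that observation is an inference from the table, not a statement in the source.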

In the AVS2 standard, traversing the 4 × 4 prediction units within one largest coding unit, 256 times in total, takes considerable time. In the embodiment of the present application, the 4 × 4 prediction unit therefore uses a dedicated circuit (the second parallel module), while the prediction units of the other sizes and the gradient calculation unit share another circuit (the first parallel module), and the two circuits execute in parallel.

As shown in FIG. 6, for the second parallel module, the first mode decision unit first performs the coarse mode decision on all received fourth size prediction units according to the $J_{HAD}$ formula and the prediction mode bit number, selects from the 33 intra-frame prediction modes the 9 modes with the smallest calculation results, and sends them to the second mode decision unit.

$$J_{HAD} = SATD + \lambda \cdot R_{mode}$$

Here, SATD represents the sum of absolute transformed differences, $\lambda$ is the Lagrange multiplier, and $R_{mode}$ is the estimated number of bits for the prediction mode.

The second mode decision unit determines an optimal intra-frame prediction mode corresponding to the fourth size prediction unit from the 9 intra-frame prediction modes sent by the first mode decision unit according to the mode decision and outputs the optimal intra-frame prediction mode to the comparison output module.
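The coarse mode decision of the first mode decision unit can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed circuit: SATD is computed with an unnormalized 4 × 4 Hadamard transform (the source does not give the transform or its scaling), the predicted blocks for each mode are supplied by the caller, and all function names are hypothetical.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]

def _matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def satd4x4(orig, pred):
    """Sum of absolute transformed differences of a 4x4 block (no scaling)."""
    diff = [[orig[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    transformed = _matmul4(_matmul4(H4, diff), H4)
    return sum(abs(v) for row in transformed for v in row)

def r_mode(mode):
    """Fixed bit estimate; reproduces Table 1 for modes 0..32."""
    return max(1, mode.bit_length())

def j_had(orig, pred, mode, lam):
    """J_HAD = SATD + lambda * R_mode."""
    return satd4x4(orig, pred) + lam * r_mode(mode)

def coarse_mode_decision(orig, predictions, lam, keep=9):
    """Return the `keep` modes with the smallest J_HAD.

    `predictions` maps a mode number to its predicted 4x4 block.
    Ties are broken by the lower mode number (an assumption).
    """
    ranked = sorted((j_had(orig, pred, mode, lam), mode)
                    for mode, pred in predictions.items())
    return [mode for _, mode in ranked[:keep]]
```

Because `r_mode` depends only on the mode index, every 4 × 4 block can be evaluated without waiting for the final mode of its neighbours, which is the dependency the text sets out to remove.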

At the same time, the gradient calculation unit in the first parallel module first uses a Sobel operator to calculate the horizontal and vertical gradients of the received first size, second size and third size prediction units. The Sobel operator is a widely used, simple and efficient edge detection operator with a horizontal and a vertical component; its horizontal and vertical convolution kernels are shown in fig. 7 (a) and fig. 7 (b).

The gradients in the horizontal and vertical directions are calculated as shown below, where $P_{i,j}$ represents the pixel value at position $(i, j)$, and $Gx_{i,j}$ and $Gy_{i,j}$ represent the gradient magnitudes in the horizontal and vertical directions, respectively.

$$Gx_{i,j} = P_{i+1,j-1} + 2 \times P_{i+1,j} + P_{i+1,j+1} - P_{i-1,j-1} - 2 \times P_{i-1,j} - P_{i-1,j+1}$$

$$Gy_{i,j} = P_{i-1,j-1} + 2 \times P_{i,j-1} + P_{i+1,j-1} - P_{i-1,j+1} - 2 \times P_{i,j+1} - P_{i+1,j+1}$$
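The two gradient formulas can be written out directly in software. The sketch below makes two assumptions that the text does not state: the index $i$ is taken as the horizontal (column) coordinate and $j$ as the vertical (row) coordinate, and gradients are computed only for interior pixels, with border positions left at zero. The function name is hypothetical.

```python
def sobel_gradients(img):
    """Return (gx, gy) for a 2-D list of pixel values.

    Assumptions: img[j][i] holds P(i, j), with i horizontal and j vertical;
    border pixels are not processed and stay 0.
    """
    height, width = len(img), len(img[0])

    def p(i, j):
        return img[j][i]

    gx = [[0] * width for _ in range(height)]
    gy = [[0] * width for _ in range(height)]
    for j in range(1, height - 1):
        for i in range(1, width - 1):
            gx[j][i] = (p(i + 1, j - 1) + 2 * p(i + 1, j) + p(i + 1, j + 1)
                        - p(i - 1, j - 1) - 2 * p(i - 1, j) - p(i - 1, j + 1))
            gy[j][i] = (p(i - 1, j - 1) + 2 * p(i, j - 1) + p(i + 1, j - 1)
                        - p(i - 1, j + 1) - 2 * p(i, j + 1) - p(i + 1, j + 1))
    return gx, gy

# A horizontal ramp has a purely horizontal gradient at its centre pixel.
ramp = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
gx, gy = sobel_gradients(ramp)
print(gx[1][1], gy[1][1])
```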

As shown in fig. 8, the first PU calculation unit calculates a first gradient angle θ corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient corresponding to the first size prediction unit using the following formula.

Since the texture direction angle of the current prediction unit is perpendicular to the first gradient angle θ, the texture direction angle α may be calculated by the following equation.
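The formulas for θ and α are not reproduced in this text. Purely as an illustration of the perpendicularity relation described above, the sketch below assumes that the gradient angle is taken as the arctangent of a vertical gradient over a horizontal gradient and that the texture direction is that angle rotated by 90 degrees. Both the formula and the function name are assumptions, not the formulas of the application.

```python
import math

def texture_direction_deg(gx, gy):
    """Return (theta, alpha) in degrees for one horizontal/vertical gradient pair.

    Assumption: theta = atan2(gy, gx); alpha is perpendicular to theta and is
    folded into [0, 180) because a texture direction has no sign.
    """
    theta = math.degrees(math.atan2(gy, gx))
    alpha = (theta + 90.0) % 180.0
    return theta, alpha

print(texture_direction_deg(1, 1))
```

How the gradients of all pixels in a prediction unit are combined into a single angle is not specified in the text, so the sketch takes one gradient pair as input and leaves that aggregation open.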

Two intra prediction modes near the texture direction angle α are selected and added to a candidate list, and the current optimal prediction modes of the left and upper prediction units of the first size prediction unit are then added, so that 4 modes in total form the rough mode decision candidate list of the first size prediction unit. The $J_{HAD}$ calculation of the coarse mode decision selects one optimal mode from this list as the current optimal prediction mode of the first size prediction unit. This mode, together with the DC mode, the Plane mode and the Bilinear mode, forms the mode decision candidate list of the first size prediction unit, which enters the subsequent mode decision process. Finally, an optimal intra-frame prediction mode corresponding to the first size prediction unit is decided and sent to the comparison output module.

The calculation methods and processes of the second PU calculation unit and the third PU calculation unit are the same as those of the first PU calculation unit, and are not described herein again.

The comparison output module evaluates J_HAD for the four received optimal intra-frame prediction modes, and outputs the intra-frame prediction mode with the smallest result, together with the size of its prediction unit, as the size division and the intra-frame prediction mode of the coding unit corresponding to the current data.
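The comparison step can be illustrated as follows. This section does not spell out J_HAD, so the sketch assumes the commonly used form of a Hadamard-transformed residual cost (SATD) plus a Lagrange multiplier times the mode bit number; the 4x4 transform size, the unnormalized SATD, and the names `satd4x4`, `j_had`, `lam` and `select_output` are assumptions for illustration.

```python
H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]


def satd4x4(orig, pred):
    """Sum of absolute values of H4 * (orig - pred) * H4^T (no normalisation)."""
    d = [[orig[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
    t = [[sum(H4[r][k] * d[k][c] for k in range(4)) for c in range(4)]
         for r in range(4)]                      # t = H4 * d
    u = [[sum(t[r][k] * H4[c][k] for k in range(4)) for c in range(4)]
         for r in range(4)]                      # u = t * H4^T
    return sum(abs(v) for row in u for v in row)


def j_had(orig, pred, mode_bits, lam):
    """Assumed cost form: distortion (SATD) plus lam times the mode bit number."""
    return satd4x4(orig, pred) + lam * mode_bits


def select_output(results):
    """results: list of (pu_size, mode, cost); return the cheapest (pu_size, mode)."""
    pu_size, mode, _ = min(results, key=lambda r: r[2])
    return pu_size, mode
```

Because the mode bit number is a fixed value per mode in this scheme, the rate term can be looked up rather than derived from entropy coding of reconstructed data.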

In a second aspect, according to an embodiment of the present application, a parallel intra mode selection method is provided which, as shown in Fig. 9, includes:

S101, the storage module divides the stored pixel data into four prediction units with different sizes, and inputs the prediction units into the first parallel module and the second parallel module for parallel calculation;

S102, the second parallel module determines the optimal intra-frame prediction mode of the prediction unit with each size according to the rough mode decision, the prediction mode bit number and the mode decision, and outputs the optimal intra-frame prediction mode to the comparison output module;

S103, the gradient calculation unit of the first parallel module calculates the gradients of the prediction units with three different sizes, and sends the gradients to the storage module;

S104, the storage module sends the gradients to the plurality of PU calculation units of the first parallel module;

S105, the PU calculation units of the first parallel module determine, according to the gradients, the optimal intra-frame prediction modes corresponding to the prediction units of three sizes, and send the optimal intra-frame prediction modes to the comparison output module;

S106, the comparison output module compares the received multiple optimal intra-frame prediction modes and outputs one optimal intra-frame prediction mode.
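The overall flow of steps S101 to S106 can be modelled loosely in software as below. Threads stand in for the hardware parallel modules only as an analogy. The callables `first_module` and `second_module`, the ordering of the four prediction units (the last one goes to the second parallel module) and the `(pu, mode, cost)` result tuple are hypothetical and chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel_selection(pus, first_module, second_module):
    """pus: four prediction units; each module call returns (pu, mode, cost).

    The first three units take the gradient-based path (S103 to S105);
    the fourth takes the path of the second parallel module (S102).
    """
    with ThreadPoolExecutor() as pool:
        # Both paths are submitted before any result is awaited,
        # so they can run concurrently.
        second = pool.submit(second_module, pus[3])
        firsts = [pool.submit(first_module, pu) for pu in pus[:3]]
        results = [f.result() for f in firsts] + [second.result()]
    # S106: the comparison output keeps the candidate with the lowest cost.
    return min(results, key=lambda r: r[2])
```

The sketch only captures the dependency structure: the two paths share no intermediate data, and the single join point is the final comparison.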

In the embodiments of the application, two parallel modules compute simultaneously: the first parallel module calculates the gradients and the corresponding optimal intra-frame modes of the prediction units of several sizes, and the second parallel module calculates the optimal intra-frame mode of the prediction unit of one size, so that the calculation speed is increased. The second parallel module calculates the optimal intra-frame mode according to the prediction mode bit number, which reduces data dependency, further increases the calculation speed and enables real-time encoding. The texture directions of the prediction units of 3 sizes are determined with the Sobel operator, and the intra-frame prediction mode is determined from these texture characteristics, which reduces the number of intra-frame prediction modes to be decided and shortens the intra-frame mode decision time. Reconstructed pixels are replaced with original pixels, and the determined prediction mode bit number is used directly, which removes data dependency in hardware. Within each of the two large parallel modules, the internal parallelism is also increased by design, so that the real-time requirement of the hardware is met.

The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A parallel intra mode selection system, comprising: a storage module, a first parallel module, a second parallel module and a comparison output module;

2. The system of claim 1, wherein the storage module comprises: a pixel storage unit and a gradient storage unit;

the gradient storage unit is configured to receive and store the gradient sent by the gradient calculation unit, and send the gradient to the plurality of PU calculation units in the first parallel module.

3. The system of claim 2, wherein the second parallel module comprises: a first mode decision unit and a second mode decision unit;

the first mode decision unit is configured to determine multiple intra-frame prediction modes of a fourth size prediction unit according to the coarse mode decision and the prediction mode bit number, and send the multiple intra-frame prediction modes to the second mode decision unit;

4. The system according to claim 2, wherein the gradient calculation unit is specifically configured to calculate a first horizontal gradient and a first vertical gradient corresponding to the first size prediction unit, a second horizontal gradient and a second vertical gradient corresponding to the second size prediction unit, and a third horizontal gradient and a third vertical gradient corresponding to the third size prediction unit, and send these gradients to the gradient storage unit.

5. The system of claim 4, wherein the plurality of PU calculation units comprises: a first PU calculation unit, a second PU calculation unit and a third PU calculation unit;

the first PU calculation unit is configured to calculate a first gradient angle corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient, determine two intra-frame prediction modes corresponding to the first size prediction unit according to the first gradient angle, determine an optimal intra-frame prediction mode corresponding to the first size prediction unit from the two intra-frame prediction modes corresponding to the first size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;

the second PU calculation unit is configured to calculate a second gradient angle corresponding to the second size prediction unit according to the second horizontal gradient and the second vertical gradient, determine two intra-frame prediction modes corresponding to the second size prediction unit according to the second gradient angle, determine an optimal intra-frame prediction mode corresponding to the second size prediction unit from the two intra-frame prediction modes corresponding to the second size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;

6. The system of claim 2, wherein the pixel storage unit comprises: 16 RAMs of width 32 and depth 64 bits.

7. The parallel intra mode selection system according to claim 2, wherein the gradient storage unit comprises: 32-bit RAM of depth 496.

8. The system of claim 3, wherein the number of prediction mode bits comprises a plurality of bit values, the bit values corresponding to different intra prediction modes.

9. The system of claim 4, wherein the gradient computation unit computes the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient, and the third vertical gradient using a Sobel operator.

10. A method of parallel intra mode selection, comprising: