CN112804523B - Parallel intra-frame mode selection system and method - Google Patents

Parallel intra-frame mode selection system and method


Publication number
CN112804523B
Authority
CN
China
Prior art keywords
gradient
unit
prediction
intra
mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011629844.4A
Other languages
Chinese (zh)
Other versions
CN112804523A (en
Inventor
向国庆
刘凯阳
文湘鄂
范晓东
宋磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Original Assignee
Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Boya Huishi Intelligent Technology Research Institute Co ltd
Priority to CN202011629844.4A
Publication of CN112804523A
Application granted
Publication of CN112804523B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding

Abstract

The application discloses a parallel intra mode selection system and method. The system comprises a storage module, a first parallel module, a second parallel module and a comparison output module. The storage module divides pixel data into four prediction units of different sizes, inputs the prediction units into the first parallel module and the second parallel module, and sends the gradients to the first parallel module. The first parallel module includes a gradient calculation unit and a plurality of PU calculation units. The gradient calculation unit calculates the gradients of prediction units of different sizes and sends them to the storage module. The PU calculation units determine, according to the gradients, the optimal intra-frame prediction modes corresponding to the prediction units of three sizes and send them to the comparison output module. The second parallel module determines the optimal intra-frame prediction mode of the prediction unit of each size according to the rough mode decision, the prediction mode bit number and the mode decision, and outputs it to the comparison output module. The comparison output module compares the optimal intra-frame prediction modes and outputs one optimal intra-frame prediction mode. Parallel computation increases the calculation speed, and calculating the optimal intra mode from the prediction mode bit number reduces data dependency.

Description

Parallel intra-frame mode selection system and method
Technical Field
The present application relates to the field of video coding technologies, and in particular, to a parallel intra mode selection system and method.
Background
Intra-frame coding is an important component of video coding, designed to reduce spatial redundancy. To make predicted values more accurate and prediction residuals smaller, intra-frame prediction in current mainstream standards uses multiple prediction modes. For example, in the AVS2 video coding standard, every Prediction Unit (PU) must traverse 33 intra-frame prediction modes in a Rough Mode Decision (RMD) process to obtain 9 better candidate modes, which then undergo Rate Distortion Optimization (RDO); the resulting rate-distortion costs are compared to select the optimal intra-frame prediction mode. Although adding modes markedly improves intra-frame coding performance, it also sharply increases coding complexity, which hinders real-time hardware implementation.
In view of the foregoing, it is desirable to provide a parallel intra mode selection system and method that can improve the computation speed and achieve real-time encoding.
Disclosure of Invention
To solve the above problems, the present application proposes a parallel intra mode selection system and method.
In a first aspect, the present application provides a parallel intra mode selection system, comprising: a storage module, a first parallel module, a second parallel module and a comparison output module;
the storage module is used for storing pixel data and gradients, dividing the pixel data into four prediction units with different sizes, inputting the prediction units into the first parallel module and the second parallel module, and sending the gradients to the first parallel module;
the first parallel module comprising: a gradient calculation unit and a plurality of PU calculation units;
the gradient calculation unit is used for calculating gradients of the prediction units with three different sizes and sending the gradients to the storage module;
the PU calculation units are all used for determining an optimal intra-frame prediction mode corresponding to the prediction units with three sizes according to the gradient and sending the optimal intra-frame prediction mode to the comparison output module;
the second parallel module is used for determining the optimal intra-frame prediction mode of the prediction unit with each size according to the rough mode decision, the prediction mode bit number and the mode decision and outputting the optimal intra-frame prediction mode to the comparison output module;
the comparison output module is used for comparing the received multiple optimal intra-frame prediction modes and outputting one optimal intra-frame prediction mode.
Preferably, the storage module includes: a pixel storage unit and a gradient storage unit;
the pixel storage unit is used for dividing the pixel data into four prediction units with different sizes, and the four prediction units with different sizes comprise: a first size prediction unit, a second size prediction unit, a third size prediction unit, and a fourth size prediction unit; sending the first size prediction unit, the second size prediction unit and the third size prediction unit to the gradient calculation unit, and sending the fourth size prediction unit to the second parallel module;
the gradient storage unit is configured to store the gradient sent by the gradient calculation unit and send the gradient to the plurality of PU calculation units in the first parallel module.
Preferably, the second parallel module comprises: a first mode decision unit and a second mode decision unit;
the first mode decision unit is configured to determine multiple intra-frame prediction modes of a fourth size prediction unit according to a coarse mode decision and a prediction mode bit number, and send the multiple intra-frame prediction modes to the second mode decision unit;
and the second mode decision unit is used for determining an optimal intra-frame prediction mode corresponding to the fourth size prediction unit from a plurality of intra-frame prediction modes according to mode decision and outputting the optimal intra-frame prediction mode to the comparison output module.
Preferably, the gradient calculation unit is specifically configured to calculate a first horizontal gradient and a first vertical gradient corresponding to the first size prediction unit, a second horizontal gradient and a second vertical gradient corresponding to the second size prediction unit, and a third horizontal gradient and a third vertical gradient corresponding to the third size prediction unit, and send the calculated gradients to the gradient storage unit.
Preferably, a plurality of said PU computation units, comprising: the system comprises a first PU computing unit, a second PU computing unit and a third PU computing unit;
the first PU calculation unit is configured to calculate a first gradient angle corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient, determine two intra-frame prediction modes corresponding to the first size prediction unit according to the first gradient angle, determine an optimal intra-frame prediction mode corresponding to the first size prediction unit from the two intra-frame prediction modes corresponding to the first size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;
the second PU calculation unit is configured to calculate a second gradient angle corresponding to the second size prediction unit according to the second horizontal gradient and the second vertical gradient, determine two intra-frame prediction modes corresponding to the second size prediction unit according to the second gradient angle, determine an optimal intra-frame prediction mode corresponding to the second size prediction unit from the two intra-frame prediction modes corresponding to the second size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;
the third PU calculation unit is configured to calculate a third gradient angle corresponding to the third size prediction unit according to the third horizontal gradient and the third vertical gradient, determine two intra-frame prediction modes corresponding to the third size prediction unit according to the third gradient angle, determine an optimal intra-frame prediction mode corresponding to the third size prediction unit from the two intra-frame prediction modes corresponding to the third size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module.
Preferably, the pixel storage unit includes: 16 RAMs, each with a width of 32 bits and a depth of 64.
Preferably, the gradient storage unit includes: a 32-bit RAM of depth 496.
Preferably, the prediction mode bit number comprises a plurality of bit values, the bit values corresponding to different intra prediction modes.
Preferably, the gradient calculation unit calculates the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient, and the third vertical gradient using a Sobel operator.
In a second aspect, the present application provides a parallel intra mode selection method, including:
the storage module divides the stored pixel data into four prediction units with different sizes, and inputs the prediction units into the first parallel module and the second parallel module for parallel calculation;
the second parallel module determines the optimal intra-frame prediction mode of the prediction unit with each size according to the rough mode decision, the prediction mode bit number and the mode decision and outputs the optimal intra-frame prediction mode to the comparison output module;
the gradient calculation unit of the first parallel module calculates gradients of the prediction units with three different sizes and sends the gradients to the storage module;
a storage module sends the gradient to a plurality of PU calculation units of the first parallel module;
the PU calculation units of the first parallel module determine an optimal intra-frame prediction mode corresponding to the prediction units with three sizes according to the gradient and send the optimal intra-frame prediction mode to the comparison output module;
the comparison output module compares the received multiple optimal intra-frame prediction modes and outputs one optimal intra-frame prediction mode.
The application has the advantages that: the two parallel modules are used for simultaneously calculating, the first parallel module is used for calculating gradients and corresponding optimal intra-frame modes of prediction units with various sizes, and the second parallel module is used for calculating the optimal intra-frame mode of the prediction unit with one size, so that the calculation speed can be increased; and the second parallel module calculates the optimal intra-frame mode according to the bit number of the prediction mode, so that the data dependency can be reduced, the calculation speed is further increased, and the real-time coding is realized.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating preferred embodiments and are not to be construed as limiting the application. Like reference numerals refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic block diagram of a parallel intra mode selection system provided herein;
FIG. 2 is a schematic diagram of a parallel intra mode selection system provided herein;
FIG. 3 is a schematic diagram of a pixel storage unit of a parallel intra mode selection system provided herein;
FIG. 4 is a schematic diagram illustrating an update flow of a mode storage unit of a parallel intra mode selection system according to the present application;
FIG. 5 is a flowchart illustrating a conventional intra mode prediction;
FIG. 6 is a flow diagram illustrating a second parallel module of a parallel intra mode selection system provided herein;
FIG. 7 (a) is a schematic diagram of a horizontal convolution kernel of a Sobel operator for a parallel intra mode selection system provided herein;
FIG. 7 (b) is a schematic diagram of a vertical convolution kernel of a Sobel operator for a parallel intra mode selection system provided herein;
FIG. 8 is a flow diagram of a PU compute unit of a parallel intra mode selection system provided herein;
FIG. 9 is a schematic diagram illustrating steps of a parallel intra mode selection method according to the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In a first aspect, according to an embodiment of the present application, a parallel intra mode selection system is provided, as shown in FIG. 1, including: a storage module 300, a first parallel module 100, a second parallel module 200 and a comparison output module 400.
The storage module is used for storing the pixel data and the gradients, dividing the pixel data into four prediction units of different sizes, inputting the prediction units into the first parallel module and the second parallel module, and sending the gradients to the first parallel module.
The first parallel module comprises: a gradient calculation unit 110 and a plurality of PU calculation units (121 to 12n).
The gradient calculation unit is used for calculating the gradients of the prediction units of three different sizes and sending the gradients to the storage module. The PU calculation units are all used for determining, according to the gradients, the optimal intra-frame prediction modes corresponding to the prediction units of three sizes and sending them to the comparison output module.
The second parallel module is used for determining the optimal intra-frame prediction mode of the prediction unit of each size according to the rough mode decision, the prediction mode bit number and the mode decision, and outputting it to the comparison output module.
The comparison output module is used for comparing the received multiple optimal intra-frame prediction modes and outputting one optimal intra-frame prediction mode.
As shown in FIG. 2, the storage module includes: a pixel storage unit 310 and a gradient storage unit 320.
The pixel storage unit is used for dividing the pixel data into four prediction units of different sizes: a first size prediction unit, a second size prediction unit, a third size prediction unit, and a fourth size prediction unit. It sends the first size prediction unit, the second size prediction unit and the third size prediction unit to the gradient calculation unit, and sends the fourth size prediction unit to the second parallel module.
The first size prediction unit is an 8 × 8 prediction unit, the second size prediction unit is a 16 × 16 prediction unit, the third size prediction unit is a 32 × 32 prediction unit, and the fourth size prediction unit is a 4 × 4 prediction unit.
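As a rough illustration of this partition, the following sketch counts how many prediction units of each size tile one largest coding unit. The 64 × 64 LCU size is taken from the worked example later in this description; the names `PU_SIZES` and `pu_origins` are hypothetical and do not appear in the application.

```python
LCU_SIZE = 64  # assumption: 64 x 64 LCU, as in the worked example of this description

# PU sizes named in the text: first = 8, second = 16, third = 32, fourth = 4
PU_SIZES = {"first": 8, "second": 16, "third": 32, "fourth": 4}

def pu_origins(pu_size, lcu_size=LCU_SIZE):
    """Return the top-left (row, col) of every pu_size x pu_size block in the LCU."""
    return [(r, c)
            for r in range(0, lcu_size, pu_size)
            for c in range(0, lcu_size, pu_size)]

# Number of prediction units of each size inside one LCU
counts = {name: len(pu_origins(size)) for name, size in PU_SIZES.items()}
print(counts)
```

Under these assumptions the fourth size (4 × 4) yields 256 blocks per LCU, which matches the 256 traversals mentioned later for the 4 × 4 prediction unit.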
The gradient storage unit is used for storing the gradients sent by the gradient calculation unit and sending the gradients to the plurality of PU calculation units in the first parallel module.
The second parallel module comprises: a first mode decision unit and a second mode decision unit.
The first mode decision unit is used for determining a plurality of intra-frame prediction modes of the fourth size prediction unit according to the rough mode decision and the prediction mode bit number, and sending these intra-frame prediction modes to the second mode decision unit.
The second mode decision unit is used for determining, according to the mode decision, an optimal intra-frame prediction mode corresponding to the fourth size prediction unit from the multiple intra-frame prediction modes, and outputting the optimal intra-frame prediction mode to the comparison output module.
The gradient calculation unit is specifically configured to calculate a first horizontal gradient and a first vertical gradient corresponding to the first size prediction unit, a second horizontal gradient and a second vertical gradient corresponding to the second size prediction unit, and a third horizontal gradient and a third vertical gradient corresponding to the third size prediction unit, and send the gradients to the gradient storage unit.
As shown in FIG. 2, the plurality of PU calculation units comprise: a first PU calculation unit 121, a second PU calculation unit 122, and a third PU calculation unit 123.
The first PU calculation unit is used for calculating a first gradient angle corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient, determining a first texture direction angle according to the first gradient angle, and determining two intra-frame prediction modes corresponding to the first size prediction unit according to the first texture direction angle. The two intra-frame prediction modes corresponding to the first size prediction unit and the left and upper current optimal prediction modes corresponding to the first size prediction unit form a coarse mode decision candidate list of the first size prediction unit, and coarse mode decision is used to determine one current optimal prediction mode corresponding to the first size prediction unit from this candidate list. An optimal intra-frame prediction mode corresponding to the first size prediction unit is then determined from the current optimal prediction mode corresponding to the first size prediction unit and the plurality of non-angular intra-frame prediction modes, and is sent to the comparison output module. Here, the left and upper current optimal prediction modes corresponding to the first size prediction unit are the current optimal prediction mode of the prediction unit to the left of the first size prediction unit currently being calculated and the current optimal prediction mode of the prediction unit above it.
The second PU calculation unit is used for calculating a second gradient angle corresponding to the second size prediction unit according to the second horizontal gradient and the second vertical gradient, determining a second texture direction angle according to the second gradient angle, and determining two intra-frame prediction modes corresponding to the second size prediction unit according to the second texture direction angle. The two intra-frame prediction modes corresponding to the second size prediction unit and the left and upper current optimal prediction modes corresponding to the second size prediction unit form a coarse mode decision candidate list of the second size prediction unit, and coarse mode decision is used to determine one current optimal prediction mode corresponding to the second size prediction unit from this candidate list. An optimal intra-frame prediction mode corresponding to the second size prediction unit is then determined from the current optimal prediction mode corresponding to the second size prediction unit and the plurality of non-angular intra-frame prediction modes, and is sent to the comparison output module. Here, the left and upper current optimal prediction modes corresponding to the second size prediction unit are the current optimal prediction mode of the prediction unit to the left of the second size prediction unit currently being calculated and the current optimal prediction mode of the prediction unit above it.
The third PU calculation unit is used for calculating a third gradient angle corresponding to the third size prediction unit according to the third horizontal gradient and the third vertical gradient, determining a third texture direction angle according to the third gradient angle, and determining two intra-frame prediction modes corresponding to the third size prediction unit according to the third texture direction angle. The two intra-frame prediction modes corresponding to the third size prediction unit and the left and upper current optimal prediction modes corresponding to the third size prediction unit form a coarse mode decision candidate list of the third size prediction unit, and coarse mode decision is used to determine one current optimal prediction mode corresponding to the third size prediction unit from this candidate list. An optimal intra-frame prediction mode corresponding to the third size prediction unit is then determined from the current optimal prediction mode corresponding to the third size prediction unit and the plurality of non-angular intra-frame prediction modes, and is sent to the comparison output module. Here, the left and upper current optimal prediction modes corresponding to the third size prediction unit are the current optimal prediction mode of the prediction unit to the left of the third size prediction unit currently being calculated and the current optimal prediction mode of the prediction unit above it.
For the first, second and third size prediction units currently being calculated, the current optimal prediction modes of their left and upper prediction units have already been determined by coarse mode decision from the corresponding coarse mode decision candidate lists and stored in the mode storage unit, and can therefore be read directly from the mode storage unit.
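The flow of one PU calculation unit can be sketched roughly as follows. This is only an illustration under several assumptions that the application does not state: the texture direction is taken as perpendicular to the gradient, the angular modes are assumed to be numbered 3 to 32 and spread uniformly over 180 degrees (not the actual AVS2 angle table), the non-angular modes are assumed to be numbered 0 to 2, and `cost` is a hypothetical caller-supplied function that returns a lower value for a better mode.

```python
import math

NON_ANGULAR_MODES = [0, 1, 2]        # assumed numbering of the three non-angular modes
ANGULAR_MODES = list(range(3, 33))   # assumed numbering of the 30 angular modes

def texture_angle(gx, gy):
    """Texture direction in degrees, assumed perpendicular to the gradient."""
    gradient = math.degrees(math.atan2(gy, gx))
    return (gradient + 90.0) % 180.0

def two_nearest_modes(angle):
    """Hypothetical uniform mapping of the texture angle onto two adjacent angular modes."""
    step = 180.0 / len(ANGULAR_MODES)
    lower = int(angle // step) % len(ANGULAR_MODES)
    upper = (lower + 1) % len(ANGULAR_MODES)
    return [ANGULAR_MODES[lower], ANGULAR_MODES[upper]]

def select_mode(gx, gy, left_mode, top_mode, cost):
    """Pick a mode for one PU; cost(mode) returns a number, lower is better."""
    # Candidate list: two gradient-derived modes plus the left and upper neighbours' modes
    candidates = two_nearest_modes(texture_angle(gx, gy)) + [left_mode, top_mode]
    current_best = min(candidates, key=cost)  # coarse mode decision over the candidates
    # Final choice between the current best mode and the non-angular modes
    return min([current_best] + NON_ANGULAR_MODES, key=cost)
```

For example, `select_mode(1, 0, left_mode=5, top_mode=0, cost=my_cost)` would evaluate only four candidates plus the three non-angular modes, rather than all 33 modes.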
The gradient calculation unit, the first PU calculation unit, the second PU calculation unit and the third PU calculation unit in the first parallel module sequentially and serially calculate gradients of prediction units with three different sizes, an optimal intra-frame prediction mode of the first size prediction unit, an optimal intra-frame prediction mode of the second size prediction unit and an optimal intra-frame prediction mode of the third size prediction unit.
The plurality of non-angular intra prediction modes include: DC mode, plane mode and bilinear mode.
The pixel storage unit includes: 16 RAMs, each with a width of 32 bits and a depth of 64.
The gradient storage unit includes: a 32-bit RAM of depth 496.
The storage module further comprises a mode storage unit, which is used for storing the current optimal prediction mode corresponding to the prediction unit of each size, obtained while the gradient calculation for the prediction units of each size is carried out. The mode storage unit includes 28 RAMs, each 8 bits wide.
The prediction mode bit number comprises a plurality of bit values, the bit values corresponding to different intra prediction modes.
The gradient calculation unit calculates the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient, and the third vertical gradient using a Sobel operator.
The following examples are provided to further illustrate the present application.
In order to meet the requirement of 1920x1080@60fps at a clock frequency of 300 MHz, the target time of the Coding Tree Unit (CTU) level pipeline stage is 9800 clock cycles. To avoid repeated calculation, at the start of the first parallel module the gradient information of all pixels in a Largest Coding Unit (LCU) is first calculated by the gradient calculation unit. By default, the raw pixels of one LCU (4096 = 64 × 64) would be stored in an 8-bit RAM with a depth of 4096. To meet the clock constraint, the pixels of the 64 × 64 LCU are instead stored in a pixel storage unit comprising 16 RAMs. As shown in FIG. 3, storing the pixels in 16 RAMs, each 32 bits wide with a depth of 64, allows 64 pixels to be accessed in one clock cycle, which significantly reduces access time. The gradient calculation unit performs convolution on the first size prediction unit, the second size prediction unit and the third size prediction unit using a Sobel operator to obtain the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient and the third vertical gradient; this process is executed with 16-way parallel computation in a pipelined structure. The results of the convolution are stored in a 32-bit RAM of depth 496. Then, all gradient directions are calculated. For an 8 × 8 prediction unit, 16 RAMs in the mode storage unit are used to store the current optimal prediction modes of its left block (left) and top block (top). After each 8 × 8 prediction unit is traversed, the corresponding RAM in the mode storage unit is updated according to the flow shown in FIG. 4.
The current optimal prediction modes of the left and top blocks of the 16 × 16 prediction unit are stored using 8 RAMs in the mode storage unit, and the current optimal prediction modes of the left and top blocks of the 32 × 32 prediction unit are stored using 4 RAMs in the mode storage unit. The update schedules of the 16 × 16 prediction unit and the 32 × 32 prediction unit are similar to FIG. 4. With this circuit design, the throughput requirement of 1920x1080@60fps at 300 MHz is met.
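The figures in this example can be checked with a short calculation. The sketch below assumes that partial CTU rows at the bottom of the frame are counted as full 64 × 64 CTUs and that each pixel occupies 8 bits (suggested by the 8-bit RAM mentioned above); all variable names are illustrative only.

```python
import math

CLOCK_HZ = 300_000_000
WIDTH, HEIGHT, FPS = 1920, 1080, 60
CTU_SIZE = 64

# CTUs per frame, rounding partial rows/columns up to whole CTUs (assumption)
ctus_per_frame = math.ceil(WIDTH / CTU_SIZE) * math.ceil(HEIGHT / CTU_SIZE)
# Clock cycles available per CTU at the target frame rate
cycles_per_ctu = CLOCK_HZ // (ctus_per_frame * FPS)

# Pixel storage unit: 16 RAMs, 32 bits wide, depth 64; 8-bit pixels (assumption)
NUM_RAMS, RAM_WIDTH_BITS, RAM_DEPTH, PIXEL_BITS = 16, 32, 64, 8
pixels_per_cycle = NUM_RAMS * RAM_WIDTH_BITS // PIXEL_BITS
total_pixels = NUM_RAMS * RAM_WIDTH_BITS * RAM_DEPTH // PIXEL_BITS

# Mode storage unit: RAMs used for the 8x8, 16x16 and 32x32 prediction units
mode_rams = 16 + 8 + 4

print(ctus_per_frame, cycles_per_ctu, pixels_per_cycle, total_pixels, mode_rams)
```

Under these assumptions the per-CTU budget comes out slightly above 9800 cycles, the 16 RAMs deliver 64 pixels per cycle and hold exactly one 64 × 64 LCU, and the three groups of mode RAMs add up to the 28 RAMs of the mode storage unit.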
A serial architecture can be used in an algorithmic implementation without affecting coding efficiency, but data dependencies must first be eliminated to realize a real-time ASIC chip with a pipelined and parallel design. In intra mode selection, the reconstructed pixels and the optimal mode selection are the main causes of data dependency. As shown in FIG. 5, the rough mode decision of the current coding tree unit is based on pixels reconstructed using the Most Probable Mode (MPM), whereas the Most Probable Mode is based on the optimal mode obtained by the previous coding tree unit through a series of processes such as prediction, transform, quantization, inverse transform, inverse entropy coding, and mode decision. The serial rough mode decision is therefore unsuitable for hardware design and must be adjusted to achieve real-time coding performance. In the embodiment of the application, the reconstructed pixels are replaced by the original pixels, so that the influence on coding performance is reduced. In the J_HAD calculation for coarse mode decision, the values shown in Table 1 are used directly as the prediction mode bit number R_mode, thereby removing data dependencies.
TABLE 1

| Intra prediction mode | $R_{mode}$ |
| --- | --- |
| 0-1 | 1 |
| 2-3 | 2 |
| 4-7 | 3 |
| 8-15 | 4 |
| 16-31 | 5 |
| 32 | 6 |
As shown in Table 1, the present application uses 33 intra modes in total, numbered 0 to 32, and Table 1 gives the prediction mode bit number $R_{mode}$ corresponding to each intra mode.
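Table 1 can be expressed as a fixed lookup, which is what removes the dependency on neighbouring blocks. The sketch below is illustrative only; the names `R_MODE_TABLE`, `r_mode` and `r_mode_closed_form` are hypothetical. It also notes an observation that is not stated in the source: for modes 0 to 32 the table values coincide with the bit length of the mode index, with a minimum of 1.

```python
# Table 1 as (first_mode, last_mode, bits) rows.
R_MODE_TABLE = [
    (0, 1, 1),
    (2, 3, 2),
    (4, 7, 3),
    (8, 15, 4),
    (16, 31, 5),
    (32, 32, 6),
]

def r_mode(mode):
    """Look up the prediction mode bit number R_mode from Table 1."""
    for first, last, bits in R_MODE_TABLE:
        if first <= mode <= last:
            return bits
    raise ValueError(f"intra mode {mode} is outside 0..32")

def r_mode_closed_form(mode):
    """Equivalent closed form for modes 0..32 (an observation, not from the source)."""
    if not 0 <= mode <= 32:
        raise ValueError(f"intra mode {mode} is outside 0..32")
    return max(1, mode.bit_length())

# The two definitions agree on every mode in Table 1.
assert all(r_mode(m) == r_mode_closed_form(m) for m in range(33))
```

Because the lookup depends only on the mode index, it can be evaluated before any neighbouring block has finished its own mode decision.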
In the AVS2 standard, the 4 × 4 prediction unit must be traversed 256 times within one largest coding unit, which takes considerable time. In the embodiment of the present application, the 4 × 4 prediction unit is therefore given a dedicated circuit (the second parallel module), while the prediction units of the other sizes and the gradient calculation unit share another circuit (the first parallel module); the two circuits execute in parallel.
As shown in FIG. 6, in the second parallel module, the first mode decision unit first performs the coarse mode decision on all received fourth-size prediction units according to the $J_{HAD}$ formula and the prediction mode bit number, selects from the 33 intra-frame prediction modes the 9 modes with the smallest $J_{HAD}$, and sends them to the second mode decision unit.
$$J_{HAD} = SATD + \lambda \cdot R_{mode}$$
where SATD is the sum of absolute transformed differences, λ is the Lagrange multiplier, and $R_{mode}$ is the estimated number of bits for the prediction mode.
The second mode decision unit then performs the mode decision on the 9 intra-frame prediction modes sent by the first mode decision unit, determines the optimal intra-frame prediction mode corresponding to the fourth size prediction unit, and outputs it to the comparison output module.
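The coarse mode decision described above can be sketched in software as follows. This is a hedged illustration, not the patented circuit: the function names are hypothetical, the SATD uses an unnormalized 4 × 4 Hadamard transform (normalization conventions differ between encoders), and the per-mode SATD values are taken as inputs because the prediction process itself is not shown here.

```python
# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]

def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def satd4x4(orig, pred):
    """Sum of absolute transformed differences of one 4x4 block (no scaling)."""
    diff = [[orig[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    transformed = matmul4(matmul4(H4, diff), H4)
    return sum(abs(v) for row in transformed for v in row)

def r_mode(mode):
    """Prediction mode bit number; matches Table 1 for modes 0..32."""
    return max(1, mode.bit_length())

def coarse_mode_decision(satd_per_mode, lam, keep=9):
    """Return the `keep` modes with the smallest J_HAD = SATD + lam * R_mode.

    Ties are broken in favour of the lower mode index (an assumption).
    """
    costs = [(satd + lam * r_mode(mode), mode)
             for mode, satd in enumerate(satd_per_mode)]
    costs.sort()
    return [mode for _, mode in costs[:keep]]
```

With identical SATD for all 33 modes, the selection is driven purely by $R_{mode}$, so `coarse_mode_decision([100] * 33, 1)` keeps modes 0 through 8.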
Meanwhile, the gradient calculation unit in the first parallel module first uses the Sobel operator to calculate the horizontal gradient and the vertical gradient of each of the received first size prediction unit, second size prediction unit and third size prediction unit. The Sobel operator is a widely used, simple and efficient edge detection operator with a horizontal component and a vertical component. Fig. 7 (a) and fig. 7 (b) show the horizontal and vertical convolution kernels of the Sobel operator.
The gradients in the horizontal and vertical directions are calculated as shown below, where $P_{i,j}$ denotes the value of the pixel at position (i, j), and $Gx_{ij}$ and $Gy_{ij}$ denote the gradient magnitudes in the horizontal and vertical directions, respectively.
$$Gx_{ij} = P_{i+1,j-1} + 2 \times P_{i+1,j} + P_{i+1,j+1} - P_{i-1,j-1} - 2 \times P_{i-1,j} - P_{i-1,j+1}$$
$$Gy_{ij} = P_{i-1,j-1} + 2 \times P_{i,j-1} + P_{i+1,j-1} - P_{i-1,j+1} - 2 \times P_{i,j+1} - P_{i+1,j+1}$$
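The two gradient formulas translate directly into code. The sketch below follows the index convention of the formulas literally, with `p[i][j]` holding the pixel at position (i, j); the function name is hypothetical, and border handling is left out because the source does not describe it, so only interior positions are valid.

```python
def sobel_gradients(p, i, j):
    """Horizontal and vertical Sobel gradients at interior position (i, j).

    p[i][j] is the pixel at position (i, j), matching the formulas above.
    Assumption: the caller only passes positions with all 8 neighbours present.
    """
    gx = (p[i + 1][j - 1] + 2 * p[i + 1][j] + p[i + 1][j + 1]
          - p[i - 1][j - 1] - 2 * p[i - 1][j] - p[i - 1][j + 1])
    gy = (p[i - 1][j - 1] + 2 * p[i][j - 1] + p[i + 1][j - 1]
          - p[i - 1][j + 1] - 2 * p[i][j + 1] - p[i + 1][j + 1])
    return gx, gy

# A ramp that increases only along i has a purely horizontal gradient.
ramp_i = [[i for _ in range(3)] for i in range(3)]
# A ramp that increases only along j has a purely vertical gradient.
ramp_j = [[j for j in range(3)] for _ in range(3)]

print(sobel_gradients(ramp_i, 1, 1))  # (8, 0)
print(sobel_gradients(ramp_j, 1, 1))  # (0, -8)
```

The negative sign in the second result follows from the sign convention of the $Gy_{ij}$ formula, which subtracts the j+1 neighbours from the j-1 neighbours.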
As shown in fig. 8, the first PU calculation unit calculates a first gradient angle θ corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient corresponding to the first size prediction unit using the following formula.
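[Formula image BDA0002875986560000101 not reproduced: expression for the first gradient angle θ in terms of the first horizontal gradient and the first vertical gradient]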
Since the texture direction angle of the current prediction unit is perpendicular to the first gradient angle θ, the texture direction angle α may be calculated by the following equation.
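[Formula image BDA0002875986560000102 not reproduced: expression for the texture direction angle α in terms of the first gradient angle θ]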
The two intra prediction modes nearest to the texture direction angle α are added to a candidate list, followed by the optimal prediction modes of the left block and the top block of the first size prediction unit; these 4 modes form the rough mode decision candidate list of the first size prediction unit. The $J_{HAD}$ calculation of the coarse mode decision selects one optimal mode from this list as the current optimal prediction mode of the first size prediction unit. This mode, together with the DC mode, the Plane mode and the Bilinear mode, forms the mode decision candidate list of the first size prediction unit, which then enters the mode decision process. Finally, the optimal intra-frame prediction mode corresponding to the first size prediction unit is decided and sent to the comparison output module.
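The construction of the two candidate lists can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the two angular modes nearest to α are passed in because the angle-to-mode mapping is not given in the text, the cost function `j_had` is supplied by the caller, and the mode numbers for DC, Plane and Bilinear (0, 1, 2) are assumed from the usual AVS2 numbering rather than taken from the source.

```python
# Assumed AVS2 numbering of the non-angular modes (not stated in the source).
DC_MODE, PLANE_MODE, BILINEAR_MODE = 0, 1, 2

def rmd_candidates(angular_pair, left_best, top_best):
    """Rough mode decision list: 2 angular modes near alpha + left + top."""
    first, second = angular_pair
    return [first, second, left_best, top_best]

def md_candidates(angular_pair, left_best, top_best, j_had):
    """Mode decision list: best rough-decision mode + DC + Plane + Bilinear.

    j_had is a caller-supplied function mapping a mode index to its J_HAD cost.
    """
    rough = rmd_candidates(angular_pair, left_best, top_best)
    current_best = min(rough, key=j_had)
    return [current_best, DC_MODE, PLANE_MODE, BILINEAR_MODE]

# Usage with a made-up cost function that favours mode 12.
example = md_candidates((11, 13), left_best=24, top_best=12,
                        j_had=lambda mode: abs(mode - 12))
print(example)  # [12, 0, 1, 2]
```

Only 4 modes go through the $J_{HAD}$ calculation and 4 through the final mode decision, instead of all 33, which is where the time saving described in the text comes from.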
The calculation methods and processes of the second PU calculation unit and the third PU calculation unit are the same as those of the first PU calculation unit, and are not described herein again.
The comparison output module evaluates the four received optimal intra-frame prediction modes using $J_{HAD}$, and outputs the intra-frame prediction mode with the smallest result, together with the size of its prediction unit, as the size division and intra-frame prediction mode of the coding unit corresponding to the current data.
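The comparison step amounts to taking a minimum over four candidates. The sketch below is illustrative only; the `Candidate` type and `compare_output` function are hypothetical names, the cost values in the usage example are made up, and it assumes the four costs have already been made comparable across prediction unit sizes, which the text does not detail.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    pu_size: int   # prediction unit edge length, e.g. 32, 16, 8 or 4
    mode: int      # optimal intra-frame prediction mode for that size
    cost: float    # J_HAD result for that size

def compare_output(candidates):
    """Return (pu_size, mode) of the candidate with the smallest cost."""
    best = min(candidates, key=lambda c: c.cost)
    return best.pu_size, best.mode

# Usage with made-up costs for the four prediction unit sizes.
result = compare_output([
    Candidate(32, 12, 410.0),
    Candidate(16, 24, 395.5),
    Candidate(8, 0, 402.0),
    Candidate(4, 7, 399.0),
])
print(result)  # (16, 24)
```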
In a second aspect, according to an embodiment of the present application, a parallel intra mode selection method is provided, as shown in fig. 9, including:
S101, a storage module divides stored pixel data into four prediction units with different sizes, and the prediction units are input into a first parallel module and a second parallel module to perform parallel calculation;
S102, the second parallel module determines the optimal intra-frame prediction mode of the prediction unit with each size according to the rough mode decision, the prediction mode bit number and the mode decision and outputs the optimal intra-frame prediction mode to the comparison output module;
S103, the gradient calculation unit of the first parallel module calculates gradients of the prediction units with three different sizes and sends the gradients to the storage module;
S104, the storage module sends the gradient to a plurality of PU calculation units of the first parallel module;
S105, the PU calculation units of the first parallel module determine an optimal intra-frame prediction mode corresponding to prediction units of three sizes according to the gradient, and send the optimal intra-frame prediction mode to the comparison output module;
S106, the comparison output module compares the received multiple optimal intra-frame prediction modes and outputs an optimal intra-frame prediction mode.
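The overall flow of steps S101 to S106 can be mimicked in software with two concurrently running tasks. The sketch below is only a loose analogy to the hardware: threads stand in for the two parallel circuits, and the three callables passed to `select_intra_mode` are hypothetical placeholders for the modules described above.

```python
from concurrent.futures import ThreadPoolExecutor

def select_intra_mode(pixels, first_parallel_module, second_parallel_module,
                      compare_output):
    """Run both modules concurrently, then compare their candidates.

    first_parallel_module(pixels)  -> iterable of candidates (three PU sizes)
    second_parallel_module(pixels) -> iterable of candidates (fourth PU size)
    compare_output(candidates)     -> the final decision
    All three callables are placeholders supplied by the caller.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(first_parallel_module, pixels)
        second = pool.submit(second_parallel_module, pixels)
        candidates = list(first.result()) + list(second.result())
    return compare_output(candidates)

# Usage with stub modules returning made-up (pu_size, mode, cost) tuples.
decision = select_intra_mode(
    pixels=None,
    first_parallel_module=lambda p: [(32, 12, 5.0), (16, 3, 4.0), (8, 0, 6.0)],
    second_parallel_module=lambda p: [(4, 7, 4.5)],
    compare_output=lambda cands: min(cands, key=lambda c: c[2])[:2],
)
print(decision)  # (16, 3)
```

Neither task waits for the other until the comparison step, which mirrors the way the two circuits work independently and only meet at the comparison output module.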
In the embodiment of the application, two parallel modules compute simultaneously: the first parallel module calculates the gradients and the corresponding optimal intra-frame modes of prediction units of several sizes, and the second parallel module calculates the optimal intra-frame mode of the prediction unit of one size, which increases the calculation speed. Because the second parallel module calculates the optimal intra-frame mode from the predetermined prediction mode bit number, data dependency is reduced, the calculation speed is further increased, and real-time encoding is achieved. The texture directions of the prediction units of 3 sizes are determined with the Sobel operator, so that intra-frame prediction modes are selected according to texture characteristics; this reduces the number of intra-frame prediction modes that must be evaluated and shortens the intra-frame mode decision time. Replacing the reconstructed pixels with original pixels and directly using the predetermined prediction mode bit number removes the data dependencies in hardware; in addition, the internal parallelism of each of the two parallel modules is increased, so that the real-time requirement of the hardware is met.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A parallel intra mode selection system, comprising: a storage module, a first parallel module, a second parallel module and a comparison output module;
the storage module is used for storing pixel data and gradients, dividing the pixel data into four prediction units with different sizes, inputting the prediction units into the first parallel module and the second parallel module, and sending the gradients to the first parallel module;
the first parallel module comprising: a gradient calculation unit and a plurality of PU calculation units;
the gradient calculation unit is used for calculating gradients of the prediction units with three different sizes and sending the gradients to the storage module;
the PU calculation units are all used for determining an optimal intra-frame prediction mode corresponding to the prediction units with three sizes according to the gradient and sending the optimal intra-frame prediction mode to the comparison output module;
the second parallel module is used for determining the optimal intra-frame prediction mode of the prediction unit with each size according to the rough mode decision, the prediction mode bit number and the mode decision and outputting the optimal intra-frame prediction mode to the comparison output module;
the comparison output module is used for comparing the received multiple optimal intra-frame prediction modes and outputting one optimal intra-frame prediction mode.
2. The system of claim 1, wherein the storage module comprises: a pixel storage unit and a gradient storage unit;
the pixel storage unit is used for dividing the pixel data into four prediction units with different sizes, and the four prediction units with different sizes comprise: a first size prediction unit, a second size prediction unit, a third size prediction unit, and a fourth size prediction unit; sending the first size prediction unit, the second size prediction unit and the third size prediction unit to the gradient calculation unit, and sending the fourth size prediction unit to the second parallel module;
the gradient storage unit is configured to receive and store the gradient sent by the gradient calculation unit, and to send the gradient to the plurality of PU calculation units in the first parallel module.
3. The system of claim 2, wherein the second parallel module comprises: a first mode decision unit and a second mode decision unit;
the first mode decision unit is configured to determine multiple intra-frame prediction modes of a fourth size prediction unit according to the coarse mode decision and the prediction mode bit number, and send the multiple intra-frame prediction modes to the second mode decision unit;
and the second mode decision unit is used for determining an optimal intra-frame prediction mode corresponding to the fourth size prediction unit from a plurality of intra-frame prediction modes according to mode decision and outputting the optimal intra-frame prediction mode to the comparison output module.
4. The system according to claim 2, wherein the gradient calculation unit is specifically configured to calculate a first horizontal gradient and a first vertical gradient corresponding to the first size prediction unit, a second horizontal gradient and a second vertical gradient corresponding to the second size prediction unit, and a third horizontal gradient and a third vertical gradient corresponding to the third size prediction unit, and to send these gradients to the gradient storage unit.
5. The system of claim 4, wherein the plurality of PU calculation units comprises: a first PU calculation unit, a second PU calculation unit and a third PU calculation unit;
the first PU calculation unit is configured to calculate a first gradient angle corresponding to the first size prediction unit according to the first horizontal gradient and the first vertical gradient, determine two intra-frame prediction modes corresponding to the first size prediction unit according to the first gradient angle, determine an optimal intra-frame prediction mode corresponding to the first size prediction unit from the two intra-frame prediction modes corresponding to the first size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;
the second PU calculation unit is configured to calculate a second gradient angle corresponding to the second size prediction unit according to a second horizontal gradient and a second vertical gradient, determine two intra-frame prediction modes corresponding to the second size prediction unit according to the second gradient angle, determine an optimal intra-frame prediction mode corresponding to the second size prediction unit from the two intra-frame prediction modes corresponding to the second size prediction unit and multiple non-angle intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module;
the third PU calculation unit is configured to calculate a third gradient angle corresponding to the third size prediction unit according to the third horizontal gradient and the third vertical gradient, determine two intra-frame prediction modes corresponding to the third size prediction unit according to the third gradient angle, determine an optimal intra-frame prediction mode corresponding to the third size prediction unit from the two intra-frame prediction modes corresponding to the third size prediction unit and multiple non-angular intra-frame prediction modes, and send the optimal intra-frame prediction mode to the comparison output module.
6. The system of claim 2, wherein the pixel storage unit comprises: 16 RAMs, each with a width of 32 bits and a depth of 64.
7. The parallel intra mode selection system according to claim 2, wherein the gradient storage unit comprises: a 32-bit RAM with a depth of 496.
8. The system of claim 3, wherein the number of prediction mode bits comprises a plurality of bit values, the bit values corresponding to different intra prediction modes.
9. The system of claim 4, wherein the gradient computation unit computes the first horizontal gradient, the first vertical gradient, the second horizontal gradient, the second vertical gradient, the third horizontal gradient, and the third vertical gradient using a Sobel operator.
10. A method of parallel intra mode selection, comprising:
the storage module divides the stored pixel data into four prediction units with different sizes, and inputs the prediction units into the first parallel module and the second parallel module for parallel calculation;
the second parallel module determines the optimal intra-frame prediction mode of the prediction unit with each size according to the rough mode decision, the prediction mode bit number and the mode decision and outputs the optimal intra-frame prediction mode to the comparison output module;
the gradient calculation unit of the first parallel module calculates gradients of the prediction units with three different sizes and sends the gradients to the storage module;
a storage module sends the gradient to a plurality of PU calculation units of the first parallel module;
the PU calculation units of the first parallel module determine an optimal intra-frame prediction mode corresponding to the prediction units with three sizes according to the gradient and send the optimal intra-frame prediction mode to the comparison output module;
the comparison output module compares the received multiple optimal intra-frame prediction modes and outputs one optimal intra-frame prediction mode.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011629844.4A CN112804523B (en) 2020-12-30 2020-12-30 Parallel intra-frame mode selection system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011629844.4A CN112804523B (en) 2020-12-30 2020-12-30 Parallel intra-frame mode selection system and method

Publications (2)

Publication Number Publication Date
CN112804523A CN112804523A (en) 2021-05-14
CN112804523B true CN112804523B (en) 2023-01-03

Family

ID=75808122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011629844.4A Active CN112804523B (en) 2020-12-30 2020-12-30 Parallel intra-frame mode selection system and method

Country Status (1)

Country Link
CN (1) CN112804523B (en)

Citations (6)

* Cited by examiner, † Cited by third party

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN101252695A (en) * | 2008-04-15 | 2008-08-27 | 中国科学院计算技术研究所 | Video frequency encoder and method for choosing frame inner forecast mode |
| CN101938643A (en) * | 2009-07-03 | 2011-01-05 | 哈尔滨工业大学深圳研究生院 | Hardware parallel realization structure of video compression by intra-frame predictive 16*16 mode |
| CN102148994A (en) * | 2010-02-04 | 2011-08-10 | 成都市世嘉电子实业有限公司 | Parallel inter-frame prediction coding method |
| CN102340659A (en) * | 2010-07-23 | 2012-02-01 | 联合信源数字音视频技术(北京)有限公司 | Parallel mode decision device and method based on AVS (Audio Video Standard) |
| WO2012093854A2 (en) * | 2011-01-04 | 2012-07-12 | 에스케이텔레콤 주식회사 | Method and device for encoding and decoding by using parallel intra-prediction by a coding unit |
| CN110324631A (en) * | 2019-05-09 | 2019-10-11 | 湖南国科微电子股份有限公司 | A kind of image parallel processing method, device and electronic equipment |

Also Published As

Publication number Publication date
CN112804523A (en) 2021-05-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant