CN111047498A - GPU hardware copy buffer algorithm-oriented TLM microstructure - Google Patents
- Publication number
- CN111047498A (application CN201911125649.5A)
- Authority
- CN
- China
- Prior art keywords
- copy
- buffer
- tile
- module
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/60—Memory management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Image Input (AREA)
- Image Processing (AREA)
Abstract
The invention relates to the technical field of computer hardware modeling, and in particular to a TLM microstructure for a GPU hardware copy-buffer algorithm. The TLM microstructure comprises a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6. The invention implements the function and structure of the copy-buffer algorithm as a TLM model, enables functional verification of the GPU hardware copy-buffer algorithm, handles the cases in which the copy coordinates lie outside the buffer or the copy width is larger than the buffer, improves GPU hardware performance, reduces copy errors, and effectively accelerates RTL design and development.
Description
Technical Field
The invention relates to the technical field of computer hardware modeling, and in particular to a TLM microstructure for a GPU hardware copy-buffer algorithm.
Background
In the design and development of a graphics processor chip (hereinafter referred to as a GPU), the correctness and efficiency of an algorithm are important factors that determine the function and performance of the GPU. The OpenGL API supports copying pixels from a buffer, but does not define how the copied pixels should be handled when the copy coordinates are outside the buffer. When the copy coordinates are outside the buffer or the copy width is larger than the buffer, out-of-bounds reads, misaligned copies, or a large number of invalid copy operations can easily occur and degrade the hardware performance of the GPU; this is the technical problem to be solved. Debugging the details of the algorithm on GPU chip hardware is difficult, as are verification and debugging at the RTL stage. The algorithm therefore needs to be verified as early as possible, before RTL design, so that it can serve as a reference for the RTL design.
Disclosure of Invention
In view of the problems described in the background, the invention provides a TLM microstructure for a GPU hardware copy-buffer algorithm. It addresses the correctness and efficiency of the copy-buffer algorithm that the RTL implements, and it allows the hardware microstructure of the algorithm to be functionally verified on a TLM model in advance, in support of RTL development.
The technical solution of the invention is as follows:
A TLM microstructure for a GPU hardware copy-buffer algorithm comprises a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer lower boundary processing module 3 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer upper boundary processing module 5 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1 calculates the distance by which the copy exceeds the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
the buffer allocation module 2 dispatches tile rows in the negative y direction to the buffer lower boundary processing module 3, tile rows in the positive y direction to the height direction buffer processing module 4, and tile rows beyond the upper bound to the buffer upper boundary processing module 5;
the buffer lower boundary processing module 3 processes the tile rows of copy pixels in the negative y direction;
the height direction buffer processing module 4 processes the tile rows of copy pixels in the positive y direction;
the buffer upper boundary processing module 5 processes the tile rows of copy pixels that exceed the upper bound of the video memory;
the tile row pixel copy module 6 copies the pixels of each tile row;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x-direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile is a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4, a tile row is 4 consecutive pixel rows whose first row has a y coordinate that is an integer multiple of 4, and the lower-left corner (x, y) of the buffer is taken as the origin.
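The tile definition above can be illustrated with a small sketch. The helper names `tile_origin` and `tile_row_start` are hypothetical and do not appear in the patent; the sketch only assumes the stated 4x4 tile size and the lower-left origin.

```python
TILE = 4  # tile edge length in pixels, as defined in the text


def tile_origin(x, y):
    """Lower-left pixel of the tile that contains pixel (x, y)."""
    # Floor division keeps the result a multiple of 4 for negative coordinates too.
    return (x // TILE * TILE, y // TILE * TILE)


def tile_row_start(y):
    """y coordinate of the first pixel row of the tile row containing row y."""
    return y // TILE * TILE


print(tile_origin(5, 10))   # (4, 8)
print(tile_origin(-1, -5))  # (-4, -8)
```

Using floor division rather than truncation is a design choice of this sketch: it keeps tiles aligned to multiples of 4 below and to the left of the origin, which is where the out-of-buffer cases arise.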
Further, the copy parameter calculation module 1 receives the copy coordinates and the copy width and height;
calculates the distance by which the copy exceeds the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
and then sends these parameters to the buffer allocation module 2 through the TLM interface.
Further, the buffer allocation module 2 receives, from the copy parameter calculation module 1, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction and the numbers of positive and negative copy tiles in the y direction;
sends the negative copy distance in the y direction to the buffer lower boundary processing module 3 through the TLM interface;
sends the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction to the height direction buffer processing module 4 through the TLM interface;
sends the distance by which the copy exceeds the upper bound in the y direction to the buffer upper boundary processing module 5 through the TLM interface;
and sends the copy start coordinate in the x direction to the tile row pixel copy module 6 through the TLM interface.
Further, the buffer lower boundary processing module 3 receives the negative copy distance in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of those tile rows to 0,
and then sends the tile rows of copy pixels, together with their start and end positions, to the tile row pixel copy module 6 through the TLM interface.
Further, the height direction buffer processing module 4 receives the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction sent by the buffer allocation module 2,
calculates the tile coordinates in the buffer and reads the buffer pixels, then zero-fills the pixels that lie outside the buffer in the x direction, and finally calculates the start and end positions of each tile row,
and sends the tile rows of copy pixels, together with their start and end positions, to the tile row pixel copy module 6 through the TLM interface.
Further, the buffer upper boundary processing module 5 receives the distance by which the copy exceeds the upper bound in the y direction, sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of those tile rows to 0,
and then sends the tile rows of copy pixels, together with their start and end positions, to the tile row pixel copy module 6 through the TLM interface.
Further, the tile row pixel copy module 6 receives the tile rows of copy pixels and their start and end positions sent by the buffer lower boundary processing module 3 and the height direction buffer processing module 4, and the copy start coordinate in the x direction sent by the buffer allocation module 2,
calculates the start pixel of each tile row from the copy start coordinate in the x direction,
and then copies the pixels of the tile row.
Further, the read pixel submodule 41 receives the copy start tile coordinate sent by the buffer allocation module 2, calculates the coordinate of each tile to be read from that coordinate and the number of copy tiles in the x direction,
reads the buffer pixels according to the tile coordinates,
and sends the read buffer pixels to the x-direction copy pixel submodule 42.
Further, the x-direction copy pixel submodule 42 receives the negative copy distance in the x direction sent by the buffer allocation module 2 and the buffer pixels sent by the read pixel submodule 41,
reserves as many pixels as the negative copy distance in the x direction in front of each read pixel row and fills them with 0 to represent the pixels outside the buffer,
and then sends the processed rows of copy pixels to the tile row pixel copy module 6.
Further, the tile row position calculation submodule 43 receives the copy start coordinate in the y direction and the positive copy distance in the y direction sent by the buffer allocation module 2, calculates the start and end positions of each tile row,
and sends these start and end positions to the tile row pixel copy module 6.
The invention has the beneficial effects that:
The invention implements the function and structure of the copy-buffer algorithm as a TLM model, enables functional verification of the GPU hardware copy-buffer algorithm, handles the cases in which the copy coordinates lie outside the buffer or the copy width is larger than the buffer, improves GPU hardware performance, reduces copy errors, and effectively accelerates RTL design and development.
Drawings
FIG. 1 is a block diagram of the hardware TLM microstructure for the copy-buffer algorithm according to the invention.
Detailed Description
The technical solution of the invention is described clearly and completely below with reference to the accompanying drawings and specific embodiments. The described embodiments are only some, not all, of the embodiments of the invention; all other embodiments that a person skilled in the art can derive from them without inventive effort fall within the scope of protection of the invention.
The invention provides a TLM microstructure for a GPU hardware copy-buffer algorithm, comprising a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer lower boundary processing module 3 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer upper boundary processing module 5 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1 calculates the distance by which the copy exceeds the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
the buffer allocation module 2 dispatches tile rows in the negative y direction to the buffer lower boundary processing module 3, tile rows in the positive y direction to the height direction buffer processing module 4, and tile rows beyond the upper bound to the buffer upper boundary processing module 5;
the buffer lower boundary processing module 3 processes the tile rows of copy pixels in the negative y direction;
the height direction buffer processing module 4 processes the tile rows of copy pixels in the positive y direction;
the buffer upper boundary processing module 5 processes the tile rows of copy pixels that exceed the upper bound of the video memory;
the tile row pixel copy module 6 copies the pixels of each tile row;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x-direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile is a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4, a tile row is 4 consecutive pixel rows whose first row has a y coordinate that is an integer multiple of 4, and the lower-left corner (x, y) of the buffer is taken as the origin.
The copy parameter calculation module 1 receives the copy coordinates and the copy width and height;
calculates the distance by which the copy exceeds the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
and then sends these parameters to the buffer allocation module 2 through the TLM interface.
The buffer allocation module 2 receives, from the copy parameter calculation module 1, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction and the numbers of positive and negative copy tiles in the y direction;
sends the negative copy distance in the y direction to the buffer lower boundary processing module 3 through the TLM interface;
sends the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction to the height direction buffer processing module 4 through the TLM interface;
sends the distance by which the copy exceeds the upper bound in the y direction to the buffer upper boundary processing module 5 through the TLM interface;
and sends the copy start coordinate in the x direction to the tile row pixel copy module 6 through the TLM interface.
The buffer lower boundary processing module 3 receives the negative copy distance in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of those tile rows to 0,
and then sends the tile rows of copy pixels, together with their start and end positions, to the tile row pixel copy module 6 through the TLM interface.
The height direction buffer processing module 4 receives the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction sent by the buffer allocation module 2,
calculates the tile coordinates in the buffer and reads the buffer pixels, then zero-fills the pixels that lie outside the buffer in the x direction, and finally calculates the start and end positions of each tile row,
and sends the tile rows of copy pixels, together with their start and end positions, to the tile row pixel copy module 6 through the TLM interface.
The buffer upper boundary processing module 5 receives the distance by which the copy exceeds the upper bound in the y direction, sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of those tile rows to 0,
and then sends the tile rows of copy pixels, together with their start and end positions, to the tile row pixel copy module 6 through the TLM interface.
The tile row pixel copy module 6 receives the tile rows of copy pixels and their start and end positions sent by the buffer lower boundary processing module 3 and the height direction buffer processing module 4, and the copy start coordinate in the x direction sent by the buffer allocation module 2,
calculates the start pixel of each tile row from the copy start coordinate in the x direction,
and then copies the pixels of the tile row.
The read pixel submodule 41 receives the copy start tile coordinate and the number of copy tiles in the x direction sent by the buffer allocation module 2, and calculates from them the coordinate of each tile to be read,
reads the buffer pixels according to the tile coordinates,
and sends the read buffer pixels to the x-direction copy pixel submodule 42.
The x-direction copy pixel submodule 42 receives the negative copy distance in the x direction sent by the buffer allocation module 2 and the buffer pixels sent by the read pixel submodule 41,
reserves as many pixels as the negative copy distance in the x direction in front of each read pixel row and fills them with 0 to represent the pixels outside the buffer,
and then sends the processed rows of copy pixels to the tile row pixel copy module 6.
The tile row position calculation submodule 43 receives the copy start coordinate in the y direction and the positive copy distance in the y direction sent by the buffer allocation module 2, calculates the start and end positions of each tile row,
and sends these start and end positions to the tile row pixel copy module 6.
The GPU hardware copy-buffer algorithm based on this structure comprises the following steps:
1) Calculating the copy range parameters:
The positive and negative copy distances, the start coordinates and the copy start tile coordinate in the x and y directions are calculated from the input copy coordinates and the copy width and height; the numbers of positive-direction copy tiles in x and y are calculated from the copy start tile coordinate; and the number of negative-direction tiles in y is calculated from the negative-direction length in y and the total number of copy tiles in the y direction.
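One possible reading of this parameter calculation is sketched below. The patent gives no formulas, so every expression here is an assumption: the names `CopyParams`, `calc_copy_params` and `tiles_spanned` are hypothetical, the buffer height `buf_h` is assumed to be known to the module, and the negative tile-row count is computed directly from the negative range rather than from the total tile count as the text describes.

```python
from dataclasses import dataclass

TILE = 4  # tile edge length in pixels


@dataclass
class CopyParams:
    neg_x: int            # pixels left of the buffer (x < 0)
    pos_x: int            # remaining pixels in x (right edge not clamped here)
    neg_y: int            # pixel rows below the buffer (y < 0)
    pos_y: int            # pixel rows inside the buffer height
    over_y: int           # pixel rows above the upper bound
    start_x: int          # copy start coordinate in x
    start_y: int          # copy start coordinate in y
    start_tile: tuple     # lower-left pixel of the copy start tile
    tiles_x: int          # tiles spanned in x
    tile_rows_pos_y: int  # tile rows spanned inside the buffer
    tile_rows_neg_y: int  # tile rows spanned below the buffer


def tiles_spanned(start, length):
    """Number of 4-pixel tiles touched by the range [start, start + length)."""
    if length <= 0:
        return 0
    return (start + length - 1) // TILE - start // TILE + 1


def calc_copy_params(x, y, w, h, buf_h):
    """Split the copy rectangle (x, y, w, h) against a buffer of height buf_h."""
    neg_x = max(0, min(x + w, 0) - x)
    neg_y = max(0, min(y + h, 0) - y)
    pos_y = max(0, min(y + h, buf_h) - max(y, 0))
    over_y = max(0, y + h - max(y, buf_h))
    start_x, start_y = max(x, 0), max(y, 0)
    return CopyParams(
        neg_x=neg_x, pos_x=w - neg_x,
        neg_y=neg_y, pos_y=pos_y, over_y=over_y,
        start_x=start_x, start_y=start_y,
        start_tile=(start_x // TILE * TILE, start_y // TILE * TILE),
        tiles_x=tiles_spanned(start_x, w - neg_x),
        tile_rows_pos_y=tiles_spanned(start_y, pos_y),
        tile_rows_neg_y=tiles_spanned(y, neg_y),
    )
```

For example, `calc_copy_params(-3, -5, 10, 20, 8)` splits the 20 requested rows into 5 below the buffer, 8 inside it and 7 above it, and the 10 requested columns into 3 left of the buffer and 7 from the origin onward.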
2) Height direction buffer allocation:
The buffer is divided into a positive part and a negative part in the height direction; according to the numbers of positive and negative copy tiles in the y direction, the tile rows in the negative part are assigned to step 3) and the tile rows in the positive part are assigned to step 4).
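The dispatch in this step can be pictured as building a work list. This is only a sketch: the function name `allocate_tile_rows`, the stage labels, and the bottom-to-top ordering are assumptions, and the third group (tile rows beyond the upper bound, handled in step 5) is included as an optional argument because the module description routes it through the same allocation module.

```python
def allocate_tile_rows(tile_rows_neg_y, tile_rows_pos_y, tile_rows_over_y=0):
    """Return a bottom-to-top work list of (stage, tile_row_index) pairs."""
    work = [("lower_boundary", i) for i in range(tile_rows_neg_y)]      # step 3
    work += [("height_direction", i) for i in range(tile_rows_pos_y)]   # step 4
    work += [("upper_boundary", i) for i in range(tile_rows_over_y)]    # step 5
    return work


for stage, index in allocate_tile_rows(1, 2, 1):
    print(stage, index)
```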
3) Processing tile rows of copy pixels in the negative y direction:
The number of negative copy tiles in the y direction is the number of tile rows to be copied that lie outside the buffer. These tiles do not go through a copy process and are set directly to 0. For the first tile row, the start position is calculated from the copy height in the negative direction; for all other tile rows the start position is 0. The end position of every tile row is 4.
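A minimal sketch of this step follows, assuming that the negative region ends exactly at y = 0 (which is what makes every end position 4) and that start and end are pixel-row offsets within a tile row, with the end exclusive. The function name `lower_boundary_rows` and the `width` parameter are hypothetical.

```python
TILE = 4  # pixel rows per tile row


def lower_boundary_rows(neg_y, width):
    """Return (pixels, start, end) for each tile row below the buffer.

    neg_y is the number of pixel rows below y = 0; pixels is a 4-row block of zeros.
    """
    count = -(-neg_y // TILE)  # ceiling division
    rows = []
    for i in range(count):
        # Only the first (lowest) tile row can be partially covered.
        start = (-neg_y) % TILE if i == 0 else 0
        pixels = [[0] * width for _ in range(TILE)]
        rows.append((pixels, start, TILE))
    return rows


print([(s, e) for _, s, e in lower_boundary_rows(5, 6)])  # [(3, 4), (0, 4)]
```

With 5 rows below the buffer, the lowest tile row contributes one valid pixel row (offset 3) and the next contributes four, which adds up to the 5 requested rows.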
4) Processing tile rows of copy pixels in the positive y direction:
4.1) Reading buffer pixels:
The coordinate of each tile to be read is calculated from the copy start tile coordinate and the number of copy tiles in the x direction, and the buffer pixels are then read according to the tile coordinates.
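The tile read can be sketched as follows, assuming the buffer is a list of pixel rows indexed as `buf[y][x]` with the origin at the lower-left corner, and that all requested tiles lie inside the buffer. The names `tile_coords` and `read_tile_row` are hypothetical.

```python
TILE = 4  # tile edge length in pixels


def tile_coords(start_tile, tiles_x):
    """Lower-left coordinates of tiles_x consecutive tiles, left to right."""
    tx, ty = start_tile
    return [(tx + i * TILE, ty) for i in range(tiles_x)]


def read_tile_row(buf, start_tile, tiles_x):
    """Read one tile row as 4 pixel rows that span all requested tiles."""
    rows = [[] for _ in range(TILE)]
    for tx, ty in tile_coords(start_tile, tiles_x):
        for dy in range(TILE):
            # Note: Python slicing silently truncates past the buffer width.
            rows[dy].extend(buf[ty + dy][tx:tx + TILE])
    return rows


buf = [[10 * y + x for x in range(8)] for y in range(4)]
print(read_tile_row(buf, (4, 0), 1)[0])  # [4, 5, 6, 7]
```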
4.2) Processing copy pixels in the x direction:
As many pixels as the negative copy distance in the x direction are reserved in front of each read pixel row, and these pixels are all filled with 0 to represent the pixels outside the buffer.
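The zero-fill in front of a pixel row amounts to a simple prepend. The function name `pad_negative_x` is hypothetical, and the sketch assumes a pixel row is a flat list of pixel values.

```python
def pad_negative_x(pixel_row, neg_x):
    """Prepend neg_x zero pixels that stand for positions left of the buffer."""
    return [0] * neg_x + list(pixel_row)


print(pad_negative_x([7, 8, 9], 2))  # [0, 0, 7, 8, 9]
```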
4.3) Calculating tile row start and end positions:
For the first and last tile rows, the start and end positions are calculated from the copy start coordinate in the y direction and the positive copy distance in the y direction. For all other tile rows, the start position is 0 and the end position is 4.
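A sketch of this calculation is shown below, assuming start and end are pixel-row offsets within a tile row and that the end is exclusive, so a fully covered tile row is (0, 4). The function name `tile_row_spans` is hypothetical.

```python
TILE = 4  # pixel rows per tile row


def tile_row_spans(start_y, pos_y):
    """Return (start, end) offsets for each tile row covering [start_y, start_y + pos_y)."""
    if pos_y <= 0:
        return []
    first = start_y // TILE
    last = (start_y + pos_y - 1) // TILE
    spans = []
    for t in range(first, last + 1):
        start = start_y % TILE if t == first else 0
        end = (start_y + pos_y - 1) % TILE + 1 if t == last else TILE
        spans.append((start, end))
    return spans


print(tile_row_spans(2, 9))  # [(2, 4), (0, 4), (0, 3)]
```

When the copy fits inside one tile row, that row is both first and last, so both offsets are computed, for example `tile_row_spans(5, 2)` gives `[(1, 3)]`.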
5) Processing tile rows of copy pixels beyond the upper bound in y:
These tiles likewise do not go through a copy process and are set directly to 0, and the start positions of their tile rows are all 0. For the last tile row, the end position is calculated from the distance by which the copy exceeds the upper bound; the end positions of all other tile rows are 4.
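The mirror image of the lower-boundary case can be sketched as follows. It assumes the upper bound is tile-aligned (a multiple of 4), which is what makes every start position 0, and it uses an exclusive end offset. The function name `upper_boundary_spans` is hypothetical.

```python
TILE = 4  # pixel rows per tile row


def upper_boundary_spans(over_y):
    """Return (start, end) offsets for each zero-filled tile row above the upper bound."""
    count = -(-over_y // TILE)  # ceiling division
    spans = []
    for i in range(count):
        # Only the last (topmost) tile row can be partially covered.
        end = (over_y - 1) % TILE + 1 if i == count - 1 else TILE
        spans.append((0, end))
    return spans


print(upper_boundary_spans(7))  # [(0, 4), (0, 3)]
```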
6) Tile row pixel copy:
First, the start pixel of the tile row is calculated from the copy coordinate in the x direction, and then the pixels of the tile row are copied.
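One way to read this final step is sketched below. It assumes the incoming tile row is 4 pixel rows that begin at a tile-aligned x position, that the start pixel is the offset of the copy start coordinate within its tile, and that `start` and `end` select the valid pixel rows with an exclusive end. The function name `copy_tile_row` and the `width` parameter are hypothetical. When the row was zero-padded for a negative x distance, the copy start coordinate is 0 and the offset is therefore 0 as well.

```python
TILE = 4  # tile edge length in pixels


def copy_tile_row(pixels, start, end, start_x, width):
    """Copy `width` pixels from each valid pixel row of one tile row."""
    first = start_x % TILE  # start pixel inside the tile-aligned row
    # Slicing truncates if the row holds fewer than first + width pixels.
    return [row[first:first + width] for row in pixels[start:end]]


pixels = [[10 * r + c for c in range(8)] for r in range(4)]
print(copy_tile_row(pixels, 1, 3, 6, 2))  # [[12, 13], [22, 23]]
```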
Example:
The invention is described in further detail below with reference to FIG. 1.
The invention provides a TLM microstructure for a GPU hardware copy-buffer algorithm, comprising a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer lower boundary processing module 3 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer upper boundary processing module 5 and the tile row pixel copy module 6 are connected in sequence;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x-direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile is a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4, a tile row is 4 consecutive pixel rows whose first row has a y coordinate that is an integer multiple of 4, and the lower-left corner (x, y) of the buffer is taken as the origin.
The GPU hardware copy-buffer algorithm based on this structure comprises the following steps:
Step 1: calculating the copy range parameters. The positive and negative copy distances, the start coordinates and the copy start tile coordinate in the x and y directions are calculated from the input copy coordinates and the copy width and height; the numbers of positive-direction copy tiles in x and y are calculated from the copy start tile coordinate; and the number of negative-direction tiles in y is calculated from the negative-direction length in y and the total number of copy tiles in the y direction.
Step 2: height direction buffer allocation. The buffer is divided into a positive part and a negative part in the height direction; according to the numbers of positive and negative copy tiles in the y direction, the tile rows in the negative part are assigned to step 3 and the tile rows in the positive part are assigned to step 4.
Step 3: processing tile rows of copy pixels in the negative y direction. The number of negative copy tiles in the y direction is the number of tile rows to be copied that lie outside the buffer. These tiles do not go through a copy process and are set directly to 0. For the first tile row, the start position is calculated from the copy height in the negative direction; for all other tile rows the start position is 0. The end position of every tile row is 4.
Step 4: processing tile rows of copy pixels in the positive y direction. First, the buffer pixels are read: the coordinate of each tile to be read is calculated from the copy start tile coordinate and the number of copy tiles in the positive x direction, and the buffer pixels are read according to the tile coordinates. Next, the copy pixels are processed in the x direction: as many pixels as the negative copy distance in the x direction are reserved in front of each read pixel row and are all filled with 0 to represent the pixels outside the buffer. Finally, the tile row start and end positions are calculated: for the first and last tile rows they are calculated from the copy start coordinate in the y direction and the positive copy distance in the y direction; for all other tile rows the start position is 0 and the end position is 4.
Step 5: processing tile rows of copy pixels beyond the upper bound in y. These tiles do not go through a copy process and are set directly to 0, and the start positions of their tile rows are all 0. For the last tile row, the end position is calculated from the distance by which the copy exceeds the upper bound; the end positions of all other tile rows are 4.
Step 6: tile row pixel copy. First, the start pixel of the tile row is calculated from the copy coordinate in the x direction, and then the pixels of the tile row are copied.
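For checking a tile-based model of steps 1 to 6, a brute-force per-pixel reference can be useful. The sketch below is not part of the patent: the function name `reference_copy` is hypothetical, and it assumes that every pixel outside the buffer reads as 0, including pixels to the right of the buffer, which the text does not describe explicitly.

```python
def reference_copy(buf, x, y, w, h):
    """Copy a w x h rectangle at (x, y) from buf[y][x]; outside pixels read as 0."""
    buf_h = len(buf)
    buf_w = len(buf[0]) if buf else 0
    out = []
    for j in range(h):
        row = []
        for i in range(w):
            sx, sy = x + i, y + j
            inside = 0 <= sx < buf_w and 0 <= sy < buf_h
            row.append(buf[sy][sx] if inside else 0)
        out.append(row)
    return out


buf = [[1, 2], [3, 4]]
for row in reference_copy(buf, -1, -1, 4, 4):
    print(row)
```

Comparing the output of a tile-row pipeline against such a reference for copy rectangles that cross each buffer edge is one way to exercise the boundary cases the steps are designed for.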
Claims (10)
1. A TLM microstructure for a GPU hardware copy-buffer algorithm, characterized by comprising a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer lower boundary processing module 3 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer upper boundary processing module 5 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1 calculates the distance by which the copy exceeds the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
the buffer allocation module 2 dispatches tile rows in the negative y direction to the buffer lower boundary processing module 3, tile rows in the positive y direction to the height direction buffer processing module 4, and tile rows beyond the upper bound to the buffer upper boundary processing module 5;
the buffer lower boundary processing module 3 processes the tile rows of copy pixels in the negative y direction;
the height direction buffer processing module 4 processes the tile rows of copy pixels in the positive y direction;
the buffer upper boundary processing module 5 processes the tile rows of copy pixels that exceed the upper bound of the video memory;
the tile row pixel copy module 6 copies the pixels of each tile row;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x-direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile is a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4,
a tile row is 4 consecutive pixel rows whose first row has a y coordinate that is an integer multiple of 4,
and the lower-left corner (x, y) of the buffer is taken as the origin.
2. The TLM microstructure for a GPU hardware copy-buffer algorithm as claimed in claim 1, wherein:
the copy parameter calculation module 1 receives the copy coordinates and the copy width and height;
calculates the distance by which the copy exceeds the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
and then sends these parameters to the buffer allocation module 2 through the TLM interface.
3. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the buffer allocation module 2 receives the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction and the numbers of positive and negative copy tiles in the y direction sent by the copy parameter calculation module 1,
sends the negative copy distance in the y direction to the buffer lower boundary processing module 3 through the TLM interface;
sends the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction to the height direction buffer processing module 4 through the TLM interface;
sends the distance by which the copy exceeds the upper bound in the y direction to the buffer upper boundary processing module 5 through the TLM interface;
and sends the copy start coordinate in the x direction to the tile row pixel copy module 6 through the TLM interface.
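The routing described in this claim can be summarized as a small lookup table. The Python sketch below is only a software illustration: the module labels and parameter keys are hypothetical names chosen here, and the real design passes these values over TLM interfaces rather than dictionaries.

```python
# Which parameters each downstream module receives, following the claim text.
# Keys and module labels are hypothetical.
ROUTES = {
    "lower_boundary_3": ["neg_y"],
    "height_direction_4": ["start_y", "pos_y", "start_tile", "tiles_x", "neg_x"],
    "upper_boundary_5": ["over_y"],
    "tile_row_copy_6": ["start_x"],
}


def dispatch(params):
    """Split one parameter record into the payload for each downstream module."""
    return {
        module: {key: params[key] for key in keys}
        for module, keys in ROUTES.items()
    }
```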
4. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the buffer lower boundary processing module 3 receives the negative copy distance in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of the tile row in the y direction, sets all pixels of the tile row to 0,
and then sends the tile row of copy pixels and the start and end positions of the tile row to the tile row pixel copy module 6 through the TLM interface.
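A minimal Python sketch of this zero-fill step follows. It assumes the region below the buffer covers pixel rows from -neg_y up to -1 and that the row width is passed in by the caller. The function name, the `width` parameter and the interpretation of "start and end positions" as inclusive pixel-row coordinates are assumptions made for illustration.

```python
TILE = 4  # tile row height in pixel rows


def lower_boundary_rows(neg_y, width):
    """Return (y_start, y_end, pixels) for each tile row below the buffer.

    y_start and y_end are inclusive pixel-row coordinates; pixels are all 0.
    """
    rows = []
    y = -neg_y
    while y < 0:
        row_base = (y // TILE) * TILE           # first pixel row of this tile row
        y_end = min(row_base + TILE, 0) - 1     # last row, clipped at the buffer edge
        zeros = [[0] * width for _ in range(y_end - y + 1)]
        rows.append((y, y_end, zeros))
        y = y_end + 1
    return rows
```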
5. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the height direction buffer processing module 4 receives the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction sent by the buffer allocation module 2,
reads the buffer pixels by calculating the tile coordinates of the buffer, then zero-pads the pixels that lie outside the buffer in the x direction, and finally calculates the start and end positions of the tile row,
and sends the tile row of copy pixels and the start and end positions of the tile row to the tile row pixel copy module 6 through the TLM interface.
6. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the buffer upper boundary processing module 5 receives the distance by which the copy exceeds the upper bound in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of the tile row in the y direction, sets all pixels of the tile row to 0,
and then sends the tile row of copy pixels and the start and end positions of the tile row to the tile row pixel copy module 6 through the TLM interface.
7. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the tile row pixel copy module 6 receives the tile rows of copy pixels and the start and end positions of the tile rows sent by the buffer lower boundary processing module 3 and the height direction buffer processing module 4, and the copy start coordinate in the x direction sent by the buffer allocation module 2,
calculates the start pixel of the tile row from the copy start coordinate in the x direction,
and then performs the copy operation on the tile row pixels.
8. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the read pixel submodule 41 receives the copy start tile coordinate and the number of copy tiles in the x direction sent by the buffer allocation module 2, and calculates the coordinates of each tile to be read,
reads the buffer pixels according to the tile coordinates,
and sends the read buffer pixels to the x-direction copy pixel submodule 42.
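The tile enumeration and read can be sketched in Python as follows. The sketch assumes the buffer is a list of pixel rows indexed as `buffer[y][x]`, that tile coordinates count tiles rather than pixels, and that the buffer dimensions are multiples of 4. The function names are hypothetical.

```python
TILE = 4  # tile edge length in pixels


def tile_coords(start_tile, tiles_x):
    """Tile coordinates of the tiles to read, walking in +x from start_tile."""
    tx, ty = start_tile
    return [(tx + i, ty) for i in range(tiles_x)]


def read_tile(buffer, tile):
    """Return the 4 rows of 4 pixels for the tile at the given tile coordinates."""
    tx, ty = tile
    x0, y0 = tx * TILE, ty * TILE
    return [buffer[y][x0:x0 + TILE] for y in range(y0, y0 + TILE)]
```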
9. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the x-direction copy pixel submodule 42 receives the negative copy distance in the x direction sent by the buffer allocation module 2 and the buffer pixels sent by the read pixel submodule 41,
reserves a span equal to the negative copy distance in the x direction in front of each read pixel row and fills it entirely with 0 to represent the pixels outside the buffer,
and then sends the processed copy pixel rows to the tile row pixel copy module 6.
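The zero-padding in the x direction amounts to prepending zeros to each pixel row. A minimal Python sketch, with a hypothetical function name and rows represented as plain lists:

```python
def pad_rows_left(rows, neg_x):
    """Prepend neg_x zero pixels to every row to stand in for pixels outside the buffer."""
    return [[0] * neg_x + list(row) for row in rows]
```

With a negative copy distance of 2, the row `[1, 2]` becomes `[0, 0, 1, 2]`.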
10. The GPU hardware copy buffer algorithm-oriented TLM microstructure as claimed in claim 1, wherein:
the tile row position calculation submodule 43 receives the copy start coordinate in the y direction and the positive copy distance in the y direction sent by the buffer allocation module 2, and calculates the start and end positions of each tile row,
and sends the start and end positions of the tile rows to the tile row pixel copy module 6.
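One way to read "start and end positions of each tile row" is as the first and last pixel-row offsets (0 to 3) that the copy touches within each tile row. The Python sketch below follows that reading. The interpretation and the function name are assumptions, since the claim does not define the positions precisely.

```python
TILE = 4  # tile row height in pixel rows


def tile_row_spans(start_y, pos_y):
    """Return (tile_row_index, first_offset, last_offset) for each tile row touched.

    Offsets are inclusive and count pixel rows from the bottom of the tile row.
    """
    spans = []
    y, stop = start_y, start_y + pos_y
    while y < stop:
        row_base = (y // TILE) * TILE
        y_end = min(row_base + TILE, stop) - 1   # last pixel row copied in this tile row
        spans.append((row_base // TILE, y - row_base, y_end - row_base))
        y = y_end + 1
    return spans
```

A copy that starts at y = 2 and covers 7 pixel rows touches three tile rows: offsets 2 to 3 of tile row 0, all of tile row 1, and only offset 0 of tile row 2.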
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911125649.5A CN111047498B (en) | 2019-11-18 | 2019-11-18 | GPU hardware copy buffer algorithm-oriented TLM microstructure |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111047498A (en) | 2020-04-21 |
CN111047498B (en) | 2022-12-06 |
Family
ID=70232086
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911125649.5A Active CN111047498B (en) | 2019-11-18 | 2019-11-18 | GPU hardware copy buffer algorithm-oriented TLM microstructure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111047498B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629240A (en) * | 2012-02-13 | 2012-08-08 | 上海创远仪器技术股份有限公司 | Method and device for serial communication |
CN105045726A (en) * | 2015-08-10 | 2015-11-11 | Tcl集团股份有限公司 | Picture operation method based on parallel computation and picture operation system based on parallel computation |
CN109657328A (en) * | 2018-12-12 | 2019-04-19 | 中国航空工业集团公司西安航空计算技术研究所 | A kind of TLM micro-structure towards GPU hardware linear light gated Boundary algorithm |
CN109697743A (en) * | 2018-12-12 | 2019-04-30 | 中国航空工业集团公司西安航空计算技术研究所 | A kind of TLM micro-structure towards GPU hardware LineStipple algorithm |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112836872A (en) * | 2021-01-29 | 2021-05-25 | 西安理工大学 | Multi-GPU-based pollutant convection diffusion equation high-performance numerical solution method |
CN112836872B (en) * | 2021-01-29 | 2023-08-18 | 西安理工大学 | Multi-GPU-based high-performance numerical solution method for pollutant convection diffusion equation |
Also Published As
Publication number | Publication date |
---|---|
CN111047498B (en) | 2022-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109615685B (en) | UML-oriented GPU texture mapping-based texture execution device and method for hardware view model | |
WO2023087893A1 (en) | Object processing method and apparatus, computer device, storage medium and program product | |
CN111047498B (en) | GPU hardware copy buffer algorithm-oriented TLM microstructure | |
CN104715092B (en) | A kind of quick method for setting up Label and figure annexation in level layout verification | |
CN113936081A (en) | Blocking primitives in a graphics processing system | |
CN110941934A (en) | FPGA prototype verification development board segmentation simulation system, method, medium and terminal | |
CN109978964A (en) | A kind of image formation method, device, storage medium and terminal device | |
CN106251291A (en) | Utilize OpenGL with OpenCL to cooperate and realize the method and system of image scaling | |
CN111047504A (en) | TLM microstructure for GPU sub-image processing based on SystemC | |
CN111091487A (en) | TLM microstructure for GPU hardware line element rasterization scanning algorithm | |
CN105068984A (en) | Automatic puzzling and typesetting method | |
CN103065306B (en) | The disposal route of graph data and device | |
CN109871172B (en) | Mouse clicking method and device in automatic test and readable storage medium | |
CN109741433B (en) | Triangle multidirectional parallel scanning method and structure based on Tile | |
CN103310409A (en) | Quick triangle partitioning method of Tile-based rendering architecture central processing unit (CPU) | |
CN114663564A (en) | Browser WebGL large scene rendering method, device, equipment and medium | |
CN103164546A (en) | Generation method of schematic circuit diagram connecting line | |
CN107256281B (en) | FPGA (field programmable Gate array) reconfigurable resource non-rectangular layout method based on cutting method | |
CN111080508B (en) | GPU sub-image processing method based on DMA | |
CN111028130B (en) | TLM microstructure facing GPU hardware texel value taking method | |
CN111402370A (en) | Method and device for detecting floating object | |
CN111008515A (en) | TLM microstructure for GPU hardware sub-texture replacement storage algorithm | |
CN111028131B (en) | TLM microstructure for generating Mipmap multiple detail layer texture algorithm by GPU hardware | |
CN111598992B (en) | Partition removing and rendering method and system based on Unity3D body and surface model | |
CN110941939A (en) | TLM microstructure for GPU hardware pixel replication algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||