CN111047498A - GPU hardware copy buffer algorithm-oriented TLM microstructure - Google Patents

Publication number
CN111047498A
Authority
CN
China
Prior art keywords
copy
buffer
tile
module
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911125649.5A
Other languages
Chinese (zh)
Other versions
CN111047498B (en
Inventor
陈佳
姜丽云
张少锋
吴晓成
任向隆
赵彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Aeronautics Computing Technique Research Institute of AVIC
Priority to CN201911125649.5A
Publication of CN111047498A
Application granted
Publication of CN111047498B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Input (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to the technical field of computer hardware modeling, and in particular to a TLM microstructure for a GPU hardware copy buffer algorithm. The TLM microstructure comprises a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6. The invention implements the function and structure of the copy buffer algorithm as a TLM model, enables functional verification of the GPU hardware copy buffer algorithm, handles cases in which the copy coordinates lie outside the buffer or the copy width is larger than the buffer, improves GPU hardware performance, reduces copy errors, and effectively accelerates RTL design and development.

Description

GPU hardware copy buffer algorithm-oriented TLM microstructure
Technical Field
The invention relates to the technical field of computer hardware modeling, in particular to a TLM microstructure facing GPU hardware copy buffer algorithm.
Background
In the design and development of a graphics processor chip (hereinafter referred to as GPU), the correctness and efficiency of an algorithm are important factors determining the function and performance of the GPU. The OpenGL API supports copying pixels from a buffer, but does not define how the copied pixels should be processed when the copy coordinates are outside the buffer. When the copy coordinates lie outside the buffer or the copy width is larger than the buffer, out-of-bounds reads, misaligned copies, or a large number of invalid copy operations can easily occur and reduce the hardware performance of the GPU; this is the technical problem to be solved. Debugging the details of the algorithm on GPU chip hardware is difficult, as are verification and debugging at the RTL stage. Therefore, the algorithm needs to be verified as early as possible, before RTL design, to provide a reference for the RTL design.
Disclosure of Invention
To address the problems described in the background, the invention provides a TLM microstructure for the GPU hardware copy buffer algorithm. It addresses the correctness and efficiency of the copy buffer algorithm ahead of RTL simulation, and supports RTL development by allowing the hardware microstructure of the copy buffer algorithm to be functionally verified on the TLM model in advance.
The technical solution of the invention is as follows:
A TLM microstructure for a GPU hardware copy buffer algorithm comprises a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer lower boundary processing module 3 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer upper boundary processing module 5 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1 calculates the distance exceeding the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinates, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
the buffer allocation module 2 assigns tile rows in the negative y direction to the buffer lower boundary processing module 3, tile rows in the positive y direction to the height direction buffer processing module 4, and tile rows beyond the upper bound to the buffer upper boundary processing module 5;
the buffer lower boundary processing module 3 processes the copy-pixel tile rows in the negative y direction;
the height direction buffer processing module 4 processes the copy-pixel tile rows in the positive y direction;
the buffer upper boundary processing module 5 processes the copy-pixel tile rows beyond the upper bound of the video memory;
the tile row pixel copy module 6 copies the pixels of a tile row;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x-direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile is a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4, a tile row is a group of 4 pixel rows whose first pixel row has a y coordinate that is an integer multiple of 4, and the lower-left corner (x, y) of the buffer is taken as the origin.
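The tile and tile row definitions can be illustrated with a minimal Python sketch. The helper names `tile_origin` and `tile_row_start` are hypothetical and do not appear in the patent; the sketch only assumes the 4x4 tile size and the lower-left origin stated in the text.

```python
TILE = 4  # tile edge length in pixels, as stated in the text


def tile_origin(x: int, y: int) -> tuple[int, int]:
    """Return the lower-left pixel of the 4x4 tile that contains (x, y)."""
    # Floor division rounds toward negative infinity, so coordinates
    # left of or below the buffer origin still map to multiples of 4.
    return (x // TILE) * TILE, (y // TILE) * TILE


def tile_row_start(y: int) -> int:
    """Return the y coordinate of the first pixel row of the tile row containing y."""
    return (y // TILE) * TILE
```

For instance, the pixel (5, 10) falls in the tile whose lower-left pixel is (4, 8), and a pixel at x = -1 falls in a tile starting at x = -4.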
Further, the copy parameter calculation module 1 receives the copy coordinates and the copy width and height;
calculates the distance exceeding the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinates, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
and then sends these parameters to the buffer allocation module 2 through the TLM interface.
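One possible reading of this parameter calculation is sketched below in Python. The patent does not give formulas, so every formula here is an assumption, as are the names `CopyParams`, `calc_copy_params` and `tiles_covering` and the `buf_h` argument (buffer height in pixels). The sketch treats the "negative" distance as the part of the copy region below or left of the origin and the "positive" distance as the remainder that does not exceed the upper bound.

```python
from dataclasses import dataclass

TILE = 4  # tile edge length in pixels


@dataclass
class CopyParams:
    y_over: int          # rows at or above the buffer's upper bound
    x_neg: int           # columns left of the origin
    x_pos: int           # remaining columns
    y_neg: int           # rows below the origin
    y_pos: int           # rows inside the buffer's height range
    start_x: int         # copy start coordinate in x (clamped to 0)
    start_y: int         # copy start coordinate in y (clamped to 0)
    start_tile: tuple    # lower-left pixel of the copy start tile
    x_tiles: int         # tiles covered by the positive x range
    y_pos_tiles: int     # tile rows covered by the positive y range
    y_neg_tiles: int     # tile rows covered by the negative y range


def tiles_covering(start: int, length: int) -> int:
    """Number of 4-pixel tiles touched by the range [start, start + length)."""
    if length <= 0:
        return 0
    return (start + length - 1) // TILE - start // TILE + 1


def calc_copy_params(x: int, y: int, w: int, h: int, buf_h: int) -> CopyParams:
    # Assumed formulas; the patent only names the quantities.
    x_neg = max(0, min(w, -x))
    y_neg = max(0, min(h, -y))
    y_over = max(0, min(h, y + h - buf_h))
    x_pos = w - x_neg
    y_pos = h - y_neg - y_over
    start_x, start_y = max(x, 0), max(y, 0)
    start_tile = ((start_x // TILE) * TILE, (start_y // TILE) * TILE)
    return CopyParams(
        y_over, x_neg, x_pos, y_neg, y_pos, start_x, start_y, start_tile,
        tiles_covering(start_x, x_pos),
        tiles_covering(start_y, y_pos),
        tiles_covering(-y_neg, y_neg),
    )
```

With a copy origin of (-3, -5), a 10x20 copy region and an 8-row buffer, this reading yields 5 rows below the buffer, 8 rows inside it and 7 rows above it.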
Further, the buffer allocation module 2 receives the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinates, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction sent by the copy parameter calculation module 1;
sends the negative copy distance in the y direction to the buffer lower boundary processing module 3 through the TLM interface;
sends the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinates, the number of copy tiles in the x direction, and the negative copy distance in the x direction to the height direction buffer processing module 4 through the TLM interface;
sends the distance exceeding the upper bound in the y direction to the buffer upper boundary processing module 5 through the TLM interface;
and sends the copy start coordinate in the x direction to the tile row pixel copy module 6 through the TLM interface.
Further, the buffer lower boundary processing module 3 receives the negative copy distance in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of the tile rows to 0,
and then sends the copy-pixel tile rows and their start and end positions to the tile row pixel copy module 6 through the TLM interface.
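The lower boundary handling can be sketched in Python as follows. This is an illustrative reading, not the patent's implementation: the function name `lower_boundary_tile_rows` and the `width` argument are hypothetical, and the start position of the first tile row is assumed to be the offset of the lowest copied pixel row within its tile row.

```python
TILE = 4  # pixel rows per tile row


def lower_boundary_tile_rows(y_neg: int, width: int):
    """Return a list of (pixels, start, end) for tile rows below the buffer.

    y_neg is the negative copy distance in y (rows below the origin).
    Every pixel is 0 because nothing is read from the buffer.
    """
    n_rows = -(-y_neg // TILE)        # ceiling division
    first_start = (-y_neg) % TILE     # offset of row -y_neg inside its tile row
    result = []
    for i in range(n_rows):
        pixels = [[0] * width for _ in range(TILE)]
        start = first_start if i == 0 else 0
        result.append((pixels, start, TILE))  # end position is always 4
    return result
```

With `y_neg = 5`, the copied rows are y = -5 to y = -1: the first tile row (y = -8 to -5) contributes only its last row, and the second tile row contributes all four.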
Further, the height direction buffer processing module 4 receives the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinates, the number of copy tiles in the x direction, and the negative copy distance in the x direction sent by the buffer allocation module 2,
reads the buffer pixels by calculating the buffer tile coordinates, zero-fills the pixels that lie outside the buffer in the x direction, and finally calculates the start and end positions of each tile row,
and sends the copy-pixel tile rows and their start and end positions to the tile row pixel copy module 6 through the TLM interface.
Further, the buffer upper boundary processing module 5 receives the distance exceeding the upper bound in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of the tile rows to 0,
and then sends the copy-pixel tile rows and their start and end positions to the tile row pixel copy module 6 through the TLM interface.
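A comparable Python sketch for the upper boundary case is given below. It assumes, beyond what the text states, that the upper bound of the buffer is tile-aligned, so that the first out-of-bounds tile row starts at position 0; the name `upper_boundary_tile_rows` and the `width` argument are hypothetical.

```python
TILE = 4  # pixel rows per tile row


def upper_boundary_tile_rows(y_over: int, width: int):
    """Return a list of (pixels, start, end) for tile rows above the upper bound.

    y_over is the distance by which the copy exceeds the upper bound in y.
    """
    n_rows = -(-y_over // TILE)  # ceiling division
    result = []
    for i in range(n_rows):
        pixels = [[0] * width for _ in range(TILE)]  # zero-filled, no buffer read
        is_last = i == n_rows - 1
        # Only the last tile row may be partially used.
        end = (y_over - 1) % TILE + 1 if is_last else TILE
        result.append((pixels, 0, end))
    return result
```

For `y_over = 7`, two tile rows are produced, the first fully used and the second used up to position 3.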
Further, the tile row pixel copy module 6 receives the copy-pixel tile rows and their start and end positions sent by the buffer lower boundary processing module 3 and the height direction buffer processing module 4, and the copy start coordinate in the x direction sent by the buffer allocation module 2,
calculates the starting pixel of the tile row from the copy start coordinate in the x direction,
and then copies the pixels of the tile row.
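The final copy step might look like the Python sketch below. The patent does not say how the starting pixel is derived, so the sketch assumes it is the offset of the x start coordinate within its tile (`start_x % 4`); the function name `copy_tile_row` and the `width` argument (number of pixels to copy per row) are hypothetical.

```python
TILE = 4  # tile edge length in pixels


def copy_tile_row(tile_row, start: int, end: int, start_x: int, width: int):
    """Copy the valid pixels out of one tile row.

    tile_row is a list of 4 pixel rows whose first column is tile-aligned;
    start and end select the valid pixel rows (end is exclusive).
    """
    offset = start_x % TILE  # assumed: starting pixel inside the first tile
    return [row[offset:offset + width] for row in tile_row[start:end]]
```

As a usage example, a tile row whose columns correspond to x = 4 to 11, copied from `start_x = 6` with `width = 2`, yields the pixels at columns 2 and 3 of each selected row.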
Further, the read pixel submodule 41 receives the copy start tile coordinates sent by the buffer allocation module 2, calculates the coordinates of each tile to be read according to the number of copy tiles in the x direction,
reads the buffer pixels according to the tile coordinates,
and sends the read buffer pixels to the x-direction copy pixel submodule 42.
Further, the x-direction copy pixel submodule 42 receives the negative copy distance in the x direction sent by the buffer allocation module 2 and the buffer pixels sent by the read pixel submodule 41,
reserves a number of pixels equal to the negative copy distance in the x direction in front of each row of read pixels and fills them with 0 to represent the pixels outside the buffer,
and then sends the processed copy pixel rows to the tile row pixel copy module 6.
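The zero padding performed by submodule 42 is simple enough to show directly. The Python sketch below uses the hypothetical names `pad_left_with_zeros` and `pad_tile_row`; pixels are represented as plain integers for illustration.

```python
def pad_left_with_zeros(pixel_row, x_neg: int):
    """Prepend x_neg zero pixels to one row of pixels read from the buffer."""
    return [0] * x_neg + list(pixel_row)


def pad_tile_row(tile_row, x_neg: int):
    """Apply the same left padding to every pixel row of a tile row."""
    return [pad_left_with_zeros(row, x_neg) for row in tile_row]
```

For example, `pad_left_with_zeros([7, 8, 9], 2)` gives `[0, 0, 7, 8, 9]`, where the two leading zeros stand for the columns left of the buffer.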
Further, the tile row position calculation submodule 43 receives the y start coordinate and the positive copy distance in the y direction sent by the buffer allocation module 2, calculates the start and end positions of each tile row,
and sends the start and end positions of the tile rows to the tile row pixel copy module 6.
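The start and end positions computed by submodule 43 can be sketched in Python under the assumption that positions are row offsets within a tile row, with an exclusive end. The function name `positive_tile_row_bounds` is hypothetical.

```python
TILE = 4  # pixel rows per tile row


def positive_tile_row_bounds(start_y: int, y_pos: int):
    """Return (start, end) for each tile row covering rows [start_y, start_y + y_pos)."""
    if y_pos <= 0:
        return []
    first = start_y // TILE
    last = (start_y + y_pos - 1) // TILE
    bounds = []
    for t in range(first, last + 1):
        # Only the first and last tile rows can be partially covered.
        start = start_y % TILE if t == first else 0
        end = (start_y + y_pos - 1) % TILE + 1 if t == last else TILE
        bounds.append((start, end))
    return bounds
```

A copy that starts at y = 2 and spans 9 rows touches three tile rows, with bounds (2, 4), (0, 4) and (0, 3).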
The beneficial effects of the invention are as follows:
the invention implements the function and structure of the copy buffer algorithm as a TLM model, enables functional verification of the GPU hardware copy buffer algorithm, handles cases in which the copy coordinates lie outside the buffer or the copy width is larger than the buffer, improves GPU hardware performance, reduces copy errors, and effectively accelerates RTL design and development.
Drawings
FIG. 1 is a block diagram of a hardware TLM micro-architecture for a copy buffer algorithm in accordance with the present invention;
Detailed Description
The technical solution of the present invention is described clearly and completely below with reference to the accompanying drawing and specific embodiments. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments that a person skilled in the art can derive from them without inventive effort fall within the scope of protection of the present invention.
The invention provides a TLM microstructure for a GPU hardware copy buffer algorithm, comprising a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer lower boundary processing module 3 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer upper boundary processing module 5 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1 calculates the distance exceeding the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinates, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
the buffer allocation module 2 assigns tile rows in the negative y direction to the buffer lower boundary processing module 3, tile rows in the positive y direction to the height direction buffer processing module 4, and tile rows beyond the upper bound to the buffer upper boundary processing module 5;
the buffer lower boundary processing module 3 processes the copy-pixel tile rows in the negative y direction;
the height direction buffer processing module 4 processes the copy-pixel tile rows in the positive y direction;
the buffer upper boundary processing module 5 processes the copy-pixel tile rows beyond the upper bound of the video memory;
the tile row pixel copy module 6 copies the pixels of a tile row;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x-direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile is a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4, a tile row is a group of 4 pixel rows whose first pixel row has a y coordinate that is an integer multiple of 4, and the lower-left corner (x, y) of the buffer is taken as the origin.
The copy parameter calculation module 1 receives the copy coordinates and the copy width and height;
calculates the distance exceeding the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinates, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction;
and then sends these parameters to the buffer allocation module 2 through the TLM interface.
The buffer allocation module 2 receives the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinates, the number of copy tiles in the x direction, and the numbers of positive and negative copy tiles in the y direction sent by the copy parameter calculation module 1;
sends the negative copy distance in the y direction to the buffer lower boundary processing module 3 through the TLM interface;
sends the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinates, the number of copy tiles in the x direction, and the negative copy distance in the x direction to the height direction buffer processing module 4 through the TLM interface;
sends the distance exceeding the upper bound in the y direction to the buffer upper boundary processing module 5 through the TLM interface;
and sends the copy start coordinate in the x direction to the tile row pixel copy module 6 through the TLM interface.
The buffer lower boundary processing module 3 receives the negative copy distance in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of the tile rows to 0,
and then sends the copy-pixel tile rows and their start and end positions to the tile row pixel copy module 6 through the TLM interface.
The height direction buffer processing module 4 receives the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinates, the number of copy tiles in the x direction, and the negative copy distance in the x direction sent by the buffer allocation module 2,
reads the buffer pixels by calculating the buffer tile coordinates, zero-fills the pixels that lie outside the buffer in the x direction, and finally calculates the start and end positions of each tile row,
and sends the copy-pixel tile rows and their start and end positions to the tile row pixel copy module 6 through the TLM interface.
The buffer upper boundary processing module 5 receives the distance exceeding the upper bound in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of each tile row in the y direction and sets all pixels of the tile rows to 0,
and then sends the copy-pixel tile rows and their start and end positions to the tile row pixel copy module 6 through the TLM interface.
The tile row pixel copy module 6 receives the copy-pixel tile rows and their start and end positions sent by the buffer lower boundary processing module 3 and the height direction buffer processing module 4, and the copy start coordinate in the x direction sent by the buffer allocation module 2,
calculates the starting pixel of the tile row from the copy start coordinate in the x direction,
and then copies the pixels of the tile row.
The read pixel submodule 41 receives the copy start tile coordinates and the number of copy tiles in the x direction sent by the buffer allocation module 2, calculates the coordinates of each tile to be read,
reads the buffer pixels according to the tile coordinates,
and sends the read buffer pixels to the x-direction copy pixel submodule 42.
The x-direction copy pixel submodule 42 receives the negative copy distance in the x direction sent by the buffer allocation module 2 and the buffer pixels sent by the read pixel submodule 41,
reserves a number of pixels equal to the negative copy distance in the x direction in front of each row of read pixels and fills them with 0 to represent the pixels outside the buffer,
and then sends the processed copy pixel rows to the tile row pixel copy module 6.
The tile row position calculation submodule 43 receives the y start coordinate and the positive copy distance in the y direction sent by the buffer allocation module 2, calculates the start and end positions of each tile row,
and sends the start and end positions of the tile rows to the tile row pixel copy module 6.
The GPU hardware copy buffer algorithm based on the above structure comprises the following steps:
1) Calculating the copy range parameters:
The positive and negative copy distances in the x and y directions, the start coordinates, and the copy start tile coordinates are calculated from the input copy coordinates and the copy width and height. The numbers of positive-direction copy tiles in x and y are calculated from the copy start tile coordinates. The number of negative-direction tiles in y is calculated from the negative-direction length in y and the total number of copy tiles in the y direction.
2) Height direction buffer allocation:
The buffer is divided into a positive part and a negative part in the height direction. According to the numbers of positive and negative copy tiles in the y direction, tile rows in the negative part are assigned to step 3) and tile rows in the positive part are assigned to step 4).
3) Processing of copy-pixel tile rows in the negative y direction:
The number of negative copy tiles in the y direction is the number of tile rows to be copied that lie outside the buffer. These tiles do not go through the copy process and are set directly to 0. For the first tile row, the start position is calculated from the negative-direction copy height; for all other tile rows the start position is 0. The end positions are all 4.
4) Processing of copy-pixel tile rows in the positive y direction:
4.1) Reading buffer pixels:
The coordinates of each tile to be read are calculated from the copy start tile coordinates and the number of copy tiles in the x direction, and the buffer pixels are then read according to the tile coordinates.
4.2) x-direction copy pixel processing:
A number of pixels equal to the negative copy distance in the x direction is reserved in front of each row of read pixels, and these pixels are all filled with 0 to represent the pixels outside the buffer.
4.3) Tile row start and end position calculation:
For the first and last tile rows, the start and end positions are calculated from the y start coordinate and the positive copy distance in the y direction. For all other tile rows, the start position is 0 and the end position is 4.
5) Processing of copy-pixel tile rows beyond the upper bound in y:
These tiles also do not go through the copy process and are set directly to 0, and the start position of every tile row is 0. For the last tile row, the end position is calculated from the distance exceeding the upper bound; the end position of every other tile row is 4.
6) Tile row pixel copy:
First, the starting pixel of the tile row is calculated from the copy coordinate in the x direction, and then the pixels of the tile row are copied.
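Taken together, the steps above describe a copy in which every pixel outside the buffer reads as 0. A naive per-pixel reference model of that intended result, against which a tiled model could be compared, is sketched below in Python. It is not the tiled method of the patent; the function name `reference_copy` and the row-major `buffer[y][x]` layout with row 0 at the bottom are assumptions, as is zero-filling to the right of the buffer, which the text does not discuss separately.

```python
def reference_copy(buffer, x: int, y: int, w: int, h: int):
    """Copy a w x h region starting at (x, y), reading 0 outside the buffer.

    buffer[j][i] holds the pixel at column i, row j, with the origin at the
    lower-left corner of the buffer.
    """
    buf_h = len(buffer)
    buf_w = len(buffer[0]) if buf_h else 0
    out = []
    for j in range(y, y + h):
        row = []
        for i in range(x, x + w):
            inside = 0 <= i < buf_w and 0 <= j < buf_h
            row.append(buffer[j][i] if inside else 0)
        out.append(row)
    return out
```

For a 2x2 buffer `[[1, 2], [3, 4]]`, copying a 3x4 region from (-1, -1) returns one zero row, two rows with a leading zero column, and a final zero row above the upper bound.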
Example:
The invention is described in further detail below with reference to FIG. 1.
The invention provides a TLM microstructure for a GPU hardware copy buffer algorithm, comprising a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile row pixel copy module 6;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer lower boundary processing module 3 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copy module 6 are connected in sequence;
the copy parameter calculation module 1, the buffer allocation module 2, the buffer upper boundary processing module 5 and the tile row pixel copy module 6 are connected in sequence;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x-direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile is a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4, a tile row is a group of 4 pixel rows whose first pixel row has a y coordinate that is an integer multiple of 4, and the lower-left corner (x, y) of the buffer is taken as the origin.
The GPU hardware copy buffer algorithm based on the above structure comprises the following steps:
Step 1, calculating the copy range parameters: the positive and negative copy distances in the x and y directions, the start coordinates, and the copy start tile coordinates are calculated from the input copy coordinates and the copy width and height; the numbers of positive-direction copy tiles in x and y are calculated from the copy start tile coordinates; and the number of negative-direction tiles in y is calculated from the negative-direction length in y and the total number of copy tiles in the y direction.
Step 2, height direction buffer allocation: the buffer is divided into a positive part and a negative part in the height direction; according to the numbers of positive and negative copy tiles in the y direction, tile rows in the negative part are assigned to step 3 and tile rows in the positive part are assigned to step 4.
Step 3, processing of copy-pixel tile rows in the negative y direction: the number of negative copy tiles in the y direction is the number of tile rows to be copied that lie outside the buffer. These tiles do not go through the copy process and are set directly to 0. For the first tile row, the start position is calculated from the negative-direction copy height; for all other tile rows the start position is 0. The end positions are all 4.
Step 4, processing of copy-pixel tile rows in the positive y direction: first, the buffer pixels are read; the coordinates of each tile to be read are calculated from the copy start tile coordinates and the number of copy tiles in the positive x direction, and the buffer pixels are read according to the tile coordinates. Next, x-direction copy pixel processing is performed: a number of pixels equal to the negative copy distance in the x direction is reserved in front of each row of read pixels, and these pixels are all filled with 0 to represent the pixels outside the buffer. Finally, the tile row start and end positions are calculated: for the first and last tile rows, they are calculated from the y start coordinate and the positive copy distance in the y direction; for all other tile rows, the start position is 0 and the end position is 4.
Step 5, processing of copy-pixel tile rows beyond the upper bound in y: these tiles do not go through the copy process and are set directly to 0, and the start position of every tile row is 0. For the last tile row, the end position is calculated from the distance exceeding the upper bound; the end position of every other tile row is 4.
Step 6, tile row pixel copy: first, the starting pixel of the tile row is calculated from the copy coordinate in the x direction, and then the pixels of the tile row are copied.

Claims (10)

1. A GPU hardware copy buffer algorithm-oriented TLM microstructure is characterized in that: the method comprises a copy parameter calculation module 1, a buffer allocation module 2, a buffer lower boundary processing module 3, a height direction buffer processing module 4, a buffer upper boundary processing module 5 and a tile line pixel copy module 6;
the copy parameter calculating module 1, the buffer region dispatching module 2, the lower boundary processing module 3 of the buffer region and the tile row pixel copying module 6 are connected in sequence;
the copy parameter calculating module 1, the buffer allocation module 2, the height direction buffer processing module 4 and the tile row pixel copying module 6 are connected in sequence;
the copy parameter calculating module 1, the buffer area dispatching module 2, the buffer area upper boundary processing module 5 and the tile line pixel copying module 6 are connected in sequence;
the copy parameter calculating module 1 is used for calculating the distance exceeding the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy starting coordinates in the x and y directions, the copy starting tile coordinate, the number of copy tiles in the x direction and the number of positive and negative copy tiles in the y direction;
the buffer area allocating module 2 is used for allocating tiles in the y negative direction to the buffer area lower boundary processing module 3, allocating tiles in the y positive direction to the height direction buffer area processing module 4 and allocating out-of-limit tiles rows to the buffer area upper boundary processing module 5;
the lower boundary processing module 3 of the buffer area is used for processing tile row copy pixels in the y negative direction;
the height direction buffer processing module 4 is used for processing tile row copy pixels in the y positive direction;
the buffer upper boundary processing module 5 is used for processing tile line copy pixels exceeding the upper boundary of the video memory;
the tile row pixel copying module 6 is used for copying tile row pixels;
the height direction buffer processing module 4 comprises a read pixel submodule 41, an x direction copy pixel submodule 42 and a tile row position calculation submodule 43;
wherein a tile represents a 4x4 pixel block whose lower-left pixel has x and y coordinates that are both integer multiples of 4,
a tile row represents 4 pixel rows, the y coordinate of whose starting pixel row is an integer multiple of 4,
and the lower-left corner coordinates (x, y) of the buffer are taken as the origin.
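The tile and tile row definitions above can be illustrated with a short sketch. The helper names `tile_origin` and `tile_row_start` are hypothetical and do not appear in the patent; Python is used only for illustration.

```python
TILE = 4  # tile edge length in pixels, per the definition in claim 1


def tile_origin(x: int, y: int) -> tuple:
    """Return the lower-left pixel of the 4x4 tile that contains pixel (x, y)."""
    # Floor division also snaps negative coordinates down onto the tile grid.
    return (x // TILE) * TILE, (y // TILE) * TILE


def tile_row_start(y: int) -> int:
    """Return the y coordinate of the first pixel row of the tile row containing y."""
    return (y // TILE) * TILE
```

For example, pixel (5, 10) falls in the tile whose lower-left pixel is (4, 8), and pixel row 7 belongs to the tile row starting at y = 4.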
2. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the copy parameter calculation module 1 receives the copy coordinates and the copy width and height;
calculates the distance exceeding the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction and the numbers of positive and negative copy tiles in the y direction;
and then sends all of these calculated parameters to the buffer allocation module 2 through the TLM interface.
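The claim names the parameters but not their formulas, so the following is only one plausible reading, assuming the copy region is the rectangle starting at `(x, y)` with size `w` by `h`, and that the buffer height `buf_h` is known. All names (`CopyParams`, `calc_copy_params`, the field names) are hypothetical.

```python
from dataclasses import dataclass

TILE = 4  # tile edge length in pixels


def clamp(v: int, lo: int, hi: int) -> int:
    return max(lo, min(v, hi))


def tiles_covering(start: int, length: int) -> int:
    """Number of tiles (or tile rows) touched by the span [start, start + length)."""
    if length <= 0:
        return 0
    return (start + length - 1) // TILE - start // TILE + 1


@dataclass
class CopyParams:
    y_over: int        # rows above the buffer's upper bound
    x_neg: int         # columns left of the origin
    x_pos: int         # remaining columns
    y_neg: int         # rows below the origin
    y_pos: int         # rows inside the buffer height
    x_start: int
    y_start: int
    start_tile: tuple  # in tile units
    tiles_x: int
    tiles_y_pos: int
    tiles_y_neg: int


def calc_copy_params(x: int, y: int, w: int, h: int, buf_h: int) -> CopyParams:
    y_neg = clamp(-y, 0, h)
    y_over = clamp(y + h - buf_h, 0, h)
    y_pos = h - y_neg - y_over
    x_neg = clamp(-x, 0, w)
    x_pos = w - x_neg
    x_start, y_start = max(x, 0), max(y, 0)
    return CopyParams(
        y_over, x_neg, x_pos, y_neg, y_pos, x_start, y_start,
        (x_start // TILE, y_start // TILE),
        tiles_covering(x_start, x_pos),
        tiles_covering(y_start, y_pos),
        -(-y_neg // TILE),  # ceiling division for the negative-side tile rows
    )
```

Under these assumptions, a 10x8 copy starting at (-3, -2) in a buffer 16 rows high has 2 rows and 3 columns outside the lower-left corner and touches 2 tiles in x and 2 tile rows in positive y.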
3. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the buffer allocation module 2 receives the distance exceeding the upper bound in the y direction, the positive and negative copy distances in the x and y directions, the copy start coordinates in the x and y directions, the copy start tile coordinate, the number of copy tiles in the x direction and the numbers of positive and negative copy tiles in the y direction sent by the copy parameter calculation module 1,
sends the negative copy distance in the y direction to the buffer lower boundary processing module 3 through the TLM interface;
sends the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction to the height direction buffer processing module 4 through the TLM interface;
sends the distance exceeding the upper bound in the y direction to the buffer upper boundary processing module 5 through the TLM interface;
and sends the copy start coordinate in the x direction to the tile row pixel copy module 6 through the TLM interface.
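The routing in this claim can be summarized as a small sketch in which a plain dictionary stands in for each TLM transaction. The function name `dispatch` and all key names are hypothetical and chosen only for illustration.

```python
def dispatch(p: dict) -> dict:
    """Split the parameter set into one message per downstream module (claim 3)."""
    to_height = ("y_start", "y_pos", "start_tile", "tiles_x", "x_neg")
    return {
        "lower_boundary_3": {"y_neg": p["y_neg"]},
        "height_direction_4": {k: p[k] for k in to_height},
        "upper_boundary_5": {"y_over": p["y_over"]},
        "tile_row_copy_6": {"x_start": p["x_start"]},
    }
```

Each downstream module then receives only the fields that the claim assigns to it.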
4. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the buffer lower boundary processing module 3 receives the negative copy distance in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of the tile row in the y direction, sets all the tile row pixels to 0,
and then sends the tile row of copy pixels and the start and end positions of the tile row to the tile row pixel copy module 6 through the TLM interface.
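A minimal sketch of this zero-fill step follows, assuming the start and end positions are row offsets within the out-of-buffer region and that the row width is known. The name `zero_tile_rows` and the `width` parameter are hypothetical. The same pattern would apply to the upper boundary handling of claim 6, driven by the distance exceeding the upper bound instead.

```python
TILE = 4  # a tile row is 4 pixel rows


def zero_tile_rows(y_dist: int, width: int) -> list:
    """Return (start, end, pixels) for each zero-filled tile row.

    start is inclusive and end is exclusive; both are row offsets
    within the region of y_dist rows that lies outside the buffer.
    """
    out = []
    for start in range(0, y_dist, TILE):
        end = min(start + TILE, y_dist)
        pixels = [[0] * width for _ in range(TILE)]  # all pixels set to 0
        out.append((start, end, pixels))
    return out
```

With a distance of 6 rows this yields one full tile row (rows 0 to 3) and one partial tile row (rows 4 and 5).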
5. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the height direction buffer processing module 4 receives the copy start coordinate in the y direction, the positive copy distance in the y direction, the copy start tile coordinate, the number of copy tiles in the x direction and the negative copy distance in the x direction sent by the buffer allocation module 2,
reads the buffer pixels by calculating the tile coordinates of the buffer, then pads with 0 the pixels outside the buffer in the x direction, and finally calculates the start and end positions of the tile row,
and sends the tile row of copy pixels and the start and end positions of the tile row to the tile row pixel copy module 6 through the TLM interface.
6. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the buffer upper boundary processing module 5 receives the distance exceeding the upper bound in the y direction sent by the buffer allocation module 2,
calculates the start and end positions of the tile row in the y direction, sets all the tile row pixels to 0,
and then sends the tile row of copy pixels and the start and end positions of the tile row to the tile row pixel copy module 6 through the TLM interface.
7. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the tile row pixel copy module 6 receives the tile rows of copy pixels and the start and end positions of the tile rows sent by the buffer lower boundary processing module 3 and the height direction buffer processing module 4, and the copy start coordinate in the x direction sent by the buffer allocation module 2,
calculates the start pixel of the tile row from the copy start coordinate in the x direction,
and then performs the copy operation on the tile row pixels.
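The final copy step might look like the sketch below. It assumes the destination is a list of pixel rows indexed `dest[y][x]` with y = 0 at the bottom, and that the start and end positions are inclusive row offsets inside the tile row. The name `copy_tile_row` and its parameters are hypothetical.

```python
def copy_tile_row(dest, pixels, row_base, lo, hi, x_start):
    """Write tile-local rows lo..hi (inclusive) of one tile row into dest.

    row_base is the y coordinate of the tile row's first pixel row and
    x_start is the x coordinate of the first pixel to write.
    """
    for local in range(lo, hi + 1):
        y = row_base + local
        row = pixels[local]
        # Clip to the destination width so the row never grows.
        n = max(0, min(len(row), len(dest[y]) - x_start))
        dest[y][x_start:x_start + n] = row[:n]
```

Rows outside the `lo..hi` range of the tile row are left untouched, which is how a partially covered tile row would be handled under these assumptions.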
8. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the read pixel submodule 41 receives the copy start tile coordinate and the number of copy tiles in the x direction sent by the buffer allocation module 2, and calculates the coordinates of each tile to be read,
reads the buffer pixels according to the tile coordinates,
and sends the read buffer pixels to the x-direction copy pixel submodule 42.
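As an illustration of the tile-by-tile read, the sketch below assumes the buffer is a list of pixel rows indexed `buf[y][x]` and that the start tile coordinate is given in tile units. The name `read_tile_row` is hypothetical.

```python
TILE = 4  # tile edge length in pixels


def read_tile_row(buf, start_tile, tiles_x):
    """Read tiles_x consecutive tiles and return them as 4 pixel rows."""
    tx0, ty = start_tile
    rows = [[] for _ in range(TILE)]
    for tx in range(tx0, tx0 + tiles_x):      # coordinate of each tile in x
        for dy in range(TILE):
            y = ty * TILE + dy
            rows[dy].extend(buf[y][tx * TILE:(tx + 1) * TILE])
    return rows
```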
9. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the x-direction copy pixel submodule 42 receives the negative copy distance in the x direction sent by the buffer allocation module 2 and the buffer pixels sent by the read pixel submodule 41,
reserves a span equal to the negative copy distance in the x direction in front of the read row of pixels and fills it entirely with 0 to represent the pixels outside the buffer,
and then sends the processed rows of copy pixels to the tile row pixel copy module 6.
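The zero padding in front of each read row amounts to the following one-line sketch (the name `pad_row_left` is hypothetical):

```python
def pad_row_left(pixels, x_neg):
    """Prepend x_neg zero pixels, standing in for pixels outside the buffer."""
    return [0] * x_neg + list(pixels)
```

For an x-direction negative copy distance of 2, the row `[5, 6, 7]` becomes `[0, 0, 5, 6, 7]`.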
10. The TLM microstructure for a GPU hardware copy buffer algorithm as claimed in claim 1, wherein:
the tile row position calculation submodule 43 receives the copy start coordinate in the y direction and the positive copy distance in the y direction sent by the buffer allocation module 2, and calculates the start and end positions of each tile row,
and sends the start and end positions of the tile rows to the tile row pixel copy module 6.
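One way to read the start and end positions is as inclusive row offsets (0 to 3) inside each tile row that the copy touches. The sketch below follows that assumption; the name `tile_row_spans` is hypothetical.

```python
TILE = 4  # a tile row is 4 pixel rows


def tile_row_spans(y_start: int, y_pos: int) -> list:
    """Return (tile_row_index, first_row, last_row) for each touched tile row.

    first_row and last_row are inclusive offsets inside the tile row.
    """
    if y_pos <= 0:
        return []
    y_last = y_start + y_pos - 1
    spans = []
    for t in range(y_start // TILE, y_last // TILE + 1):
        base = t * TILE
        lo = max(y_start, base) - base
        hi = min(y_last, base + TILE - 1) - base
        spans.append((t, lo, hi))
    return spans
```

A copy starting at y = 6 with a positive distance of 7 covers rows 6 to 12, so it touches the upper half of tile row 1, all of tile row 2 and only the first row of tile row 3.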
CN201911125649.5A 2019-11-18 2019-11-18 GPU hardware copy buffer algorithm-oriented TLM microstructure Active CN111047498B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911125649.5A CN111047498B (en) 2019-11-18 2019-11-18 GPU hardware copy buffer algorithm-oriented TLM microstructure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911125649.5A CN111047498B (en) 2019-11-18 2019-11-18 GPU hardware copy buffer algorithm-oriented TLM microstructure

Publications (2)

Publication Number Publication Date
CN111047498A (en) 2020-04-21
CN111047498B (en) 2022-12-06

Family

ID=70232086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911125649.5A Active CN111047498B (en) 2019-11-18 2019-11-18 GPU hardware copy buffer algorithm-oriented TLM microstructure

Country Status (1)

Country Link
CN (1) CN111047498B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836872A (en) * 2021-01-29 2021-05-25 西安理工大学 Multi-GPU-based pollutant convection diffusion equation high-performance numerical solution method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629240A (en) * 2012-02-13 2012-08-08 上海创远仪器技术股份有限公司 Method and device for serial communication
CN105045726A (en) * 2015-08-10 2015-11-11 Tcl集团股份有限公司 Picture operation method based on parallel computation and picture operation system based on parallel computation
CN109657328A (en) * 2018-12-12 2019-04-19 中国航空工业集团公司西安航空计算技术研究所 TLM microstructure for a GPU hardware line rasterization Boundary algorithm
CN109697743A (en) * 2018-12-12 2019-04-30 中国航空工业集团公司西安航空计算技术研究所 TLM microstructure for a GPU hardware LineStipple algorithm

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836872A (en) * 2021-01-29 2021-05-25 西安理工大学 Multi-GPU-based pollutant convection diffusion equation high-performance numerical solution method
CN112836872B (en) * 2021-01-29 2023-08-18 西安理工大学 Multi-GPU-based high-performance numerical solution method for pollutant convection diffusion equation

Also Published As

Publication number Publication date
CN111047498B (en) 2022-12-06

Similar Documents

Publication Publication Date Title
CN109615685B (en) UML-oriented GPU texture mapping-based texture execution device and method for hardware view model
WO2023087893A1 (en) Object processing method and apparatus, computer device, storage medium and program product
CN111047498B (en) GPU hardware copy buffer algorithm-oriented TLM microstructure
CN104715092B (en) A kind of quick method for setting up Label and figure annexation in level layout verification
CN113936081A (en) Blocking primitives in a graphics processing system
CN110941934A (en) FPGA prototype verification development board segmentation simulation system, method, medium and terminal
CN109978964A (en) A kind of image formation method, device, storage medium and terminal device
CN106251291A (en) Utilize OpenGL with OpenCL to cooperate and realize the method and system of image scaling
CN111047504A (en) TLM microstructure for GPU sub-image processing based on SystemC
CN111091487A (en) TLM microstructure for GPU hardware line element rasterization scanning algorithm
CN105068984A (en) Automatic puzzling and typesetting method
CN103065306B (en) The disposal route of graph data and device
CN109871172B (en) Mouse clicking method and device in automatic test and readable storage medium
CN109741433B (en) Triangle multidirectional parallel scanning method and structure based on Tile
CN103310409A (en) Quick triangle partitioning method of Tile-based rendering architecture central processing unit (CPU)
CN114663564A (en) Browser WebGL large scene rendering method, device, equipment and medium
CN103164546A (en) Generation method of schematic circuit diagram connecting line
CN107256281B (en) FPGA (field programmable Gate array) reconfigurable resource non-rectangular layout method based on cutting method
CN111080508B (en) GPU sub-image processing method based on DMA
CN111028130B (en) TLM microstructure facing GPU hardware texel value taking method
CN111402370A (en) Method and device for detecting floating object
CN111008515A (en) TLM microstructure for GPU hardware sub-texture replacement storage algorithm
CN111028131B (en) TLM microstructure for generating Mipmap multiple detail layer texture algorithm by GPU hardware
CN111598992B (en) Partition removing and rendering method and system based on Unity3D body and surface model
CN110941939A (en) TLM microstructure for GPU hardware pixel replication algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant