CN112233008A - Device and method for realizing triangle rasterization in GPU - Google Patents


Publication number
CN112233008A
CN112233008A CN202011010966.5A CN202011010966A CN112233008A CN 112233008 A CN112233008 A CN 112233008A CN 202011010966 A CN202011010966 A CN 202011010966A CN 112233008 A CN112233008 A CN 112233008A
Authority
CN
China
Prior art keywords
triangle
coordinate
equal
span
boundary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011010966.5A
Other languages
Chinese (zh)
Inventor
阮成肖
李姝仪
张航
苑豪杰
李二磊
刘彤
纪录
张琦
冯蕾
李红星
周吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
716th Research Institute of CSIC
Original Assignee
716th Research Institute of CSIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 716th Research Institute of CSIC
Priority to CN202011010966.5A
Publication of CN112233008A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/20: Processor architectures; Processor configuration, e.g. pipelining
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/60: Memory management

Abstract

The invention discloses a device and a method for implementing triangle rasterization in a GPU (graphics processing unit). The device comprises a vertex data cache buffer, an edge function normalization module, an endpoint generation module, a span generation module and a span cache buffer. The method comprises the following steps: the triangle vertex coordinates and the edge equation parameters produced by the vertex shader are sent to the vertex data cache buffer; the three edges of the triangle undergo boundary normalization; the endpoints of each edge are generated in parallel by several groups of endpoint generators; endpoints with the same Y coordinate are sent to a span generator; the span of the triangle is obtained by classifying and comparing the endpoints; and the span is sent to the span cache buffer. The method generates the internal scan lines of a triangle in parallel, saves hardware logic resources and improves the efficiency of triangle rasterization.

Description

Device and method for realizing triangle rasterization in GPU
Technical Field
The invention relates to the field of GPU design, in particular to a device and a method for realizing triangle rasterization in a GPU.
Background
Driven by diversified application requirements, semiconductor manufacturing processes have developed rapidly, and the functions and performance of computer systems have been greatly enriched and improved. As users pursue three-dimensional visual effects, image processing that relies only on raising the processing speed of the CPU falls far short of demand, and the graphics processing unit (GPU) emerged as a result. As the core of a computer display system, the GPU has strong data computation capability; it implements 2D/3D graphics, image processing, display control and similar functions through hardware acceleration, frees the general-purpose CPU from complex graphics algorithms and drawing, and has become a standard component of almost all types of computer systems.
Since the GPU first appeared, its hardware architecture has been overhauled several times, but primitive rasterization remains an essential stage of the graphics pipeline, and its quality and efficiency directly affect the performance of the whole GPU pipeline. Of all GPU primitives, the triangle is the most basic and most important, because any more complex two-dimensional or three-dimensional object is built from triangles. The most important indicator of triangle rasterization is efficiency, i.e., how many triangle primitives a GPU can process per unit time.
To meet increasing demands on processing performance, a GPU usually integrates dozens to hundreds of parallel rasterization modules. Improving rasterization efficiency simply by adding modules increases the scale and complexity of the chip and raises the design cost, so the efficiency of each individual rasterization module should also be improved in addition to expanding the number of modules.
Disclosure of Invention
The invention aims to provide a device and a method for implementing triangle rasterization in a GPU (graphics processing unit) that generate triangle span endpoints by parallel stepping, so that triangle rasterization efficiency is greatly improved without relying only on an increase in the number of modules, and the complexity of the hardware design is reduced.
The technical solution for realizing the purpose of the invention is as follows: an apparatus for implementing triangle rasterization in a GPU, comprising:
a vertex data cache buffer, used for reading the vertex coordinate attributes of the triangle and the parameter information of its three edges;
an edge function normalization module, used for converting the edge equations of the three edges of the triangle into edge equations of the same form through normalization and placing them in the coordinate system of the triangle bounding box;
an endpoint generation module, used for traversing the normalized edge equation along the Y coordinate to generate the corresponding normalized endpoint coordinate X, and for restoring the calculated normalized coordinate to the coordinate of the edge equation before normalization;
a span generation module, used for obtaining the two endpoints of the triangle at the same Y coordinate by comparing the endpoint coordinates of the left and right boundaries of the triangle;
a span cache buffer, used for storing the endpoint coordinates of the triangle spans and packing them to be sent to the back-end block generator for processing.
A method for implementing triangle rasterization in a GPU comprises the following steps:
the vertex data cache buffer reads the vertex coordinate attributes of the triangle and the parameter information of its three edges;
the edge function normalization module converts the edge equations of the three edges of the triangle into edge equations of the same form through normalization and places them in the coordinate system of the triangle bounding box;
the endpoint generation module traverses the normalized edge equation along the Y coordinate to generate the corresponding normalized endpoint coordinate X, and restores the calculated normalized coordinate to the coordinate of the 6 classes of edge equations before normalization;
the span generation module obtains the two endpoints of the triangle at the same Y coordinate by comparing the endpoint coordinates of the left and right boundaries of the triangle;
the span cache buffer stores the endpoint coordinates of the triangle spans, packs them and sends them to the back-end block generator for processing.
Compared with the prior art, the invention has the following advantages: 1) spans are generated in parallel, which improves the scalability of the hardware; 2) division is replaced by addition iteration, which simplifies the hardware and saves logic resources; 3) triangle rasterization efficiency can be greatly improved without increasing the number of rasterization modules.
Drawings
Fig. 1 is a structural diagram of a device for implementing triangle rasterization in a GPU implemented by the present invention.
FIG. 2 is a schematic diagram of a bounding box coordinate system for triangle primitives according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a triangle boundary and effective half-plane type division method according to the present invention.
FIG. 4 is a diagram illustrating an embodiment of a boundary traversal search method for a first valid endpoint according to the present invention.
FIG. 5 is a schematic diagram of an implementation method for searching effective endpoints by step traversal according to the present invention.
FIG. 6 is a schematic diagram of a method for selecting and implementing left and right span endpoints according to the present invention.
Detailed Description
As shown in fig. 1, an apparatus for implementing triangle rasterization in a GPU of the present invention is composed of the following components:
(1) Vertex data cache buffer: reads the vertex coordinate attributes of the triangle and the parameter information of its three edges.
(2) Edge function normalization module: converts the edge equations of the three edges of the triangle into edge equations of the same form through normalization and places them in the triangle bounding box (Bounding Box) coordinate system.
(3) Endpoint generation module: traverses the normalized edge equation along the Y coordinate to generate the corresponding normalized endpoint coordinate X, and restores the calculated normalized coordinate to the coordinate of the 6 classes of edge equations before normalization.
(4) Span generation module: compares the endpoint coordinates of the left and right boundaries of the triangle to obtain the two endpoints of the triangle at the same Y coordinate.
(5) Span cache buffer: stores the endpoint coordinates of the triangle spans and packs them to be sent to the back-end block generator for processing.
The invention also discloses a method for realizing triangle rasterization in the GPU, which comprises the following steps:
(1) the vertex data cache Buffer reads the vertex coordinate attribute of the triangle and the parameter information of the three edges;
The vertex coordinate attributes of the triangle comprise its three vertex coordinates $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$; the parameter information of the three edges consists of the parameters $(A_1, B_1, C_1)$, $(A_2, B_2, C_2)$, $(A_3, B_3, C_3)$ of the edge equation $f(x, y) = Ax + By + C = 0$.
(2) The side function normalization module transforms the side equations of the three sides of the triangle into the side equation of the same form through normalization and is arranged in the triangle bounding box coordinate system:
Step 1: Determine the coordinates of the triangle bounding box, i.e. the smallest rectangle that encloses the triangle. The upper-left corner of the bounding box is $(x_{min}, y_{min})$ and the lower-right corner is $(x_{max}, y_{max})$. The corresponding bounding box height $H$ is given by an equation image (Figure BDA0002697540650000032, rounded up) that is not reproduced in this text, and the bounding box width is $W = 2^m$, where $m$ is given by a second equation image (Figure BDA0002697540650000031), also not reproduced.
As shown in FIG. 2, the upper-left corner of the bounding box is defined as (0, 0), so the bounding box coordinates satisfy $0 \le x \le W$ and $0 \le y \le H$.
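The bounding box set-up of step 1 can be sketched in Python. The equation images for $H$ and $m$ are not reproduced in the text, so this sketch assumes that $H$ is the vertical extent rounded up and that $m$ is the smallest integer for which $2^m$ covers the horizontal extent rounded up. The function name `bounding_box` and its return layout are hypothetical.

```python
import math


def bounding_box(vertices):
    """Return (x_min, y_min, W, H, m) for a list of (x, y) vertices.

    Assumptions (not confirmed by the source text):
      H = ceil(y_max - y_min)
      m = smallest integer with 2**m >= ceil(x_max - x_min), W = 2**m
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    h = math.ceil(y_max - y_min)
    # Treat a degenerate (zero-width) triangle as one pixel wide.
    extent = max(1, math.ceil(x_max - x_min))
    # (extent - 1).bit_length() is the exponent of the next power of two.
    m = (extent - 1).bit_length()
    return x_min, y_min, 1 << m, h, m


print(bounding_box([(2, 1), (11, 3), (5, 8)]))
```

A power-of-two width is convenient because the quotient bits of the later division fit a fixed number of iterations.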
Step 2: Determine the types of the three edges. Because the interior of the triangle is the valid data, the three edges are evaluated in the clockwise direction, and the half plane to the right of each edge is valid, i.e. the points where $f(x, y) \ge 0$. To avoid computing a boundary shared by different triangles twice, the left boundary of a triangle is by default a solid line and the right boundary a dotted line, i.e. the relative coordinate range is $0 \le x < W$, $0 \le y \le H$. According to the edge equation $f(x, y) = Ax + By + C$, the half plane that represents the valid data inside the triangle is therefore either a half-closed plane $f(x, y) = Ax + By + C \ge 0$ for a left boundary or a half-open plane $f(x, y) = Ax + By + C > 0$ for a right boundary. Accordingly, the edges can be divided into the 6 classes shown in FIG. 3:
class 1: when $A < 0$ and $B \ge 0$, the half-open plane $f(x, y) = Ax + By + C > 0$;
class 2: when $A < 0$ and $B < 0$, the half-open plane $f(x, y) = Ax + By + C > 0$;
class 3: when $A = 0$ and $B < 0$, the half-open plane $f(x, y) = Ax + By + C > 0$;
class 4: when $A > 0$ and $B \le 0$, the half-closed plane $f(x, y) = Ax + By + C \ge 0$;
class 5: when $A > 0$ and $B > 0$, the half-closed plane $f(x, y) = Ax + By + C \ge 0$;
class 6: when $A = 0$ and $B \ge 0$, the half-closed plane $f(x, y) = Ax + By + C \ge 0$.
In FIG. 3, the shaded side of each of classes 1 to 6 is the half plane where $f(x, y) \ge 0$; a dotted line denotes a half-open plane and a solid line a half-closed plane. Because the left boundary is solid and the right boundary dotted by default, i.e. $0 \le x < W$ and $0 \le y \le H$, repeated boundary calculations are reduced and hardware resources are saved.
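The six-way classification can be sketched as a small Python function. It follows the sign conditions on $A$ and $B$ listed above; the names `classify_edge` and `inside` are hypothetical, and the sketch ignores the degenerate case $A = B = 0$, which the text does not discuss.

```python
def classify_edge(a, b):
    """Return (class number 1..6, closed) for edge coefficients A and B.

    closed is True for a half-closed plane (f >= 0, solid boundary) and
    False for a half-open plane (f > 0, dotted boundary).
    """
    if a < 0 and b >= 0:
        return 1, False
    if a < 0 and b < 0:
        return 2, False
    if a == 0 and b < 0:
        return 3, False
    if a > 0 and b <= 0:
        return 4, True
    if a > 0 and b > 0:
        return 5, True
    # Remaining case: a == 0 and b >= 0.
    return 6, True


def inside(a, b, c, x, y):
    """Test a point against the valid half plane of one edge."""
    _, closed = classify_edge(a, b)
    f = a * x + b * y + c
    return f >= 0 if closed else f > 0


print(classify_edge(-2, 3), inside(1, 0, 0, 0, 5), inside(-1, 0, 4, 4, 0))
```

A point exactly on a class 4 edge counts as inside, while a point exactly on a class 1 edge does not, which is how a pixel on an edge shared by two triangles ends up in only one of them.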
And step 3: according to step 2In the method, after the boundary types of the three sides of the triangle are judged, in order to simplify the boundary traversal mode and save hardware resources, the boundaries of different types are uniformly converted into f (x, y) ═ Ax + By + C ≥ 0 (A)<0, B is more than or equal to 0). Setting the original edge equation of any boundary as f (x, y) ═ Ax + By + C, and replacing x, y, C, that is, x ═ x-xmin,y=y-ymin,C=C+Axmin+ByminThe edge equation f (x, y) of the bounding box coordinate system is obtained as Ax + By + C. Then converting the effective edge equation into f (x, y) ═ Ax + By + C ≧ 0 (A)<0, B is more than or equal to 0), the conversion relation is as follows:
A=-|A|,B=|B|,
Figure BDA0002697540650000041
where W is the bounding box width.
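The translation into bounding box coordinates can be checked with a short Python sketch. It covers only the substitution $C \leftarrow C + A x_{min} + B y_{min}$; the per-class conversion of $C$ into the normalized form is in an equation image that is not available here, so it is deliberately left out. The function name `to_bounding_box_frame` is hypothetical.

```python
def to_bounding_box_frame(a, b, c, x_min, y_min):
    """Re-express f(x, y) = A*x + B*y + C relative to the bounding box origin.

    A and B are unchanged; only the constant term absorbs the offset.
    """
    return a, b, c + a * x_min + b * y_min


def f(a, b, c, x, y):
    return a * x + b * y + c


# The edge function keeps its value when the point is shifted with it.
a, b, c = 3, -2, 5
x_min, y_min = 10, 20
a2, b2, c2 = to_bounding_box_frame(a, b, c, x_min, y_min)
print(f(a, b, c, 14, 27), f(a2, b2, c2, 14 - x_min, 27 - y_min))
```

Both printed values are equal, since moving the origin to $(x_{min}, y_{min})$ changes only the constant term of the edge function.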
(3) The endpoint generation module traverses the normalized edge equation along the Y coordinate to generate a corresponding normalized endpoint coordinate X, and reduces the calculated normalized coordinate to the coordinate of the 6 types of edge equations before normalization:
Step 1: Boundary traversal, to find the position where the valid boundary endpoints start. Since the bounding box already confines the x coordinate of the boundary to $[0, W)$, values of x outside this interval are meaningless. From the boundary equation $f(x, y) = Ax + By + C = 0$ ($A < 0$, $B \ge 0$) one obtains:
$$x = -\frac{By + C}{A} = \frac{f(0, y)}{|A|}$$
Therefore, as shown in FIG. 4, when traversing along $x \ge 0$, a valid endpoint $(X_0, Y_0)$, where $X_0$ and $Y_0$ are both integers, can be obtained only if $f(0, y) \ge 0$.
Step 2: Calculate the step parameters. From the valid endpoint position found above, calculate the valid endpoint value $X_0$, the slope $K$ of the edge and the related parameters. To save hardware resources, the GPU performs these calculations in integer arithmetic, so all values are integers:
$$X_0 = \left\lfloor \frac{f(0, Y_0)}{|A|} \right\rfloor \text{ (rounded down)}, \qquad K = \left\lfloor \frac{|B|}{|A|} \right\rfloor$$
The remainder of $X_0$ is $E_0 = f(0, Y_0) \bmod |A|$, and the remainder of $K$ is $R_0 = |B| \bmod |A|$. To avoid the large amount of hardware resources that a divider would occupy, and because x is limited to $[0, W)$, the invention performs the division by a method based on binary addition iteration. The principle is as follows:
Let $b = a \cdot q + r$, where $q$ is the quotient, takes a value in $(0, 2^m)$ and can be expressed as an $(m+1)$-bit binary number $k_m k_{m-1} \ldots k_2 k_1 k_0$,
i.e. $q = k_0 + k_1 \cdot 2 + k_2 \cdot 2^2 + \ldots + k_m \cdot 2^m = k_0 + 2(k_1 + 2(k_2 + 2(\ldots + 2k_m)\ldots))$, so the binary value of $q$ and the remainder $r$ are obtained through $m + 1$ iterations.
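One way to realise a division of this kind is a bit-serial, shift-and-subtract scheme that decides one quotient bit $k_i$ per iteration, from $k_m$ down to $k_0$. The Python sketch below is an assumption about how such an iteration could look, not the circuit described in the patent; the function name `divide_bits` is hypothetical.

```python
def divide_bits(b, a, m):
    """Return (q, r) with b == a*q + r and 0 <= r < a.

    The quotient is built bit by bit in m + 1 iterations, so it must fit
    in m + 1 bits. Only shifts, comparisons and subtractions are used.
    """
    if a <= 0 or b < 0:
        raise ValueError("expects a > 0 and b >= 0")
    q, r = 0, b
    for i in range(m, -1, -1):
        trial = a << i          # a * 2**i
        if trial <= r:          # bit k_i of the quotient is 1
            r -= trial
            q |= 1 << i
    if r >= a:
        raise ValueError("quotient does not fit in m + 1 bits")
    return q, r


print(divide_bits(23, 5, 3))
```

Because the quotient here is an x coordinate bounded by the power-of-two box width, the number of iterations is fixed by $m$ rather than by the operand width.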
Step 3: Step traversal. Using the initial value $X_0$, the slope $K$, the remainder $E_0$ and the remainder $R_0$ obtained in step 2, traverse along the Y axis to obtain all valid x values, as shown in FIG. 5, until $y = H$ or $x > W$. The invention performs the traversal by step iteration. The calculation is given by three equation images (Figures BDA0002697540650000051 to BDA0002697540650000053) that are not reproduced in this text,
where $n = 1, 2, \ldots, H - Y_0$.
Unlike the traditional Bresenham algorithm, this iteration does not require the slope of the edge to be reduced below 1; it handles all positive slopes.
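The step traversal can be illustrated with a quotient-and-remainder update in Python. The equation images are not available, so the recurrence used here (add $K$ to x and $R_0$ to the running remainder, and carry 1 into x whenever the remainder reaches $|A|$) is an assumption consistent with the definitions of $X_0$, $K$, $E_0$ and $R_0$. The function name `step_endpoints` is hypothetical, and `divmod` stands in for the hardware divider during set-up.

```python
def step_endpoints(a, b, c, y0, h, w):
    """Return the normalized endpoints X_n for y = y0 .. h.

    Expects a normalized edge (A < 0, B >= 0) with f(0, y0) >= 0.
    Traversal stops early once x exceeds w.
    """
    abs_a = -a
    f0 = b * y0 + c                    # f(0, Y0)
    if a >= 0 or b < 0 or f0 < 0:
        raise ValueError("edge is not in normalized form")
    x, e = divmod(f0, abs_a)           # X0 and remainder E0
    k, r0 = divmod(b, abs_a)           # K and remainder R0
    xs = [x]
    for _ in range(y0 + 1, h + 1):
        x += k
        e += r0
        if e >= abs_a:                 # accumulated remainder carries into x
            e -= abs_a
            x += 1
        if x > w:
            break
        xs.append(x)
    return xs


print(step_endpoints(-3, 7, 2, 0, 5, 64))
```

Each row costs two additions and one comparison, and the result matches `(b * y + c) // abs_a` evaluated directly, including for slopes greater than 1.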
Step 4: Restore the normalized coordinate. The normalization process is reversed to recover the coordinate value before normalization. The x value is restored as follows: $x = X$ (class 1), $x = W - 1 - X$ (classes 2 and 3), $x = X + 1$ (class 4), and $x = W - X$ (classes 5 and 6), where $X$ is the normalized endpoint coordinate before restoration.
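The restoration rule of step 4 maps directly onto a lookup by edge class. A minimal Python sketch, with the hypothetical name `restore_x`:

```python
def restore_x(edge_class, x_norm, w):
    """Map a normalized endpoint X back to bounding box x for classes 1..6."""
    if edge_class == 1:
        return x_norm
    if edge_class in (2, 3):
        return w - 1 - x_norm
    if edge_class == 4:
        return x_norm + 1
    if edge_class in (5, 6):
        return w - x_norm
    raise ValueError("edge class must be in 1..6")


print([restore_x(k, 5, 16) for k in range(1, 7)])
```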
(4) The span generation module obtains the two endpoints of the triangle at the same Y coordinate by comparing the endpoint coordinates of the left and right boundaries of the triangle, as follows:
Step 1: Determine the endpoint values. Determine the left and right boundaries from the parameters of the edge equations, then compare the endpoint values of boundaries of the same kind at the same Y coordinate with each other and with the bounding box coordinates. The maximum value among the left boundaries is selected as the left endpoint of the span ($P_2$ in the figure) and the minimum value among the right boundaries as the right endpoint ($P_3$ in the figure). The selected value must also lie in $[0, W)$: the left endpoint is set to 0 if it is less than 0, and to $W - 1$ if it is greater than $W$.
Step 2: Generate the complete span. The coordinate values of the two endpoints selected in step 1 are restored to the original coordinates, which yields a complete span.
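The endpoint selection for one scan line can be sketched as follows. The sketch assumes that both endpoints are clamped into $[0, W - 1]$ and that a row whose left endpoint lies beyond its right endpoint produces no span; neither detail is spelled out in the text. The function name `make_span` is hypothetical.

```python
def make_span(left_xs, right_xs, w):
    """Pick the span endpoints for one Y coordinate.

    left_xs / right_xs hold the endpoints produced by the left and right
    boundary edges on this row, in bounding box coordinates.
    """
    left = max(left_xs)                  # innermost left boundary
    right = min(right_xs)                # innermost right boundary
    left = min(max(left, 0), w - 1)      # clamp into [0, W)
    right = min(max(right, 0), w - 1)
    if left > right:
        return None                      # assumed: empty row, no span
    return left, right


print(make_span([2, 5], [9, 12], 16))
```

Adding $x_{min}$ to both values and $y_{min}$ to the row index would return the span to the original coordinates.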
(5) The span cache buffer stores the endpoint coordinates of the triangle spans, packs them and sends them to the back-end block generator for processing, specifically:
A group of consecutive spans is sent to the block generation unit for processing in each clock cycle; when the last span has been transmitted, the triangle end flag bit is set valid and a primitive end flag is sent to the block generator.
The present invention will be described in detail with reference to examples.
Examples
As shown in fig. 1, an apparatus for implementing triangle rasterization in a GPU is composed of the following parts.
(1) Vertex data cache buffer: used for reading the vertex coordinate attributes of the triangle and the parameter information of its three edges. The vertex coordinate attributes comprise the three vertex coordinates $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ of the triangle; the parameter information of the three edges consists of the parameters $(A_1, B_1, C_1)$, $(A_2, B_2, C_2)$, $(A_3, B_3, C_3)$ of the edge equation $f(x, y) = Ax + By + C = 0$.
(2) Edge function normalization module: converts the edge equations of the three edges of the triangle into edge equations of the same form through normalization and places them in the coordinate system of the triangle bounding box.
(3) Endpoint generation module: traverses the normalized edge equation along the Y coordinate to generate the corresponding normalized endpoint coordinate X, and restores the calculated normalized coordinate to the coordinate of the 6 classes of edge equations before normalization.
(4) Span generation module: compares the endpoint coordinates of the left and right boundaries of the triangle to obtain the two endpoints of the triangle at the same Y coordinate.
(5) Span cache buffer: stores the endpoint coordinates of the triangle spans and packs them to be sent to the back-end block generator for processing. A group of consecutive spans is sent to the block generation unit for processing in each clock cycle; when the last span has been transmitted, the triangle end flag bit is set valid and a primitive end flag is sent to the block generator.
FIG. 2 is a schematic diagram of the bounding box coordinate system of a triangle primitive. The coordinates of the triangle bounding box, i.e. the smallest rectangle that encloses the triangle, are determined. The upper-left corner of the bounding box is $(x_{min}, y_{min})$ and the lower-right corner is $(x_{max}, y_{max})$. The corresponding bounding box height $H$ is given by an equation image (Figure BDA0002697540650000061, rounded up) that is not reproduced in this text, and the bounding box width is $W = 2^m$, where $m$ is given by a second equation image (Figure BDA0002697540650000062), also not reproduced.
The upper-left corner of the bounding box is defined as (0, 0), so $0 \le x \le W$ and $0 \le y \le H$.
FIG. 3 is a schematic diagram of the division into triangle boundary and valid half-plane types. The types of the three edges are determined. Because the interior of the triangle is the valid data, the three edges are evaluated in the clockwise direction, and the half plane to the right of each boundary is valid, i.e. the points where $f(x, y) \ge 0$. According to the equation $f(x, y) = Ax + By + C$, the edges can be divided into the 6 classes in the figure; the shaded side of each of classes 1 to 6 is the half plane where $f(x, y) \ge 0$, a dotted line denotes a half-open plane and a solid line a half-closed plane. To avoid computing a boundary shared by different triangles twice, the left boundary of a triangle is by default a solid line and the right boundary a dotted line, i.e. $0 \le x < W$ and $0 \le y \le H$, which reduces repeated boundary calculations and saves hardware resources.
After the boundary types of the three edges have been determined, boundaries of different types are uniformly converted into the form $f(x, y) = Ax + By + C \ge 0$ ($A < 0$, $B \ge 0$) in order to simplify boundary traversal and save hardware resources. Let the original edge equation of any boundary be $f(x, y) = Ax + By + C$. Substituting $x \leftarrow x - x_{min}$, $y \leftarrow y - y_{min}$, $C \leftarrow C + A x_{min} + B y_{min}$ gives the edge equation $f(x, y) = Ax + By + C$ in the bounding box coordinate system. The valid edge equation is then converted into the form $f(x, y) = Ax + By + C \ge 0$ ($A < 0$, $B \ge 0$) with the following conversion relation:
$A = -|A|$, $B = |B|$, together with the conversion of $C$ given by an equation image (Figure BDA0002697540650000071) that is not reproduced in this text,
where $W$ is the bounding box width.
FIG. 4 is a schematic diagram of the boundary traversal that searches for the first valid endpoint. Since the bounding box already confines the x coordinate of the boundary to $[0, W)$, values of x outside this interval are meaningless. From the boundary equation $f(x, y) = Ax + By + C = 0$ ($A < 0$, $B \ge 0$) one obtains:
$$x = -\frac{By + C}{A} = \frac{f(0, y)}{|A|}$$
When traversing along $x \ge 0$, a valid endpoint $(X_0, Y_0)$, where $X_0$ and $Y_0$ are both integers, can be obtained only if $f(0, y) \ge 0$.
FIG. 5 is a schematic diagram of the step traversal that finds the valid endpoints. From the valid endpoint position, calculate the valid endpoint value $X_0$, the slope $K$ of the edge and the related parameters. To save hardware resources, the GPU performs these calculations in integer arithmetic, so all values are integers:
$$X_0 = \left\lfloor \frac{f(0, Y_0)}{|A|} \right\rfloor \text{ (rounded down)}, \qquad K = \left\lfloor \frac{|B|}{|A|} \right\rfloor$$
The remainder of $X_0$ is $E_0 = f(0, Y_0) \bmod |A|$, and the remainder of $K$ is $R_0 = |B| \bmod |A|$. To avoid the large amount of hardware resources that a divider would occupy, and because x is limited to $[0, W)$, the invention performs the division by a method based on binary addition iteration. The principle is as follows:
Let $b = a \cdot q + r$, where $q$ is the quotient, takes a value in $(0, 2^m)$ and can be expressed as an $(m+1)$-bit binary number $k_m k_{m-1} \ldots k_2 k_1 k_0$,
i.e. $q = k_0 + k_1 \cdot 2 + k_2 \cdot 2^2 + \ldots + k_m \cdot 2^m = k_0 + 2(k_1 + 2(k_2 + 2(\ldots + 2k_m)\ldots))$, so the binary value of $q$ and the remainder $r$ are obtained through $m + 1$ iterations.
Using the obtained initial value $X_0$, slope $K$, remainder $E_0$ and remainder $R_0$, traverse along the Y axis to obtain all valid x values until $y = H$ or $x > W$. The invention performs the traversal by step iteration. The calculation is given by three equation images (Figures BDA0002697540650000081 to BDA0002697540650000083) that are not reproduced in this text,
where $n = 1, 2, \ldots, H - Y_0$.
FIG. 6 is a schematic diagram of the selection of the left and right endpoints of a span. Determine the left and right boundaries from the parameters of the edge equations, then compare the endpoint values of boundaries of the same kind at the same Y coordinate with each other and with the bounding box coordinates. The maximum value among the left boundaries is selected as the left endpoint of the span ($P_2$ in the figure) and the minimum value among the right boundaries as the right endpoint ($P_3$ in the figure). The selected value must also lie in $[0, W)$: the left endpoint is set to 0 if it is less than 0, and to $W - 1$ if it is greater than $W$. The coordinate values of the two endpoints are then restored to the original coordinates, which yields a complete span.

Claims (8)

1. An apparatus for implementing triangle rasterization in a GPU, comprising:
a vertex data cache buffer, used for reading the vertex coordinate attributes of the triangle and the parameter information of its three edges;
an edge function normalization module, used for converting the edge equations of the three edges of the triangle into edge equations of the same form through normalization and placing them in the coordinate system of the triangle bounding box;
an endpoint generation module, used for traversing the normalized edge equation along the Y coordinate to generate the corresponding normalized endpoint coordinate X, and for restoring the calculated normalized coordinate to the coordinate of the edge equation before normalization;
a span generation module, used for obtaining the two endpoints of the triangle at the same Y coordinate by comparing the endpoint coordinates of the left and right boundaries of the triangle;
a span cache buffer, used for storing the endpoint coordinates of the triangle spans and packing them to be sent to the back-end block generator for processing.
2. The apparatus for implementing triangle rasterization in a GPU according to claim 1, wherein the vertex coordinate attributes of the triangle comprise the three vertex coordinates $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ of the triangle, and the parameter information of the three edges consists of the parameters $(A_1, B_1, C_1)$, $(A_2, B_2, C_2)$, $(A_3, B_3, C_3)$ of the edge equation $f(x, y) = Ax + By + C = 0$.
3. A method for implementing triangle rasterization in a GPU is characterized by comprising the following steps:
the vertex data cache buffer reads the vertex coordinate attributes of the triangle and the parameter information of its three edges;
the edge function normalization module converts the edge equations of the three edges of the triangle into edge equations of the same form through normalization and places them in the coordinate system of the triangle bounding box;
the endpoint generation module traverses the normalized edge equation along the Y coordinate to generate the corresponding normalized endpoint coordinate X, and restores the calculated normalized coordinate to the coordinate of the 6 classes of edge equations before normalization;
the span generation module obtains the two endpoints of the triangle at the same Y coordinate by comparing the endpoint coordinates of the left and right boundaries of the triangle;
the span cache buffer stores the endpoint coordinates of the triangle spans, packs them and sends them to the back-end block generator for processing.
4. The method according to claim 3, wherein the vertex coordinate attributes of the triangle comprise the three vertex coordinates $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ of the triangle, and the parameter information of the three edges consists of the parameters $(A_1, B_1, C_1)$, $(A_2, B_2, C_2)$, $(A_3, B_3, C_3)$ of the edge equation $f(x, y) = Ax + By + C = 0$.
5. The method according to claim 3, wherein the edge function normalization module transforms the edge equations of three edges of the triangle into the edge equations of the same form through normalization and places the edge equations in a triangle bounding box coordinate system, and specifically comprises the following steps:
step 1: determining the coordinates of the triangle bounding box, i.e. the smallest rectangle that encloses the triangle; the upper-left corner of the bounding box is $(x_{min}, y_{min})$ and the lower-right corner is $(x_{max}, y_{max})$; the corresponding bounding box height $H$ is given by an equation image (Figure FDA0002697540640000022) that is not reproduced in this text, and the bounding box width is $W = 2^m$, where $m$ is given by a second equation image (Figure FDA0002697540640000023), also not reproduced;
the upper-left corner of the bounding box is defined as (0, 0), so $0 \le x \le W$ and $0 \le y \le H$;
step 2: determining the types of the three edges; because the data inside the triangle is the valid data, the three edges are evaluated in the clockwise direction, so that the right half plane of each edge holds the valid data, namely the coordinates where f(x, y) ≥ 0; to avoid repeated calculation of the boundaries of different triangles in the GPU, by default the left boundary of a triangle is a solid line and the right boundary is a dotted line, namely the relative coordinate range is 0 ≤ x < W and 0 ≤ y ≤ H; then, according to the edge equation f(x, y) = Ax + By + C, the half plane representing the valid data on the inner side of an edge is either a half-closed plane f(x, y) = Ax + By + C ≥ 0 for a left boundary or a half-open plane f(x, y) = Ax + By + C > 0 for a right boundary; accordingly, the edges are classified into the following 6 types:
class 1: when A < 0 and B ≥ 0, f(x, y) = Ax + By + C > 0 is a half-open plane;
class 2: when A < 0 and B < 0, f(x, y) = Ax + By + C > 0 is a half-open plane;
class 3: when A = 0 and B < 0, f(x, y) = Ax + By + C > 0 is a half-open plane;
class 4: when A > 0 and B ≤ 0, f(x, y) = Ax + By + C ≥ 0 is a half-closed plane;
class 5: when A > 0 and B > 0, f(x, y) = Ax + By + C ≥ 0 is a half-closed plane;
class 6: when A = 0 and B ≥ 0, f(x, y) = Ax + By + C ≥ 0 is a half-closed plane;
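The six-way classification can be illustrated with a minimal Python sketch. The function names `classify_edge` and `covers` are hypothetical and do not appear in the claims; the sketch only encodes the sign tests on A and B listed above, and treats "half-closed" as f ≥ 0 and "half-open" as f > 0.

```python
from typing import Tuple


def classify_edge(a: int, b: int) -> Tuple[int, bool]:
    """Return (class number 1-6, closed) for edge coefficients A and B.

    closed=True  -> half-closed plane, the edge itself is covered (f >= 0).
    closed=False -> half-open plane, the edge itself is excluded (f > 0).
    """
    if a < 0:
        return (1, False) if b >= 0 else (2, False)
    if a > 0:
        return (4, True) if b <= 0 else (5, True)
    # a == 0: the sign of B alone decides between class 3 and class 6
    return (3, False) if b < 0 else (6, True)


def covers(a: int, b: int, c: int, x: int, y: int) -> bool:
    """Test whether (x, y) lies in the valid half plane of one edge."""
    _, closed = classify_edge(a, b)
    f = a * x + b * y + c
    return f >= 0 if closed else f > 0
```

A point is inside the triangle when `covers` holds for all three edges; a point lying exactly on an edge is then claimed by only one of the two half planes that meet there.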
step 3: judging the boundary types of the three edges of the triangle in the manner of step 2, and uniformly converting the boundaries of the different types into a boundary of the form f(x, y) = Ax + By + C ≥ 0 with A < 0 and B ≥ 0 for processing; letting the original edge equation of any boundary be f(x, y) = Ax + By + C, the quantities x, y and C are replaced, namely x' = x - x_min, y' = y - y_min, C' = C + A·x_min + B·y_min, to obtain the edge equation f(x', y') = Ax' + By' + C' in the bounding box coordinate system; the valid edge equation is then converted into the form f(x, y) = Ax + By + C ≥ 0 with A < 0 and B ≥ 0, the conversion relation being as follows:
Figure FDA0002697540640000021
where W is the bounding box width.
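The change of origin in step 3 can be checked with a short Python sketch. It covers only the translation into bounding box coordinates (the substitution for x, y and C); the per-class conversion relation is given in a formula image that is not reproduced here, so it is left out. The names `Edge`, `bounding_box_origin`, `to_bbox_coords` and `evaluate` are hypothetical.

```python
from typing import Iterable, Tuple

Edge = Tuple[int, int, int]  # (A, B, C) of f(x, y) = A*x + B*y + C


def bounding_box_origin(vertices: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Upper-left corner (x_min, y_min) of the triangle bounding box."""
    xs, ys = zip(*vertices)
    return min(xs), min(ys)


def to_bbox_coords(edge: Edge, x_min: int, y_min: int) -> Edge:
    """Re-express an edge in bounding box coordinates.

    Substituting x = x' + x_min and y = y' + y_min leaves A and B
    unchanged and gives the new constant C + A*x_min + B*y_min.
    """
    a, b, c = edge
    return a, b, c + a * x_min + b * y_min


def evaluate(edge: Edge, x: int, y: int) -> int:
    """Value of the edge function at (x, y)."""
    a, b, c = edge
    return a * x + b * y + c
```

For any point, evaluating the original edge at (x, y) and the translated edge at (x - x_min, y - y_min) gives the same value, which is why the inside test is unaffected by the change of origin.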
6. The method according to claim 3, wherein the endpoint generation module traverses the normalized edge equation along the Y coordinate to generate a corresponding normalized endpoint coordinate X, and restores the calculated normalized coordinate to the coordinate of the 6 classes of edge equations before normalization, and specifically includes the following steps:
step 1: boundary traversal, used to find the position where the valid boundary endpoints start; since the bounding box already confines the x coordinate of the boundary to [0, W), values of x outside this interval are meaningless; from the boundary equation f(x, y) = Ax + By + C = 0 with A < 0 and B ≥ 0, the following is obtained:
Figure FDA0002697540640000031
with A < 0; when traversing along x ≥ 0, a valid endpoint (X_0, Y_0) is obtained only if f(0, y) ≥ 0, where X_0 and Y_0 are both integers;
step 2: calculating the stepping parameters: from the valid endpoint position found, the value X_0 of the valid endpoint, the slope K of the edge and the related parameters are calculated; in the GPU these calculations use integer arithmetic and all values are integers, so the integer values obtained are
Figure FDA0002697540640000032
together with the remainder of X_0, E_0 = f(0, Y_0) mod |A|, and the remainder of K, R_0 = |B| mod |A|; the division is carried out by a method based on binary addition iteration, whose principle is as follows: let b = a·q + r, where q is the quotient, takes a value in (0, 2^m) and can be expressed as an (m+1)-bit binary number k_m k_(m-1) ... k_2 k_1 k_0, i.e. q = k_0 + k_1·2 + k_2·2^2 + ... + k_m·2^m = k_0 + 2·(k_1 + 2·(k_2 + 2·(... + 2·k_m)...)); the binary value of q is obtained by m+1 iterations, together with the remainder r;
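The bit-by-bit division principle can be sketched in Python as a shift-and-subtract (restoring) division. This is one common way to build a quotient one binary digit per iteration; the claim does not describe the actual circuit, so the function `divide_binary` and its `bits` parameter (standing for m+1) are assumptions made for illustration.

```python
from typing import Tuple


def divide_binary(b: int, a: int, bits: int) -> Tuple[int, int]:
    """Return (q, r) with b == a*q + r and 0 <= r < a.

    The quotient is built one bit per iteration, from bit `bits - 1`
    down to bit 0, so exactly `bits` iterations are needed.
    Requires a > 0 and 0 <= b < a * 2**bits.
    """
    if a <= 0 or b < 0 or b >= (a << bits):
        raise ValueError("operands out of range for the given bit width")
    q, r = 0, b
    for i in range(bits - 1, -1, -1):
        trial = a << i      # a * 2**i
        q <<= 1             # Horner step: q = 2*q + k_i
        if r >= trial:      # quotient bit k_i is 1
            r -= trial
            q |= 1
    return q, r
```

For example, `divide_binary(23, 5, 3)` produces the quotient bits 1, 0, 0 in three iterations, giving q = 4 and r = 3.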
step 3: stepwise traversal: using the initial value X_0, the slope K and the remainders E_0 and R_0 obtained in step 2, traversing along the Y axis to obtain all valid x values until y = H or x > W; the traversal uses a stepping iteration, calculated as follows:
Figure FDA0002697540640000033
letting
Figure FDA0002697540640000034
gives
Figure FDA0002697540640000035
where n = 1, 2, …, H - Y_0;
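The stepping formulas are given in images that are not reproduced here, so the following Python sketch shows a standard quotient-and-remainder stepping scheme that is consistent with the quantities X_0, K, E_0 and R_0 named in step 2. It is an assumption about how those quantities combine, not a transcription of the patent's formulas; the function name `step_endpoints` is hypothetical.

```python
from typing import List


def step_endpoints(a: int, b: int, c: int, y0: int,
                   height: int, width: int) -> List[int]:
    """Endpoint X for rows y0, y0+1, ... of the edge a*x + b*y + c = 0.

    Assumes the normalized form a < 0, b >= 0 and f(0, y0) >= 0.
    Stops once y exceeds `height` or x exceeds `width`.
    """
    abs_a = -a
    f0 = b * y0 + c               # f(0, Y0)
    x, e = divmod(f0, abs_a)      # X0 and its remainder E0
    k, r0 = divmod(b, abs_a)      # slope K and its remainder R0
    xs = []
    y = y0
    while y <= height and x <= width:
        xs.append(x)
        x += k                    # integer part of the slope
        e += r0                   # accumulate the fractional part
        if e >= abs_a:            # remainder overflow carries into x
            e -= abs_a
            x += 1
        y += 1
    return xs
```

Each row costs two additions and one comparison, with no division inside the loop; the result matches the direct evaluation floor((B·y + C) / |A|) for every row.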
step 4: restoring the normalized coordinates: reversing the normalization process to recover the coordinate values before normalization, the x value being restored as follows: class 1: x = X; class 2 and class 3: x = W - 1 - X; class 4: x = X + 1; class 5 and class 6: x = W - X; where X is the endpoint coordinate before restoration.
7. The method according to claim 3, wherein the span generation module obtains two endpoints of the triangle at the same Y coordinate by comparing the coordinates of the endpoints of the left and right boundaries of the triangle, and specifically comprises the following steps:
step 1: determining the endpoint values: the left and right boundaries are determined from the parameters of the edge equations, and the endpoint values of boundaries of the same type at the same Y coordinate are compared with the bounding box coordinates; the maximum value among the left boundaries is selected as the left endpoint of the span and the minimum value among the right boundaries as the right endpoint, while ensuring that the selected values lie within [0, W]: if the left endpoint is less than 0, it is set to 0, and if the right endpoint is greater than W, it is set to W - 1;
step 2: generating the complete span: the coordinate values of the two endpoints selected in step 1 are restored to the original coordinates, which yields a complete span.
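The endpoint selection of claim 7 can be sketched in Python as follows. The names `make_span` and `restore_span` are hypothetical. Two points are assumptions not stated in the claim: rows where the left endpoint ends up beyond the right endpoint are reported as empty, and "restoring to the original coordinates" is taken to mean adding back x_min, the inverse of the shift used during normalization.

```python
from typing import Iterable, Optional, Tuple


def make_span(left_xs: Iterable[int], right_xs: Iterable[int],
              width: int) -> Optional[Tuple[int, int]]:
    """Pick the span endpoints for one row, in bounding box coordinates.

    Takes the largest left-boundary endpoint and the smallest
    right-boundary endpoint, then clamps them as the claim describes.
    Returns None for an empty row (assumption, see lead-in).
    """
    left = max(left_xs)
    right = min(right_xs)
    if left < 0:
        left = 0
    if right > width:
        right = width - 1
    if left > right:
        return None
    return left, right


def restore_span(span: Tuple[int, int], x_min: int) -> Tuple[int, int]:
    """Shift a span back to original x coordinates (assumed inverse shift)."""
    return span[0] + x_min, span[1] + x_min
```

Taking the maximum on the left and the minimum on the right keeps only the part of the row that satisfies every edge at once, which is the intersection of the half planes restricted to that row.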
8. The method according to claim 3, wherein the span buffer sends a group of consecutive span bands to the block generator for processing in each clock cycle and, when the last span band has been sent, sets the triangle end flag bit valid and sends the primitive end flag to the block generator.
CN202011010966.5A 2020-09-23 2020-09-23 Device and method for realizing triangle rasterization in GPU Pending CN112233008A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011010966.5A CN112233008A (en) 2020-09-23 2020-09-23 Device and method for realizing triangle rasterization in GPU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011010966.5A CN112233008A (en) 2020-09-23 2020-09-23 Device and method for realizing triangle rasterization in GPU

Publications (1)

Publication Number Publication Date
CN112233008A true CN112233008A (en) 2021-01-15

Family

ID=74108607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011010966.5A Pending CN112233008A (en) 2020-09-23 2020-09-23 Device and method for realizing triangle rasterization in GPU

Country Status (1)

Country Link
CN (1) CN112233008A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020171644A1 (en) * 2001-03-31 2002-11-21 Reshetov Alexander V. Spatial patches for graphics rendering
JP2007141196A (en) * 2005-11-15 2007-06-07 Kaadeikku Corporation:Kk Polygon/silhouette line/anti-aliasing circuit
US8040357B1 (en) * 2007-08-15 2011-10-18 Nvidia Corporation Quotient remainder coverage system and method
US20140362101A1 (en) * 2013-06-10 2014-12-11 Sony Computer Entertainment Inc. Fragment shaders perform vertex shader computations
US20150262407A1 (en) * 2014-03-13 2015-09-17 Imagination Technologies Limited Object Illumination in Hybrid Rasterization and Ray Traced 3-D Rendering
CN108510565A (en) * 2018-03-27 2018-09-07 长沙景嘉微电子股份有限公司 A kind of apparatus and method realized line segment and turn triangle drafting in GPU

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MICHAEL DEERING et al.: "The Triangle Processor and Normal Vector Shader: A VLSI System for High Performance Graphics", COMPUTER GRAPHICS, vol. 22, no. 4, pages 21 - 30, XP000618778, DOI: 10.1145/378456.378468 *
小水VV: "Triangle Rasterization" (三角形光栅化), pages 5, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/138732905> *
张加林: "A Triangle Rasterization Technique Based on an Improved Bresenham Algorithm" (一种基于改进 Bresenham 算法的三角形光栅化技术), Electronic Measurement Technology (电子测量技术), vol. 42, no. 10, pages 86 - 89 *
桑来: "Graphics Fundamentals | Triangle Rasterization" (图形学基础 | 三角形光栅化), pages 1, Retrieved from the Internet <URL:https://www.csdn.net/> *

Similar Documents

Publication Publication Date Title
Karnewar et al. Relu fields: The little non-linearity that could
CN113178014B (en) Scene model rendering method and device, electronic equipment and storage medium
CN108038897B (en) Shadow map generation method and device
JP2011238213A (en) Hierarchical bounding of displaced parametric curves
US20020000996A1 (en) Method for progressively constructing morphs
CN106960470B (en) Three-dimensional point cloud curved surface reconstruction method and device
CN107203962B (en) Method for making pseudo-3D image by using 2D picture and electronic equipment
CN109146808A (en) A kind of portrait U.S. type method and system
CN108537872B (en) Image rendering method, mobile device and cloud device
CN111462205B (en) Image data deformation, live broadcast method and device, electronic equipment and storage medium
CN113592711A (en) Three-dimensional reconstruction method, system and equipment for point cloud data nonuniformity and storage medium
CN113256782B (en) Three-dimensional model generation method and device, storage medium and electronic equipment
Schollmeyer et al. Direct trimming of NURBS surfaces on the GPU
AU2002258107B2 (en) Generating smooth feature lines for subdivision surfaces
CN112233008A (en) Device and method for realizing triangle rasterization in GPU
CN112465946A (en) Ripple rendering method and device, electronic equipment and computer readable medium
CN108960203B (en) Vehicle detection method based on FPGA heterogeneous computation
JP2005332028A (en) Method and apparatus for generating three-dimensional graphic data, generating texture image, and coding and decoding multi-dimensional data, and program therefor
CN112927123A (en) GPU accelerated directed distance field symbolic modeling method
US10593111B2 (en) Method and apparatus for performing high throughput tessellation
JP2000251095A (en) Method and device for dividing area of polygon mesh and information recording medium
CN117726774B (en) Triangle rasterization method and device based on line generation algorithm and related equipment
CN116310060B (en) Method, device, equipment and storage medium for rendering data
KR100283071B1 (en) Fast Texture Mapping Method
US20080218520A1 (en) Acceleration of Triangle Scan Conversion Through Minor Direction Detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 222001 No.18 Shenghu Road, Lianyungang City, Jiangsu Province

Applicant after: The 716th Research Institute of China Shipbuilding Corp.

Address before: 222001 No.18 Shenghu Road, Lianyungang City, Jiangsu Province

Applicant before: 716TH RESEARCH INSTITUTE OF CHINA SHIPBUILDING INDUSTRY Corp.