CN116883228B - GPU pixel fill rate measurement method

Info

Publication number: CN116883228B
Application number: CN202311158256.0A
Authority: CN
Inventors: 沈晔; 解文华; 付秋; 彭获然
Original assignee: Wuhan Lingjiu Microelectronics Co ltd
Current assignee: Wuhan Lingjiu Microelectronics Co ltd
Priority date: 2023-09-08
Filing date: 2023-09-08
Publication date: 2023-12-01
Anticipated expiration: 2043-09-08
Also published as: CN116883228A

Abstract

The invention provides a GPU pixel fill rate measurement method comprising the following steps: writing a pixel fill rate measurement program using a general-purpose graphics interface of the GPU; when the general-purpose graphics interface receives a drawing instruction, the GPU fetches the resource data to be drawn from video memory, and the general-purpose graphics interface invokes the pixel fill rate measurement program to draw and render the resource data; and calculating the GPU pixel fill rate from the rendering result. Because the program is written against a general-purpose high-performance drawing interface, the same program can measure the pixel fill rate of different GPUs. The method is therefore broadly applicable while keeping the measurement accurate, and it addresses the lack of pixel fill rate measurement tools.

Description

GPU pixel fill rate measurement method

Technical Field

The invention relates to the field of GPU testing, and in particular to a GPU pixel fill rate measurement method.

Background

The "pixel fill rate" is the number of pixels a GPU (Graphics Processing Unit) can render per second, and it is the most commonly used indicator of GPU pixel processing performance. There is no standard formula for the pixel fill rate. It is usually calculated by multiplying the number of raster output units by the GPU core clock frequency, which yields the theoretical value published by the GPU manufacturer. This theoretical value comes from the manufacturer, however, and the actual pixel fill rate may differ considerably from it.
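The theoretical figure described above can be sketched as a one-line calculation. This is an illustrative sketch only: the function name and the example numbers (64 raster output units at 1.5 GHz) are assumptions chosen for the arithmetic and do not describe any particular GPU.

```python
def theoretical_fill_rate(rop_count: int, core_clock_hz: float) -> float:
    """Theoretical pixel fill rate in pixels per second.

    Follows the rule of thumb from the text: number of raster output
    units multiplied by the GPU core clock frequency.
    """
    return rop_count * core_clock_hz


# Hypothetical GPU: 64 raster output units at 1.5 GHz (made-up numbers).
rate = theoretical_fill_rate(64, 1.5e9)
print(f"{rate / 1e9:.1f} GPixel/s")  # prints "96.0 GPixel/s"
```

A measured fill rate can then be compared against this value to see how far the actual result falls from the theoretical one.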

The GPU receives a rendering command and renders using the prepared resource data. The most efficient way to measure the pixel fill rate is to write the drawing instructions directly into the GPU's instruction registers, let the GPU read the resource data from video memory and render it, and then compute the result from the number of pixels rendered per unit time. However, because each GPU vendor on the market uses its own proprietary instruction set, writing instructions directly to registers is not a general-purpose approach.

Currently, there is no established pixel fill rate measurement method on the market. How to measure the GPU pixel fill rate with software written against a general-purpose graphics interface, while keeping the measurement accurate, is a problem that urgently needs to be solved.

Disclosure of Invention

In view of the technical problems in the prior art, the invention provides a GPU pixel fill rate measurement method.

The method comprises the following steps:

writing a pixel fill rate measurement program using a general-purpose graphics interface of the GPU;

when the general-purpose graphics interface receives a drawing instruction, the GPU fetches the resource data to be drawn from video memory, and the general-purpose graphics interface invokes the pixel fill rate measurement program to draw and render the resource data;

and calculating the GPU pixel fill rate based on the rendering result.

On the basis of the above technical solution, the invention can be further improved as follows.

Optionally, writing the pixel fill rate measurement program using the general-purpose graphics interface of the GPU includes:

creating a drawing window through the general-purpose graphics interface and setting the fixed-function pipeline rendering state, which includes disabling unnecessary tests and setting the viewport;

and writing the shader source code, compiling it, and linking the shader program, wherein the shader source code determines the programmable pipeline rendering flow.

Optionally, creating the drawing window through the general-purpose graphics interface and setting the fixed-function pipeline rendering state includes:

obtaining the current pixel resolution W × H of the display and creating a drawing window of width W and height H;

setting the viewport origin to (0, 0) and the viewport width and height equal to those of the drawing window, so that the viewport fills the entire drawing area of the window;

and disabling unnecessary functions that add rendering overhead, including the depth test, stencil test, and alpha test.

Optionally, writing the shader source code, compiling it, and linking the shader program to determine the programmable pipeline rendering flow includes:

writing and compiling the vertex shader source code, in which the output vertex coordinate is set as gl_Position = gl_Vertex, wherein gl_Vertex is a built-in variable of the vertex shader that receives the vertex coordinate data to be drawn from video memory, gl_Position is the output vertex coordinate data, and the vertex shader serves to assemble the incoming vertex coordinate data into primitives;

writing and compiling the fragment shader source code, in which an output color value is set for shading the fragments produced by rasterizing the primitives;

and linking the compiled vertex shader and fragment shader into a shader program that determines the pipeline flow during rendering.

Optionally, the method further comprises:

allocating a vertex buffer in video memory to store the vertex coordinate data to be drawn;

setting the storage data type of the vertex buffer to a single-byte data type supported by the general-purpose graphics interface, and transferring the vertex coordinate data to be drawn to the vertex buffer in video memory;

and binding the vertex coordinate data in that video memory region to the vertex shader's built-in variable gl_Vertex, so that the GPU reads the vertex coordinate data from the specified video memory region during drawing.

Optionally, the step in which, when the general-purpose graphics interface receives the drawing instruction, the GPU fetches the resource data to be drawn from video memory and the general-purpose graphics interface invokes the pixel fill rate measurement program to draw and render the resource data includes:

setting the size of the vertex coordinate data fetched by the GPU to 2 × sizeof(GLbyte), meaning that the GPU passes 2 bytes to the vertex shader at a time, where GLbyte is a single-byte data type; that is, one pair {x, y} is fetched each time as the coordinate data of one vertex, x and y being the horizontal and vertical coordinates of the vertex;

setting the stride of the vertex coordinate data to 0, meaning that all vertex coordinate pairs {x, y} are tightly packed with no gap between consecutive pairs;

and, once the data is prepared, executing the drawing process, in which the GPU renders from the fetched vertex coordinate data.

Optionally, the GPU rendering from the fetched vertex coordinate data includes:

setting the maximum number n of repeated draws and recording the current system time t₀;

repeatedly submitting and executing the drawing command, the GPU rendering the fetched vertex coordinate data into a window-filling rectangle according to the configured rendering pipeline, wherein the drawing command specifies that the 6 vertex coordinates input each time are assembled into 2 triangles, and the two triangles form one rectangle;

counting the submitted drawing commands and, when the count reaches the maximum, waiting for the GPU to finish executing all submitted drawing commands, then recording the current system time t₁;

and calculating the GPU pixel fill rate from the recorded times t₀ and t₁, the number of draws n, and the size of each drawn rectangle.

Optionally, the measured pixel fill rate is P = W × H × n / (t₁ - t₀), where W × H is the size of the drawn rectangle, which fills the entire drawing window.

With the GPU pixel fill rate measurement method provided by the invention, a single program suitable for measuring the pixel fill rate of different GPUs is written against a general-purpose high-performance drawing interface. The method is broadly applicable while keeping the measurement accurate, and it addresses the lack of pixel fill rate measurement tools.

Drawings

FIG. 1 is a flowchart of the GPU pixel fill rate measurement method provided by the invention;

FIG. 2 is a schematic diagram of a typical GPU task flow;

FIG. 3 is a schematic diagram of the GPU rendering vertex coordinate data;

FIG. 4 is an overall flowchart of the GPU pixel fill rate measurement method.

Detailed Description

To make the objects, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention. In addition, the technical features of the embodiments, or of a single embodiment, may be combined with one another freely to form a feasible technical solution. Such a combination is not limited by the order of steps or by structural composition, but it must be one that a person of ordinary skill in the art can implement; where a combination is contradictory or cannot be implemented, it is deemed not to exist and falls outside the scope of protection claimed.

Because existing GPU fill rate measurement methods are not general-purpose, the invention writes a pixel fill rate measurement program on top of common high-performance graphics interfaces such as OpenGL and Vulkan, which yields a reasonably accurate approximation of the pixel fill rate while remaining general-purpose.

Compared with drawing by writing instructions directly into registers, drawing through a general-purpose interface easily introduces redundant rendering steps. The measurement is also more likely to be affected by factors such as bottlenecks in data exchange between main memory and GPU video memory, unnecessary rendering steps, inefficient or even incorrect measurement methods, and insufficient measurement time, all of which enlarge the deviation between the measured and theoretical results.

Accordingly, the invention provides a GPU pixel fill rate measurement method. As shown in FIG. 1, the method includes:

Step 1: writing a pixel fill rate measurement program using a general-purpose graphics interface of the GPU.

As an embodiment, writing the pixel fill rate measurement program using the general-purpose graphics interface of the GPU includes: creating a drawing window through the general-purpose graphics interface and setting the fixed-function pipeline rendering state, which includes disabling unnecessary tests and setting the viewport; and writing the shader source code, compiling it, and linking the shader program, wherein the shader source code determines the programmable pipeline rendering flow.

It will be understood that the pixel fill rate is the number of pixels a graphics processing unit renders per second, expressed in MPixel/s (megapixels per second) or GPixel/s (gigapixels per second), and is the most common indicator of the pixel processing performance of current graphics cards. The rendering pipelines of a graphics card are an important part of the display core: a set of dedicated parallel channels in the display core responsible for assigning colors to graphics. The more rendering pipelines there are, and the higher the operating frequency of each pipeline (generally the core clock of the graphics card), the higher the card's fill rate and performance, so the performance of a graphics card can be roughly judged from its pixel fill rate.

The invention writes the pixel fill rate measurement program on a general-purpose high-performance graphics interface of the GPU such as OpenGL or Vulkan. The steps of creating a drawing window through the general-purpose graphics interface and setting the fixed-function pipeline rendering state include:

Setting the viewport origin to (0, 0) and the viewport width and height equal to the window dimensions, so that the viewport fills the entire drawing area of the window; the rendered graphics appear within the viewport.

Disabling unnecessary functions that add rendering overhead, including the depth test, stencil test, alpha test, and so on.

The viewport is made to fill the entire drawing area of the window so that the drawing area is as large as possible. The GPU can then render as many pixels as possible in one frame, which reduces the number of frames needed to render a given number of pixels and with it the extra overhead of GPU state switching between frames.
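The effect of the drawing-area size on the number of frames can be illustrated with simple arithmetic. The sketch below is an assumption-laden illustration: the function name, the 1920 × 1080 window, the quarter-size 480 × 270 viewport, and the pixel budget of one billion are all hypothetical values chosen only to show the trend.

```python
def draws_needed(total_pixels: int, width: int, height: int) -> int:
    """Number of full-rectangle draws needed to render `total_pixels` pixels."""
    pixels_per_draw = width * height
    # Integer ceiling division, so a partial final draw still counts as one draw.
    return -(-total_pixels // pixels_per_draw)


TOTAL = 10**9  # hypothetical pixel budget for one measurement

print(draws_needed(TOTAL, 1920, 1080))  # full-window viewport: 483
print(draws_needed(TOTAL, 480, 270))    # smaller viewport: 7717
```

With the smaller viewport, roughly sixteen times as many draws are required for the same pixel count, so any per-frame overhead is paid that many more times.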

As an embodiment, writing the shader source code includes writing the vertex shader source code and the fragment shader source code, and mainly comprises the following steps:

Writing the vertex shader source code: the output vertex coordinate is set as gl_Position = gl_Vertex, and the vertex shader is then compiled. The vertex shader only assembles the incoming vertex coordinate data into primitives and performs no additional computation. Here gl_Vertex is a built-in variable of the vertex shader that receives the vertex coordinate data prepared in video memory, and gl_Position outputs the received vertex coordinate data for primitive assembly.

Writing the fragment shader source code: the output color value is set as gl_FragColor = vec4(1.0f, 1.0f, 1.0f, 1.0f), and the fragment shader source code is then compiled. Here vec4(1.0f, 1.0f, 1.0f, 1.0f) is a vector constant that represents white in RGBA format, and gl_FragColor is a built-in variable of the fragment shader whose color value is used to shade the fragments produced by rasterizing the primitives. This step sets the fragment color to a constant instead of passing the color in dynamically, in order to reduce the overhead of the GPU reading video memory during the fragment shading stage.
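A minimal pair of shaders matching this description might look as follows. This is a sketch under stated assumptions: the sources are held as Python strings purely for illustration, the `#version 120` line is an assumed choice of a legacy GLSL version in which the built-ins gl_Vertex and gl_FragColor are available, and the constant names are hypothetical. Compiling and linking them requires an OpenGL context, which is not shown.

```python
# GLSL sources kept as plain strings; compiling them needs a live OpenGL context.

# Vertex shader: pass the incoming position through with no extra computation.
VERTEX_SHADER_SRC = """#version 120
void main() {
    gl_Position = gl_Vertex;
}
"""

# Fragment shader: write a constant opaque white, so no color data is read from memory.
FRAGMENT_SHADER_SRC = """#version 120
void main() {
    gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}
"""

print(VERTEX_SHADER_SRC)
print(FRAGMENT_SHADER_SRC)
```

Neither shader declares uniforms, varyings, or textures, which keeps per-vertex and per-fragment work to the minimum the text calls for.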

Step 2: when the general-purpose graphics interface receives a drawing instruction, the GPU fetches the resource data to be drawn from video memory, and the general-purpose graphics interface invokes the pixel fill rate measurement program to draw and render the resource data.

Before rendering, a vertex buffer is allocated in video memory to store the vertex coordinate data to be drawn. Specifically, the vertex coordinate data {1, -1}, {1, -1}, and {1, 1} are prepared, the storage data type is set to a single-byte data type supported by the graphics interface (GLbyte is used as the example here), and the vertex data is transferred to the allocated video memory region. The vertex coordinate format is {x, y}, and the 2 triangles drawn from this data join exactly into 1 rectangle that fills the viewport. GLbyte and similar types are used because they occupy the fewest bytes, only 1 byte, while still meeting the storage needs of the vertex coordinate data. Because the vertex coordinate data is prepared before drawing and no data is transferred during drawing, the GPU overhead caused by data transfer is effectively reduced; storing the coordinates in the data type with the fewest bytes minimizes the time the GPU spends reading the data from video memory.
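The storage saving from single-byte coordinates can be sketched by packing six {x, y} vertices with Python's `struct` module. The specific vertex order below is an assumption for illustration (two triangles sharing a diagonal, covering the range -1 to 1 on both axes); the names `VERTICES` and `pack_vertices` are hypothetical.

```python
import struct

# Assumed vertex order: two triangles that together cover the square from
# (-1, -1) to (1, 1). The exact ordering is an illustrative choice.
VERTICES = [(-1, -1), (1, -1), (1, 1),
            (-1, -1), (1, 1), (-1, 1)]


def pack_vertices(vertices):
    """Pack {x, y} pairs as tightly packed signed bytes (one byte per component)."""
    flat = [component for xy in vertices for component in xy]
    return struct.pack(f"{len(flat)}b", *flat)


data = pack_vertices(VERTICES)
print(len(data))                 # 12 bytes for 6 vertices
print(struct.calcsize("12f"))    # 48 bytes if 32-bit floats were used instead
```

Twelve bytes per draw is a quarter of what the same vertices would occupy as 32-bit floats, which is the kind of reduction in video memory reads the paragraph above is aiming for.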

It should be noted that the pixel fill rate is determined by the specification of the display core, that is, by its computing capability. For that capability to be fully exploited, however, a good transfer channel is also needed: the video memory specification must be well matched so that it does not become a bottleneck. Reducing the influence of video memory so that data transfer does not become the drawing bottleneck is a key consideration of the method. The invention therefore prepares all required resource data before the drawing loop and transfers no data during drawing, and it uses the most minimal vertex coordinate data and shader program so that the GPU performs only the necessary rendering work.

GPU efficiency mainly refers to the utilization of the GPU over a time slice, that is, the ratio of the time the GPU is effectively running to the running time of the program.

A typical GPU task flow is shown in FIG. 2: a GPU task alternates between computing on the CPU and on the GPU. When CPU computation becomes the bottleneck, the GPU has to wait, sits idle, and its utilization is low. The optimization direction is to shorten every computation stage that uses the CPU and to reduce the extent to which CPU computation blocks the GPU. Common CPU-side operations include data loading and data preprocessing. The method completes the transfer of data from main memory to video memory before the drawing loop starts and performs no data loading or preprocessing during drawing, which improves overall efficiency. When video memory is the bottleneck, the speed at which the GPU reads and writes video memory is limited and drags down the overall running speed, so the method reduces the amount of data the GPU reads from video memory as far as possible by using the smallest data type and the minimum necessary number of vertices.
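The utilization ratio defined above can be made concrete with a toy timeline. The sketch assumes a simplified model in which a task is a sequence of non-overlapping CPU and GPU phases; the function name and all durations are hypothetical.

```python
def gpu_utilization(phases):
    """Ratio of GPU-busy time to total run time.

    `phases` is a list of (unit, seconds) tuples, where unit is "cpu" or "gpu".
    Phases are assumed not to overlap (a deliberately simple model).
    """
    total = sum(seconds for _, seconds in phases)
    busy = sum(seconds for unit, seconds in phases if unit == "gpu")
    return busy / total


# CPU work interleaved with drawing: the GPU idles while the CPU loads data.
interleaved = [("cpu", 2.0), ("gpu", 6.0), ("cpu", 2.0)]
# All data prepared up front: only a short CPU setup phase remains.
prepared = [("cpu", 0.5), ("gpu", 9.5)]

print(gpu_utilization(interleaved))  # 0.6
print(gpu_utilization(prepared))     # 0.95
```

Moving the CPU-side work out of the drawing loop raises the ratio, which is the optimization direction the paragraph describes.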

After the vertex shader and fragment shader are compiled through the general-purpose graphics interface, the vertex coordinate data is bound to the vertex shader's built-in variable gl_Vertex so that the GPU reads the vertex position data from the specified video memory region during drawing.

When the general-purpose graphics interface receives a drawing command, the GPU fetches the vertex coordinate data to be drawn from the vertex buffer in video memory. The size of the vertex coordinate data is set to 2 × sizeof(GLbyte), meaning that the GPU passes 2 bytes to the vertex shader at a time, that is, one pair {x, y} is fetched each time as the coordinate data of one vertex for rendering. The stride of the vertex coordinates is set to 0, meaning that all vertex coordinate pairs {x, y} are tightly packed with no gap between consecutive pairs. Once the data is prepared, the drawing process is executed and the GPU renders from the data in video memory.
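The meaning of a stride of 0 can be shown by computing where each vertex starts in the buffer. The sketch below models the OpenGL convention that a stride of 0 means "tightly packed", so the effective stride equals the size of one vertex; the function name and parameters are hypothetical and no graphics API is called.

```python
def attribute_offsets(vertex_count, components=2, component_size=1, stride=0):
    """Byte offset of each vertex in a vertex buffer.

    A stride of 0 is treated as tightly packed: the effective stride is
    components * component_size (2 * sizeof(GLbyte) = 2 bytes here).
    """
    effective_stride = stride if stride != 0 else components * component_size
    return [i * effective_stride for i in range(vertex_count)]


print(attribute_offsets(6))            # [0, 2, 4, 6, 8, 10]
print(attribute_offsets(3, stride=4))  # [0, 4, 8], 2 padding bytes per vertex
```

With six two-byte vertices and no padding, the whole buffer read per draw is 12 bytes.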

Step 3: calculating the GPU pixel fill rate based on the rendering result.

It will be understood that the process of the GPU rendering the vertex coordinate data includes:

setting the maximum number n of repeated draws and recording the current system time t₀ when drawing starts;

repeatedly submitting and executing the drawing command, the GPU rendering the prepared vertex data into a window-filling rectangle using the rendering pipeline configured in the steps above;

counting the submitted drawing commands and, when the count reaches the maximum, waiting for the GPU to finish executing all submitted drawing commands, then recording the current system time t₁.

In the invention, the maximum number of repeated draws n is set empirically. In theory, the smaller n is, the less accurate the measured pixel fill rate; as n increases, the measured pixel fill rate gradually converges toward the true pixel fill rate.
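One way to see why a larger n helps is a toy model in which every measurement carries a fixed overhead (submission latency, synchronization, timer cost) on top of the pure drawing time. The model and all numbers below are assumptions for illustration only; the source states the convergence behavior but does not give a model for it.

```python
def measured_fill_rate(n, pixels_per_draw, true_rate, fixed_overhead_s):
    """Fill rate a timer would report under a fixed-overhead toy model."""
    draw_time = n * pixels_per_draw / true_rate  # time spent actually filling pixels
    return n * pixels_per_draw / (draw_time + fixed_overhead_s)


PIXELS = 1920 * 1080   # hypothetical window size
TRUE_RATE = 10e9       # hypothetical true fill rate, pixels per second
OVERHEAD = 1e-3        # hypothetical fixed overhead per measurement, seconds

for n in (1, 10, 100, 1000, 100000):
    rate = measured_fill_rate(n, PIXELS, TRUE_RATE, OVERHEAD)
    print(n, f"{rate / 1e9:.3f} GPixel/s")
```

In this model the reported value rises with n and approaches the true rate from below, because the fixed overhead is amortized over more drawn pixels.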

After the GPU receives a drawing command, the vertex shader assembles the 6 vertices input each time into 2 triangle primitives, and the two triangles form a rectangle. The primitives are then rasterized, and the fragment shader shades the resulting fragments. The vertex shader and fragment shader only perform rendering and no presentation to the display is executed: the invention only needs to draw and does not need to show the result on screen, which prevents the display operation from introducing additional overhead.

During drawing, the GPU reads the vertex coordinate data from the bound video memory. According to the drawing instruction, the vertex shader takes 6 vertices at a time and assembles them into 2 triangles, rasterization then determines the pixels to be shaded, and finally the fragment shader shades those pixels. This drawing and rendering process is shown in FIG. 3.

After the drawing process has been executed, the GPU pixel fill rate is calculated from the recorded times t₀ and t₁, the number of draws n, and the size of each drawn rectangle: P = W × H × n / (t₁ - t₀), where W × H is the size of the drawn rectangle, which fills the entire drawing window.
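The formula P = W × H × n / (t₁ - t₀) translates directly into code. The sketch below is illustrative: the function name and the guard against a non-positive interval are additions not found in the source, and the example numbers are hypothetical.

```python
def pixel_fill_rate(width, height, n, t0, t1):
    """Measured fill rate P = W * H * n / (t1 - t0), in pixels per second."""
    elapsed = t1 - t0
    if elapsed <= 0:
        # Added safeguard: a zero or negative interval cannot yield a rate.
        raise ValueError("t1 must be later than t0")
    return width * height * n / elapsed


# Hypothetical run: a 1920 x 1080 rectangle drawn 10000 times in 2.0 seconds.
p = pixel_fill_rate(1920, 1080, 10000, t0=0.0, t1=2.0)
print(f"{p / 1e9:.2f} GPixel/s")  # prints "10.37 GPixel/s"
```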

Referring to FIG. 4, the overall flow of the GPU pixel fill rate measurement provided by the invention mainly includes the following steps:

Step S1: creating a drawing window and setting the fixed-function pipeline rendering state, including disabling unnecessary tests and setting the viewport, so as to exclude unnecessary rendering steps;

Step S2: preparing the shader source code, compiling it, and linking the shader program, wherein the shader source code determines the programmable pipeline rendering flow;

Step S3: uploading all prepared vertex coordinate data to the vertex buffer in video memory;

Step S4: binding the vertex coordinate data in the vertex buffer to the corresponding built-in variable in the vertex shader so as to provide the vertex coordinate data for GPU rendering;

Step S5: setting the maximum number n of repeated draws;

Step S6: recording the current system time t₀;

Step S7: repeatedly submitting and executing the drawing command, the GPU rendering the prepared vertex data into a window-filling rectangle using the rendering pipeline configured in the steps above;

Step S8: counting the submitted drawing commands and, when the count reaches the maximum, waiting for the GPU to finish executing all submitted drawing commands;

Step S9: recording the current system time t₁;

Step S10: calculating the approximate pixel fill rate of the GPU from the recorded times, the number of draws, and the size of each drawn rectangle.
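Steps S5 to S10 can be sketched as a timing loop that is independent of any particular graphics API. In this sketch the callables `submit_draw` and `wait_for_gpu` are hypothetical placeholders supplied by the caller: in an OpenGL program they might wrap a draw call and a blocking call such as glFinish, but that mapping is an assumption and the window, shader, and buffer setup of steps S1 to S4 is not shown.

```python
import time


def measure_fill_rate(width, height, n, submit_draw, wait_for_gpu,
                      clock=time.perf_counter):
    """Time n draws of a width x height rectangle and return pixels per second.

    submit_draw  -- caller-supplied callable that submits one draw command
    wait_for_gpu -- caller-supplied callable that blocks until the GPU has
                    finished every submitted command
    """
    t0 = clock()                 # S6: record the start time
    for _ in range(n):           # S7: submit the draw command n times
        submit_draw()
    wait_for_gpu()               # S8: wait until all submitted work is done
    t1 = clock()                 # S9: record the end time
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return width * height * n / elapsed   # S10: P = W * H * n / (t1 - t0)
```

Taking the second timestamp only after `wait_for_gpu` returns matters because draw submission is typically asynchronous: without the wait, the interval would cover command submission and not the rendering itself.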

With the GPU pixel fill rate measurement method provided by the embodiments of the invention, a single program suitable for measuring the pixel fill rate of different GPUs is written against a general-purpose high-performance drawing interface. The method is broadly applicable while keeping the measurement accurate, and it addresses the lack of pixel fill rate measurement tools. Its main aspects are as follows:

(1) A method for testing and calculating the pixel fill rate is provided. It is implemented with general-purpose high-performance graphics interfaces such as OpenGL and Vulkan, so the same program can calculate the pixel fill rate of different GPUs;

(2) A window is created and the rendering pipeline state is set explicitly, which reduces redundant rendering work performed implicitly by the driver and keeps drawing efficient;

(3) The most minimal shader source code is provided, ensuring that the shaders perform no redundant operations, and the shader program is prepared before drawing so that shader compilation does not interfere with the drawing process;

(4) The most minimal vertex coordinate data is provided and stored in video memory before drawing, ensuring that the GPU reads the vertex data efficiently;

(5) The vertex data is bound to the shader's built-in variable, and during drawing the GPU only renders and does not display, eliminating the extra overhead of the display operation;

(6) Before the second timestamp is recorded, the method checks whether the GPU has finished executing all submitted instructions, which prevents the time from being recorded before the GPU has completed all drawing tasks and the result from being inaccurate.

Each of the foregoing embodiments is described with its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. A method for measuring a GPU pixel fill rate, comprising:

the programming of the pixel filling rate measurement program in the general graphic interface of the GPU comprises the following steps:

writing a shader source code, realizing compiling of the shader source code and linking of a shading program, and determining a programmable pipeline rendering flow;

further comprises: a vertex buffer area is allocated in the video memory and used for storing vertex coordinate data to be drawn;

the GPU acquires resource data to be drawn from a video memory, and a general graphics interface calls the pixel filling rate measuring program to draw and render the resource data, and the method comprises the following steps:

the GPU extracts vertex coordinate data to be drawn from the video memory, and renders according to the extracted vertex coordinate data:

calculating the GPU pixel fill rate from the recorded times t₀ and t₁, the number of draws n, and the size of each drawn rectangle;

the measured pixel fill rate being P = W × H × n / (t₁ - t₀), where W × H is the size of the drawn rectangle, which fills the entire drawing window.

2. The GPU pixel fill rate measurement method of claim 1, wherein creating a drawing window in a general-purpose graphics interface and setting a fixed-function pipeline rendering state comprises:

3. The GPU pixel fill rate measurement method of claim 1, wherein writing the shader source code, compiling the shader source code, and linking the shader program to determine a programmable pipeline rendering flow comprises:

4. The method for measuring a pixel fill rate of a GPU according to claim 3,

5. The method for measuring the pixel filling rate of a GPU according to claim 4, wherein when the general purpose graphics interface receives the drawing command, the GPU obtains resource data to be drawn from the video memory, and the general purpose graphics interface invokes the pixel filling rate measuring program to draw and render the resource data, further comprising: