CN111028132B

CN111028132B - SystemC-based GPU command processor unit hardware TLM microstructure

Info

Publication number: CN111028132B
Application number: CN201911147475.2A
Authority: CN
Inventors: 张少锋; 姜丽云; 蔡叶芳; 吴晓成; 陈佳; 楼晓强
Original assignee: Xian Aeronautics Computing Technique Research Institute of AVIC
Current assignee: Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date: 2019-11-21
Filing date: 2019-11-21
Publication date: 2023-06-13
Anticipated expiration: 2039-11-21
Also published as: CN111028132A

Abstract

The invention relates to the technical field of computer hardware modeling, and in particular to a SystemC-based hardware TLM microstructure design for a GPU command processor unit. The SystemC-based GPU command processor unit hardware TLM microstructure comprises a Graph Cmd Fifo Model, a Read Graph Command Model, an OpenGL Command Process Model and a Command DMA Process Model. The invention realizes the hardware TLM microstructure design of the GPU command processor unit in SystemC, provides a model against which the RTL simulation results of the GPU command processor unit can be compared, supports functional verification of the GPU command processor unit, and speeds up simulation.

Description

SystemC-based GPU command processor unit hardware TLM microstructure

Technical Field

The invention relates to the technical field of computer hardware modeling, and in particular to a SystemC-based hardware TLM microstructure of a GPU command processor unit.

Background

As graphics applications have multiplied, early solutions that rendered graphics on the CPU alone could no longer meet the growing performance and technical demands of graphics processing, and the graphics processor (Graphics Processing Unit, GPU) emerged. Since Nvidia released the first GPU product in 1999, GPU technology has passed through three main stages: the fixed-function pipeline, the separate shader architecture, and the unified shader architecture. Graphics processing capability has improved continuously, and the application field has gradually expanded from graphics drawing to general-purpose computing. With its high speed, parallelism and flexible programmability, the GPU pipeline provides a good platform for graphics processing and general parallel computing.

GPU chip development involves a very large hardware logic scale and high complexity, so the design needs to be described at a higher level of abstraction to allow faster simulation, software/hardware co-simulation, and exploration of the system architecture. When the design is expressed as a system-level model, it is easy to try different algorithms, and experiments with different structures can be completed quickly. When the design is expressed as a register-transfer-level or gate-level model, it is typically very large, and trying a different design structure or making changes is time-consuming and laborious, if not impractical.

A key factor driving the development and standardization of SystemC as a language is that it supports system-level design: it can describe both the hardware architecture and the software algorithms, and it supports verification and exchange of IP. At the system level, partitioning and trading off software against hardware is much easier in SystemC than in other languages, and simulation is much faster than when multiple languages are used. Designing and describing the unit with a SystemC-based microstructure therefore makes it possible to build a fully standard simulation environment and to model directly at a high level of abstraction.

Disclosure of Invention

In view of the problems described in the background, the invention provides a SystemC-based hardware TLM microstructure of a GPU command processor unit. It allows the data of the GPU command processor unit in RTL simulation to be compared accurately against a model, and it allows the hardware microstructure of the GPU command processor unit to be functionally verified on a TLM model ahead of the RTL.

The technical scheme of the invention is as follows:

The invention provides a SystemC-based GPU command processor unit hardware TLM microstructure, which comprises a Graph Cmd Fifo Model, a Read Graph Command Model, an OpenGL Command Process Model and a Command DMA Process Model.

The Graph Cmd Fifo Model, Read Graph Command Model, OpenGL Command Process Model and Command DMA Process Model are connected in sequence through transaction-level interfaces.

Preferably, the Graph Cmd Fifo Model is configured to receive OpenGL commands;

the Read Graph Command Model is configured to continuously read data from the Graph Cmd Fifo, and to trigger the OpenGL Command Process Model when a valid CMD command is received;

the OpenGL Command Process Model is configured to execute different operations according to the header of the CMD command packet, and to trigger the Command DMA Process Model when the header identifies a load-external-shader-program command, a NewList command, or a CallList command;

the Command DMA Process Model is configured to move shader program data or display list data to the DDR.
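The division of work among the four models can be sketched as follows. This is a minimal illustration in plain standard C++, not the patented SystemC model: it deliberately avoids the SystemC headers so that it stays self-contained, and direct member-function calls stand in for the transaction-level interfaces. The `CmdKind` values, the `CmdPacket` layout and all class and member names are assumptions made for the example.

```cpp
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical command kinds; the patent does not define an encoding.
enum class CmdKind { Invalid, SetState, Draw, LoadShader, NewList, CallList };

struct CmdPacket {
    CmdKind kind;
    std::vector<std::uint32_t> payload;
};

// Command DMA Process Model: here it only records which commands reached it.
struct CommandDmaProcess {
    std::vector<CmdKind> moved;
    void transfer(const CmdPacket& p) { moved.push_back(p.kind); }
};

// OpenGL Command Process Model: executes every valid command and triggers
// the DMA stage only for shader-load, NewList and CallList commands.
struct OpenGLCommandProcess {
    CommandDmaProcess& dma;
    int executed;
    explicit OpenGLCommandProcess(CommandDmaProcess& d) : dma(d), executed(0) {}
    void execute(const CmdPacket& p) {
        ++executed;
        if (p.kind == CmdKind::LoadShader || p.kind == CmdKind::NewList ||
            p.kind == CmdKind::CallList) {
            dma.transfer(p);
        }
    }
};

// Graph Cmd Fifo Model: receives OpenGL command packets.
struct GraphCmdFifo {
    std::deque<CmdPacket> q;
    void write(CmdPacket p) { q.push_back(std::move(p)); }
    bool empty() const { return q.empty(); }
    CmdPacket read() {
        CmdPacket p = std::move(q.front());
        q.pop_front();
        return p;
    }
};

// Read Graph Command Model: drains the FIFO and forwards valid commands.
struct ReadGraphCommand {
    GraphCmdFifo& fifo;
    OpenGLCommandProcess& proc;
    int dropped;
    ReadGraphCommand(GraphCmdFifo& f, OpenGLCommandProcess& p)
        : fifo(f), proc(p), dropped(0) {}
    void drain() {
        while (!fifo.empty()) {
            CmdPacket p = fifo.read();
            if (p.kind == CmdKind::Invalid) { ++dropped; continue; }
            proc.execute(p);
        }
    }
};
```

Writing a state command, an invalid command and a NewList command into `GraphCmdFifo` and then calling `drain()` executes two commands, drops one, and passes only the NewList command to the DMA stage. In an actual SystemC model, each struct would typically be a module and each direct call a transaction-level port or socket access.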

Preferably, the Graph Cmd Fifo Model, Read Graph Command Model, OpenGL Command Process Model and Command DMA Process Model are modeled at the transaction level (TLM) in SystemC.

Preferably, the transaction-level execution flow of the SystemC-based GPU command processor unit hardware TLM microstructure comprises acquiring the OpenGL command and parsing the OpenGL command.

Preferably, the acquiring of the OpenGL command includes the following steps:

1) The Graph Cmd Fifo Model continuously receives OpenGL CMD command packets issued by the PCIe unit of the GPU driver layer, updates the Graph Cmd Fifo state, and sends the state to the PCIe unit and the Read Graph Command Model;

2) the Read Graph Command Model reads and checks the Graph Cmd Fifo state; if the Graph Cmd Fifo is empty, it continues to read the state, and if the Graph Cmd Fifo is not empty, it goes to step 2.1);

2.1) the Read Graph Command Model reads the CMD command packet header from the Graph Cmd Fifo, stores the packet header together with the command packet payload in a command packet linked list according to the packet length given in the header, and then checks the validity of the header; if the command packet is legal, the process goes to step 3), and if it is illegal, the process goes to step 2.2);

2.2) the current illegal command packet is discarded from the command packet linked list, and the process returns to step 2).
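Steps 2), 2.1) and 2.2) above can be sketched in plain standard C++ as follows. The header layout (opcode in the upper 16 bits, payload length in words in the lower 16 bits) and the validity rule are assumptions chosen only for illustration; the patent does not specify either. The names `fetch_packets`, `opcode_of`, `length_of` and `header_is_legal` are likewise hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <list>
#include <vector>

struct CmdPacket {
    std::uint32_t header;
    std::vector<std::uint32_t> payload;
};

// Assumed header layout: bits 31..16 = opcode, bits 15..0 = payload length in words.
inline std::uint32_t opcode_of(std::uint32_t h) { return h >> 16; }
inline std::uint32_t length_of(std::uint32_t h) { return h & 0xFFFFu; }

// Assumed validity rule: opcodes 1..kMaxOpcode are legal.
const std::uint32_t kMaxOpcode = 0x00FFu;
inline bool header_is_legal(std::uint32_t h) {
    const std::uint32_t op = opcode_of(h);
    return op >= 1 && op <= kMaxOpcode;
}

// Reads packets until the FIFO is empty and returns the number of illegal
// packets discarded. A real model would keep polling an empty FIFO instead
// of returning.
std::size_t fetch_packets(std::deque<std::uint32_t>& fifo,
                          std::list<CmdPacket>& packets) {
    std::size_t discarded = 0;
    while (!fifo.empty()) {                        // step 2): check FIFO state
        CmdPacket pkt;
        pkt.header = fifo.front();                 // step 2.1): read the header
        fifo.pop_front();
        const std::uint32_t len = length_of(pkt.header);
        for (std::uint32_t i = 0; i < len && !fifo.empty(); ++i) {
            pkt.payload.push_back(fifo.front());   // read the payload words
            fifo.pop_front();
        }
        packets.push_back(pkt);                    // store header and payload together
        if (!header_is_legal(packets.back().header)) {
            packets.pop_back();                    // step 2.2): discard illegal packet
            ++discarded;
        }
    }
    return discarded;
}
```

As in the text, the packet is first stored in the linked list and only then validated, so an illegal packet is removed from the list rather than skipped in the FIFO. Legal packets remain in `packets` for the parsing stage.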

Preferably, the parsing of the OpenGL command includes the following steps:

A) The OpenGL Command Process Model reads a command packet from the command packet linked list and performs the defined operation by parsing the command packet header. It checks the header and executes A.1) if the packet is a parameter configuration command, A.2) if it is a graphics function command, and A.3) if it is a graphics drawing command;

A.1) according to the functional definition of the GPU system, the corresponding state parameters are read or written through the state parameter management interface, and the process returns to step 2) after the operation completes;

A.2) according to the functional definition of the GPU system, the corresponding state parameters are read or written through the state parameter management interface, and the corresponding function code is issued through the graphics function management interface. If the function code involves DMA data transfer between main memory and other modules, the DMA descriptor is configured through the PCIe DMA interface to start the corresponding DMA operation, and the process returns to step 2) after the operation completes; if the function code is a load-external-shader-program, NewList or CallList function code, the DMA path of the DMA descriptor in the command processor is configured and the process goes to step B);

A.3) according to the functional definition of the GPU system, the corresponding state parameters are read or written through the state parameter management interface, the corresponding graphics drawing command is issued through the graphics drawing management interface, and the process returns to step 2) after the operation completes;

B) The DMA path of the DMA descriptor is checked: the process goes to B.1) if the path is from the external ROM to the DDR, to B.2) if the path is from the NewListFifo to the DDR, and to B.3) if the path is from the DDR to the NewListFifo;

B.1) the shader program data is first moved to the local storage area of the command processor and then written from the local storage to the designated location in the DDR, and the process returns to step 2) after the operation completes;

B.2) the display list data is written from the NewListFifo to the designated display list location in the DDR, and the process returns to step 2) after the operation completes;

B.3) the display list data is written from the designated display list location in the DDR into the NewListFifo, and the process returns to step 2) after the operation completes.
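The header-driven dispatch of steps A) through A.3) can be sketched as a small decision function in plain standard C++. The enumerations, the `ParsedHeader` structure and the `InterfaceTrace` counters are hypothetical stand-ins for the state parameter, graphics function, graphics drawing and PCIe DMA interfaces; the counters only record which interface would have been called. The handling of function codes that involve no DMA at all is an assumption, since the text does not describe that case.

```cpp
enum class CmdClass { ParamConfig, GraphicsFunction, GraphicsDraw };
enum class FuncCode { None, HostDma, LoadShader, NewList, CallList, Other };
enum class Next { BackToFetch, CommandDma };

// What the parser extracted from a packet header (assumed representation).
struct ParsedHeader {
    CmdClass cls;
    FuncCode func;
};

// Counts the calls that would go to each management interface.
struct InterfaceTrace {
    int state_param_accesses;
    int function_codes_issued;
    int draw_commands_issued;
    int pcie_dma_started;
    InterfaceTrace()
        : state_param_accesses(0), function_codes_issued(0),
          draw_commands_issued(0), pcie_dma_started(0) {}
};

// Returns where control goes next: back to step 2) or on to step B).
Next process_command(const ParsedHeader& h, InterfaceTrace& t) {
    ++t.state_param_accesses;              // every branch touches state parameters
    switch (h.cls) {
    case CmdClass::ParamConfig:            // A.1)
        return Next::BackToFetch;
    case CmdClass::GraphicsFunction:       // A.2)
        ++t.function_codes_issued;
        if (h.func == FuncCode::HostDma) {
            ++t.pcie_dma_started;          // descriptor configured via PCIe DMA interface
            return Next::BackToFetch;
        }
        if (h.func == FuncCode::LoadShader || h.func == FuncCode::NewList ||
            h.func == FuncCode::CallList) {
            return Next::CommandDma;       // continue with step B)
        }
        return Next::BackToFetch;          // assumption: other codes need no DMA
    case CmdClass::GraphicsDraw:           // A.3)
        ++t.draw_commands_issued;
        return Next::BackToFetch;
    }
    return Next::BackToFetch;              // not reached for valid CmdClass values
}
```

Only the three function codes named in A.2) return `Next::CommandDma`; every other command returns to the fetch loop, which matches the "returns to step 2)" wording in each branch.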

Preferably, control information and graphics data are transferred in sequence through the Graph Cmd Fifo Model, the Read Graph Command Model, the OpenGL Command Process Model and the Command DMA Process Model via transaction-level interfaces;

and the GPU command processor unit of the SystemC-based hardware TLM microstructure is connected, through transaction-level interfaces, to the GPU PCIe bus unit, the GPU state parameter and graphics function management unit, the GPU unified shader unit, the GPU SPMU unit, the GPU external ROM unit and the GPU DDR unit, respectively.

The beneficial technical effects of the invention are as follows:

1. The invention provides a SystemC-based hardware TLM microstructure design of a GPU command processor unit, which comprises a Graph Cmd Fifo Model, a Read Graph Command Model, an OpenGL Command Process Model and a Command DMA Process Model.

2. The invention realizes the hardware TLM microstructure design of the GPU command processor unit in SystemC, provides a model against which the RTL simulation results of the GPU command processor unit can be compared, supports functional verification of the GPU command processor unit, and speeds up simulation.

Drawings

FIG. 1 is a block diagram of the SystemC-based GPU command processor unit hardware TLM microstructure, wherein: 1. Graph Cmd Fifo Model; 2. Read Graph Command Model; 3. OpenGL Command Process Model; 4. Command DMA Process Model.

Detailed Description

The present invention will now be described in detail with reference to the accompanying drawings.

One embodiment of the present invention proposes a SystemC-based GPU command processor unit hardware TLM microstructure which, as shown in FIG. 1, comprises a Graph Cmd Fifo Model 1, a Read Graph Command Model 2, an OpenGL Command Process Model 3 and a Command DMA Process Model 4.

The Graph Cmd Fifo Model 1, Read Graph Command Model 2, OpenGL Command Process Model 3 and Command DMA Process Model 4 are connected in sequence through transaction-level interfaces.

In one embodiment, the Graph Cmd Fifo Model 1 is configured to receive OpenGL commands;

the Read Graph Command Model 2 is configured to continuously read data from the Graph Cmd Fifo, and to trigger the OpenGL Command Process Model 3 when a valid CMD command is received;

the OpenGL Command Process Model 3 is configured to execute different operations according to the header of the CMD command packet, and to trigger the Command DMA Process Model 4 when the header identifies a load-external-shader-program command, a NewList command or a CallList command;

the Command DMA Process Model 4 is configured to move shader program data or display list data to the DDR.

In one embodiment, the Graph Cmd Fifo Model 1, Read Graph Command Model 2, OpenGL Command Process Model 3 and Command DMA Process Model 4 are modeled at the transaction level (TLM) in SystemC.

In one embodiment, the transaction-level execution flow of the SystemC-based GPU command processor unit hardware TLM microstructure comprises the acquisition of OpenGL commands and the parsing of OpenGL commands.

In one embodiment, the acquisition of OpenGL commands includes the following steps:

1) The Graph Cmd Fifo Model 1 continuously receives OpenGL CMD command packets issued by the PCIe unit of the GPU driver layer, updates the Graph Cmd Fifo state, and sends the state to the PCIe unit and the Read Graph Command Model 2;

2) the Read Graph Command Model 2 reads and checks the Graph Cmd Fifo state; if the Graph Cmd Fifo is empty, it continues to read the state, and if the Graph Cmd Fifo is not empty, it goes to step 2.1);

2.1) the Read Graph Command Model 2 reads the CMD command packet header from the Graph Cmd Fifo, stores the packet header together with the command packet payload in a command packet linked list according to the packet length given in the header, and then checks the validity of the header; if the command packet is legal, the process goes to step 3), and if it is illegal, the process goes to step 2.2);

In one embodiment, the parsing of OpenGL commands includes the following steps:

A) The OpenGL Command Process Model 3 reads a command packet from the command packet linked list and performs the defined operation by parsing the command packet header. It checks the header and executes A.1) if the packet is a parameter configuration command, A.2) if it is a graphics function command, and A.3) if it is a graphics drawing command;

A.1) according to the functional definition of the GPU system, the corresponding state parameters are read or written through the state parameter management interface, and the process returns to step 2) after the operation completes;

A.2) according to the functional definition of the GPU system, the corresponding state parameters are read or written through the state parameter management interface, and the corresponding function code is issued through the graphics function management interface. If the function code involves DMA data transfer between main memory and other modules, the DMA descriptor is configured through the PCIe DMA interface to start the corresponding DMA operation, and the process returns to step 2) after the operation completes; if the function code is a load-external-shader-program, NewList or CallList function code, the DMA path of the DMA descriptor in the command processor is configured and the process goes to step B);

B.1) the shader program data is first moved to the local storage area of the command processor and then written from the local storage to the designated location in the DDR, and the process returns to step 2) after the operation completes;

B.2) the display list data is written from the NewListFifo to the designated display list location in the DDR, and the process returns to step 2) after the operation completes;

B.3) the display list data is written from the designated display list location in the DDR into the NewListFifo, and the process returns to step 2) after the operation completes.
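The three DMA paths B.1) to B.3) can be sketched in plain standard C++ as follows. The `DmaDescriptor` fields, the `Memories` structure and the word-addressed vectors that stand in for the external ROM, the command processor local storage and the DDR are all assumptions made for this example; the patent does not define the descriptor format. The range checks are also an addition of the sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

enum class DmaPath { RomToDdr, NewListFifoToDdr, DdrToNewListFifo };

// Hypothetical descriptor: offsets and length are counted in 32-bit words.
struct DmaDescriptor {
    DmaPath path;
    std::size_t src_offset;  // offset in ROM or DDR; unused when the source is the FIFO
    std::size_t dst_offset;  // offset in DDR; unused when the destination is the FIFO
    std::size_t length;
};

struct Memories {
    std::vector<std::uint32_t> rom;            // external ROM (shader programs)
    std::vector<std::uint32_t> local;          // command processor local storage
    std::vector<std::uint32_t> ddr;            // DDR
    std::deque<std::uint32_t> new_list_fifo;   // NewListFifo
};

// Executes one transfer; returns false if the descriptor is out of range.
bool run_dma(const DmaDescriptor& d, Memories& m) {
    switch (d.path) {
    case DmaPath::RomToDdr:                    // B.1): ROM -> local storage -> DDR
        if (d.src_offset + d.length > m.rom.size() ||
            d.dst_offset + d.length > m.ddr.size()) return false;
        m.local.assign(m.rom.begin() + d.src_offset,
                       m.rom.begin() + d.src_offset + d.length);
        std::copy(m.local.begin(), m.local.end(), m.ddr.begin() + d.dst_offset);
        return true;
    case DmaPath::NewListFifoToDdr:            // B.2): NewListFifo -> DDR
        if (d.length > m.new_list_fifo.size() ||
            d.dst_offset + d.length > m.ddr.size()) return false;
        for (std::size_t i = 0; i < d.length; ++i) {
            m.ddr[d.dst_offset + i] = m.new_list_fifo.front();
            m.new_list_fifo.pop_front();
        }
        return true;
    case DmaPath::DdrToNewListFifo:            // B.3): DDR -> NewListFifo
        if (d.src_offset + d.length > m.ddr.size()) return false;
        for (std::size_t i = 0; i < d.length; ++i) {
            m.new_list_fifo.push_back(m.ddr[d.src_offset + i]);
        }
        return true;
    }
    return false;
}
```

The ROM path is the only one that stages data in local storage first, as B.1) describes; the two display list paths move words directly between the NewListFifo and the DDR.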

In one embodiment, control information and graphics data are transferred in sequence through the Graph Cmd Fifo Model 1, the Read Graph Command Model 2, the OpenGL Command Process Model 3 and the Command DMA Process Model 4 via transaction-level interfaces;

the GPU command processor unit of the SystemC-based hardware TLM microstructure is connected, through transaction-level interfaces, to the GPU PCIe bus unit, the GPU state parameter and graphics function management unit, the GPU unified shader unit, the GPU SPMU unit, the GPU external ROM unit and the GPU DDR unit, respectively.

Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can be modified, or some of their technical features can be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A SystemC-based GPU command processor unit hardware TLM microstructure, characterized in that:

it comprises a Graph Cmd Fifo Model (1), a Read Graph Command Model (2), an OpenGL Command Process Model (3) and a Command DMA Process Model (4);

the Graph Cmd Fifo Model (1), Read Graph Command Model (2), OpenGL Command Process Model (3) and Command DMA Process Model (4) are connected in sequence through transaction-level interfaces;

the Graph Cmd Fifo Model (1) is configured to receive OpenGL commands;

the Read Graph Command Model (2) is configured to continuously read data from the Graph Cmd Fifo, and to trigger the OpenGL Command Process Model (3) when a valid CMD command is received;

the OpenGL Command Process Model (3) is configured to execute different operations according to the header of the CMD command packet, and to trigger the Command DMA Process Model (4) when the header identifies a load-external-shader-program command, a NewList command or a CallList command;

the Command DMA Process Model (4) is configured to move shader program data or display list data to the DDR;

the transaction-level execution flow of the SystemC-based GPU command processor unit hardware TLM microstructure comprises the acquisition of OpenGL commands and the parsing of OpenGL commands.

2. The SystemC-based GPU command processor unit hardware TLM microstructure of claim 1, wherein the acquisition of OpenGL commands comprises the following steps:

1) the Graph Cmd Fifo Model (1) continuously receives OpenGL CMD command packets issued by the PCIe unit of the GPU driver layer, updates the Graph Cmd Fifo state, and sends the state to the PCIe unit and the Read Graph Command Model (2);

2) the Read Graph Command Model (2) reads and checks the Graph Cmd Fifo state; if the Graph Cmd Fifo is empty, it continues to read the state, and if the Graph Cmd Fifo is not empty, it goes to step 2.1);

2.1) the Read Graph Command Model (2) reads the CMD command packet header from the Graph Cmd Fifo, stores the packet header together with the command packet payload in a command packet linked list according to the packet length given in the header, and then checks the validity of the header; if the command packet is legal, the process goes to step 3), and if it is illegal, the process goes to step 2.2);

2.2) the current illegal command packet is discarded from the command packet linked list, and the process returns to step 2).

3. The SystemC-based GPU command processor unit hardware TLM microstructure of claim 2, wherein the parsing of OpenGL commands comprises the following steps:

A) the OpenGL Command Process Model (3) reads a command packet from the command packet linked list and performs the defined operation by parsing the command packet header; it checks the header and executes A.1) if the packet is a parameter configuration command, A.2) if it is a graphics function command, and A.3) if it is a graphics drawing command;

A.2) according to the functional definition of the GPU system, the corresponding state parameters are read or written through the state parameter management interface, and the corresponding function code is issued through the graphics function management interface; if the function code involves DMA data transfer between main memory and other modules, the DMA descriptor is configured through the PCIe DMA interface to start the corresponding DMA operation, and the process returns to step 2) after the operation completes; if the function code is a load-external-shader-program, NewList or CallList function code, the DMA path of the DMA descriptor in the command processor is configured and the process goes to step B);

4. The SystemC-based GPU command processor unit hardware TLM microstructure according to claim 1, wherein

control information and graphics data are transferred in sequence through the Graph Cmd Fifo Model (1), the Read Graph Command Model (2), the OpenGL Command Process Model (3) and the Command DMA Process Model (4) via transaction-level interfaces;