CN112559139A

CN112559139A - SystemC-based multi-GPU transaction-level model device and operation method

Info

Publication number: CN112559139A
Application number: CN202011393054.0A
Authority: CN
Inventors: 田泽; 张少锋; 吴晓成; 陈佳; 王泉; 姜丽云
Original assignee: Xian Xiangteng Microelectronics Technology Co Ltd
Current assignee: Xian Xiangteng Microelectronics Technology Co Ltd
Priority date: 2020-12-05
Filing date: 2020-12-05
Publication date: 2021-03-26
Anticipated expiration: 2040-12-05
Also published as: CN112559139B

Abstract

The invention relates to a SystemC-based multi-GPU transaction-level model device and an operation method. The device comprises an OpenGL driving model, a host model, a multi-GPU TLM model, a video memory model, a display model, and transaction-level interfaces connecting the model components, wherein the OpenGL driving model, the host model, the multi-GPU TLM model, the video memory model and the display model are connected in sequence. The operation method comprises the following steps: 1) the OpenGL driving model receives an OpenGL command issued by a graphics application; 2) the host model implements a transaction-level interface; 3) the video memory model provides writing and reading of graphics drawing data for each GPU model; 4) the display model converts the image data in the video memory model into a displayable picture format. The SystemC-based multi-GPU transaction-level model device and operation method meet the need for early verification of a chip system architecture and its functional units, provide a virtual prototype for developing multi-GPU graphics processing applications, and provide a reference design for RTL design.

Description

SystemC-based multi-GPU transaction-level model device and operation method

Technical Field

The invention belongs to the technical field of graphics processor modeling and relates to a multi-GPU transaction-level model device, in particular to a SystemC-based multi-GPU transaction-level model device and an operation method.

Background

With the continuous development of graphics applications, graphics rendering solutions based on a single GPU have struggled to meet the demand for fast graphics processing, and rendering solutions using multiple GPUs in parallel have emerged. In addition, with the vigorous development of artificial intelligence, GPUs have been widely applied to AI parallel computing; since the computing power of a single GPU is limited, multiple GPUs must process data cooperatively to handle the massive training data of the AI field. The ability of multiple GPUs to process data cooperatively is therefore fundamental to the development of AI applications.

Since Nvidia released the first GPU product in 1999, GPU technology has passed through three main stages: the fixed-function pipeline stage, the separate shader architecture stage, and the unified shader architecture stage. Graphics processing capability has improved continuously, and the application field has gradually expanded from graphics drawing to general-purpose computing. With its high speed, parallelism and flexible programmability, the GPU pipeline provides a good platform for graphics processing and general-purpose parallel computing.

At present, GPU development capability in China is weak, and display control systems in various fields rely on large numbers of commercial GPU chips imported from abroad. In the military field in particular, imported commercial GPU chips pose hidden risks in security, reliability and supply assurance, and cannot meet the requirements of the military environment. Moreover, for political, military and economic reasons, foreign suppliers impose technology blockades and product monopolies on China, so low-level technical data on GPU chips, such as register data, detailed internal micro-architecture and core software source code, is difficult to obtain; as a result, the functions and performance of the GPU cannot be fully exploited, and portability is poor. These problems seriously restrict the independent development of display systems in China, so breaking through the key technologies of the graphics processor and developing graphics processor chips is imperative.

The system architecture and graphics algorithms of a GPU chip that supports multi-GPU cooperative work are complex, and the hardware logic is very large, so the design needs to be described at a higher level of abstraction to allow faster simulation, software/hardware co-simulation, and system architecture exploration. When the design is expressed as a system-level model, different algorithms can easily be tried many times and different structures can be tested quickly. If the design is expressed as a register-transfer-level or gate-level model, it is usually quite large, and trying different design structures or making changes is time-consuming and laborious, if not impractical. The key factor driving the development and standardization of SystemC as a language is that it supports system-level design and can describe both hardware architecture and software algorithms, thereby supporting verification and IP exchange. Making system-level software/hardware partitioning trade-offs is much easier in SystemC than in other languages, and simulation is much faster than with a mix of languages. Therefore, the multi-GPU transaction-level model is designed in SystemC, with modeling carried out directly at a high level of abstraction.

Disclosure of Invention

Based on the problems in the background art, the invention provides a multi-GPU transaction-level model device based on SystemC and an operation method thereof. The technical solution of the invention is as follows:

a multi-GPU transaction level model device based on SystemC is characterized in that:

the system comprises an OpenGL driving model, a host model, a multi-GPU TLM model, a video memory model, a display model and a plurality of transaction-level interfaces for connecting each model component, wherein the OpenGL driving model, the host model, the multi-GPU TLM model, the video memory model and the display model are sequentially connected;

the OpenGL driving model is used for receiving an OpenGL command issued by a graphic application, carrying out software preprocessing, packaging the OpenGL command into an OpenGL command packet according to a preset format, and then sending an OpenGL command code to the host model;

the host model comprises a command packet broadcast transaction-level interface, an architecture register read-write transaction-level interface, and a transaction-level interface for acquiring and clearing the host interrupt state. After receiving the OpenGL command, the host model broadcasts it to the PCIe unit of each GPU model in the multi-GPU TLM model, and acquires and clears the host interrupt state of each GPU through the PCIe configuration interface of each GPU model;

the multi-GPU TLM model is formed by a plurality of GPU models in an interconnected mode, and each GPU model comprises a command processing unit, a PCIe unit, a geometric engine unit, a graphic processing subset and a fragment processing unit;

the video memory model provides a writing and reading interface of graphic drawing data for each GPU TLM model;

the display model is used for converting the image data in the RGBA format in the video memory model into a displayable BMP picture format.
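The display model's RGBA-to-BMP conversion can be illustrated with a minimal sketch. The patent does not give the pixel layout or the BMP variant, so the following are assumptions: video memory holds top-down RGBA8888 rows, the output is an uncompressed 24-bit BMP (alpha is dropped), and the names `put_le` and `rgba_to_bmp` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `bytes` bytes of `value` in little-endian order (BMP fields are little-endian).
static void put_le(std::vector<std::uint8_t>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFFu));
    }
}

// Hypothetical helper: convert a top-down RGBA8888 frame (assumed layout)
// into an in-memory, uncompressed 24-bit BMP image.
std::vector<std::uint8_t> rgba_to_bmp(const std::vector<std::uint8_t>& rgba,
                                      std::uint32_t width, std::uint32_t height) {
    assert(rgba.size() == static_cast<std::size_t>(width) * height * 4);
    const std::uint32_t row_bytes = (width * 3 + 3) / 4 * 4;  // rows are padded to 4 bytes
    const std::uint32_t pixel_bytes = row_bytes * height;
    const std::uint32_t header_bytes = 14 + 40;  // file header + info header

    std::vector<std::uint8_t> bmp;
    bmp.reserve(header_bytes + pixel_bytes);

    // BITMAPFILEHEADER (14 bytes)
    bmp.push_back('B');
    bmp.push_back('M');
    put_le(bmp, header_bytes + pixel_bytes, 4);  // total file size
    put_le(bmp, 0, 4);                           // reserved
    put_le(bmp, header_bytes, 4);                // offset of the pixel data

    // BITMAPINFOHEADER (40 bytes)
    put_le(bmp, 40, 4);           // size of this header
    put_le(bmp, width, 4);
    put_le(bmp, height, 4);       // positive height means bottom-up rows
    put_le(bmp, 1, 2);            // colour planes
    put_le(bmp, 24, 2);           // bits per pixel
    put_le(bmp, 0, 4);            // BI_RGB, no compression
    put_le(bmp, pixel_bytes, 4);  // size of the pixel data
    put_le(bmp, 2835, 4);         // horizontal resolution, pixels per metre
    put_le(bmp, 2835, 4);         // vertical resolution, pixels per metre
    put_le(bmp, 0, 4);            // palette colours (none)
    put_le(bmp, 0, 4);            // important colours (all)

    // Pixel rows are stored bottom-up, each pixel in B, G, R order.
    for (std::uint32_t y = height; y-- > 0;) {
        for (std::uint32_t x = 0; x < width; ++x) {
            const std::size_t src = (static_cast<std::size_t>(y) * width + x) * 4;
            bmp.push_back(rgba[src + 2]);  // blue
            bmp.push_back(rgba[src + 1]);  // green
            bmp.push_back(rgba[src + 0]);  // red
        }
        for (std::uint32_t p = width * 3; p < row_bytes; ++p) {
            bmp.push_back(0);  // row padding
        }
    }
    return bmp;
}
```

The returned buffer can be written to a file in binary mode to obtain a viewable picture; a real display model might instead keep the alpha channel by emitting a 32-bit BMP.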

The multi-GPU transaction-level model device based on SystemC is characterized in that:

the GPU model is constructed on the basis of an object-oriented modeling language SystemC;

the GPU TLM model is connected with the host model through an interface provided by a PCIe unit and provides an interrupt state read-write interface method and an architecture register access interface method for the GPU TLM model;

the PCIe unit preprocesses an OpenGL command packet issued by a host model broadcast and then sends the OpenGL command packet to the command processing unit;

and the GPU model command processing unit determines whether to continuously issue the OpenGL command packet or not by judging the OpenGL command function code and the GPU ID.

the multi-GPU TLM model divides OpenGL commands into broadcast full execution and broadcast partial execution;

the command processing unit of each GPU model decides whether to pass the OpenGL command packet on by examining the OpenGL command function code and the GPU ID. If the function code type is broadcast full execution, the command processing units of all GPU models send the command packet to the subsequent processing units; if the function code type is broadcast partial execution, only the command processing unit of the GPU model whose GPU ID is 0 sends the command packet to the subsequent processing unit, while the command processing unit of every GPU model whose GPU ID is not 0 discards the command packet directly without sending it to the subsequent unit.
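The forwarding rule of the command processing unit can be sketched as a small decision function. This is an illustration only: the patent does not say which function codes belong to which type, so the sketch carries the type in a hypothetical `CommandPacket` field, and all names (`ExecMode`, `should_forward`, `forwarding_gpus`) are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The two execution types the multi-GPU TLM model assigns to OpenGL commands.
enum class ExecMode { BroadcastFull, BroadcastPartial };

// Hypothetical packet header; the real packet format is not given in the text.
struct CommandPacket {
    std::uint32_t function_code;  // OpenGL command function code
    ExecMode mode;                // execution type derived from the function code
};

// Returns true if the GPU model with `gpu_id` passes the packet on to its
// subsequent processing units, false if it discards the packet.
bool should_forward(const CommandPacket& pkt, std::uint32_t gpu_id) {
    if (pkt.mode == ExecMode::BroadcastFull) {
        return true;     // every GPU model executes the command
    }
    return gpu_id == 0;  // broadcast partial execution: only GPU 0 executes it
}

// Lists the IDs of the GPU models that would execute a broadcast packet.
std::vector<std::uint32_t> forwarding_gpus(const CommandPacket& pkt,
                                           std::uint32_t gpu_count) {
    assert(gpu_count > 0);
    std::vector<std::uint32_t> ids;
    for (std::uint32_t id = 0; id < gpu_count; ++id) {
        if (should_forward(pkt, id)) {
            ids.push_back(id);
        }
    }
    return ids;
}
```

Because every GPU model receives every packet, the filtering happens locally in each command processing unit, and the host needs no per-GPU addressing.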

The multi-GPU transaction-level model device based on SystemC is characterized in that: and the multiple GPU models perform corresponding graphic drawing and rendering according to the drawing areas indicated by the GPU IDs and write the graphics into the corresponding video memory models.
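As an illustration of per-GPU drawing regions, the sketch below lets each GPU model fill only its own rectangle of a shared frame. How regions are assigned is not specified in the text; splitting the frame into equal horizontal bands by GPU ID is an assumption made for the example, as are the names `DrawRegion`, `region_for_gpu` and `render_region` and the use of one 32-bit word per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rectangle a GPU model draws: starting-point coordinates, width and height.
struct DrawRegion {
    std::uint32_t x, y, width, height;
};

// Assumed policy: split the frame into horizontal bands, one per GPU ID.
// The last GPU also takes the rows left over by the integer division.
DrawRegion region_for_gpu(std::uint32_t gpu_id, std::uint32_t gpu_count,
                          std::uint32_t frame_w, std::uint32_t frame_h) {
    assert(gpu_count > 0 && gpu_id < gpu_count);
    const std::uint32_t band = frame_h / gpu_count;
    const std::uint32_t y0 = gpu_id * band;
    const std::uint32_t h = (gpu_id + 1 == gpu_count) ? frame_h - y0 : band;
    return DrawRegion{0, y0, frame_w, h};
}

// Stand-in for rendering: fill the region with one colour in video memory,
// modelled here as a flat vector holding one 32-bit pixel per element.
void render_region(std::vector<std::uint32_t>& vram, std::uint32_t frame_w,
                   const DrawRegion& r, std::uint32_t rgba) {
    assert(r.x + r.width <= frame_w);
    assert(static_cast<std::size_t>(r.y + r.height) * frame_w <= vram.size());
    for (std::uint32_t y = r.y; y < r.y + r.height; ++y) {
        for (std::uint32_t x = r.x; x < r.x + r.width; ++x) {
            vram[static_cast<std::size_t>(y) * frame_w + x] = rgba;
        }
    }
}
```

With two GPUs and a 4x4 frame, GPU 0 writes rows 0 and 1 and GPU 1 writes rows 2 and 3, so the regions never overlap and no locking between GPU models is needed in this simplified setting.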

The operating method of the multi-GPU transaction-level model device based on SystemC is characterized in that: the method comprises the following operation steps:

1) the OpenGL driving model receives an OpenGL command issued by a graphic application;

2) the host model implements a transaction-level interface;

3) the video memory model provides a writing-in and reading-out interface of graphic drawing data for each GPU model;

4) the display model converts the image data in the video memory model into a displayable picture format.

The operating steps of the SystemC-based multi-GPU transaction-level model device are characterized in that: the step 2) comprises the following steps:

1) assembling the OpenGL graphics command, according to its features, into one of two custom command packet types, CM0 or CM2;

2) calling a command packet broadcast transaction-level interface in the host model to broadcast and send the command packet to each GPU model;

3) calling the architecture register read-write transaction-level interface in the host model to configure the cooperation mode of the multi-GPU TLM model and the starting-point coordinates, width and height of the region that the multi-GPU TLM model is responsible for drawing;

4) calling the transaction-level interface for acquiring and clearing the host interrupt state in the host model to obtain the frame-drawing-complete interrupt of each multi-GPU TLM model, so as to synchronize drawing among the multi-GPU TLM models.
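The host-side steps above can be sketched in plain C++. Everything concrete here is an assumption: the register indices, the interrupt bit `kFrameDoneIrq`, the `GpuStub` stand-in (which reports a finished frame as soon as it receives a packet), and the choice to write the registers before broadcasting the packet.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical register indices and interrupt bit; the text does not define them.
enum Reg : std::size_t { kCoopMode, kRegionX, kRegionY, kRegionW, kRegionH, kRegCount };
constexpr std::uint32_t kFrameDoneIrq = 0x1;

struct Region {
    std::uint32_t x, y, width, height;
};

// Stand-in for one GPU model as seen from the host.
struct GpuStub {
    std::array<std::uint32_t, kRegCount> regs{};
    std::uint32_t irq_status = 0;
    int packets_seen = 0;

    void write_reg(Reg r, std::uint32_t value) { regs[r] = value; }
    // Stub behaviour: each packet is treated as completing a frame at once.
    void receive_packet(const std::vector<std::uint32_t>&) {
        ++packets_seen;
        irq_status |= kFrameDoneIrq;
    }
    std::uint32_t read_irq() const { return irq_status; }
    void clear_irq(std::uint32_t mask) { irq_status &= ~mask; }
};

// Configure every GPU, broadcast one packet, then collect the frame-done
// interrupts. Returns true only if every GPU reported a finished frame.
bool host_draw_frame(std::vector<GpuStub>& gpus,
                     const std::vector<std::uint32_t>& packet,
                     std::uint32_t coop_mode,
                     const std::vector<Region>& regions) {
    assert(regions.size() == gpus.size());
    for (std::size_t i = 0; i < gpus.size(); ++i) {  // architecture register writes
        gpus[i].write_reg(kCoopMode, coop_mode);
        gpus[i].write_reg(kRegionX, regions[i].x);
        gpus[i].write_reg(kRegionY, regions[i].y);
        gpus[i].write_reg(kRegionW, regions[i].width);
        gpus[i].write_reg(kRegionH, regions[i].height);
    }
    for (GpuStub& gpu : gpus) {                      // command packet broadcast
        gpu.receive_packet(packet);
    }
    bool all_done = true;
    for (GpuStub& gpu : gpus) {                      // acquire and clear interrupts
        if (gpu.read_irq() & kFrameDoneIrq) {
            gpu.clear_irq(kFrameDoneIrq);
        } else {
            all_done = false;
        }
    }
    return all_done;
}
```

In a SystemC model the last loop would more likely wait on events or poll over simulated time; the single pass here only shows that the host treats a frame as complete when all GPU models have raised their frame-done interrupt.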

The operating method of the multi-GPU transaction-level model device based on SystemC is characterized in that: the step 3) comprises the following steps:

1) command packet broadcast

Command packets are broadcast to the PCIe unit of each GPU model for processing by calling the DMA channel configuration method provided by the PCIe configuration interface of each GPU model;

2) acquiring and clearing host interrupt status

Acquiring and clearing the host interrupt state of the GPU model by calling an interrupt state read-write method provided by a PCIe configuration interface of each GPU model;

3) architectural register read and write

The cooperation mode of the GPU model and the starting-point coordinates, width and height of the region that the GPU model is responsible for drawing are configured by calling the architecture register access method provided by the PCIe configuration interface of each GPU model.
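The three methods of the PCIe configuration interface can be pictured as one abstract C++ interface. This is a sketch under assumptions: the method names and signatures are hypothetical, host memory is modelled as a vector of 32-bit words, and a DMA channel configuration is reduced to an offset and a word count that the PCIe unit copies into its command FIFO.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical PCIe configuration interface of one GPU model.
class PcieConfigIf {
public:
    virtual ~PcieConfigIf() = default;
    // DMA channel configuration: fetch `words` words starting at `offset`.
    virtual void configure_dma(const std::vector<std::uint32_t>& host_mem,
                               std::size_t offset, std::size_t words) = 0;
    // Interrupt state read and clear.
    virtual std::uint32_t read_interrupt_status() const = 0;
    virtual void clear_interrupt_status(std::uint32_t mask) = 0;
    // Architecture register access.
    virtual void write_register(std::uint32_t addr, std::uint32_t value) = 0;
    virtual std::uint32_t read_register(std::uint32_t addr) const = 0;
};

// Minimal PCIe unit: DMA copies the packet words into a command FIFO.
class PcieUnit : public PcieConfigIf {
public:
    void configure_dma(const std::vector<std::uint32_t>& host_mem,
                       std::size_t offset, std::size_t words) override {
        assert(offset + words <= host_mem.size());
        const auto first = host_mem.begin() + static_cast<std::ptrdiff_t>(offset);
        fifo_.insert(fifo_.end(), first, first + static_cast<std::ptrdiff_t>(words));
    }
    std::uint32_t read_interrupt_status() const override { return irq_; }
    void clear_interrupt_status(std::uint32_t mask) override { irq_ &= ~mask; }
    void write_register(std::uint32_t addr, std::uint32_t value) override {
        regs_[addr] = value;
    }
    std::uint32_t read_register(std::uint32_t addr) const override {
        const auto it = regs_.find(addr);
        return it == regs_.end() ? 0u : it->second;  // unwritten registers read as 0
    }
    void raise_interrupt(std::uint32_t mask) { irq_ |= mask; }  // GPU-side hook
    const std::vector<std::uint32_t>& command_fifo() const { return fifo_; }

private:
    std::vector<std::uint32_t> fifo_;
    std::uint32_t irq_ = 0;
    std::map<std::uint32_t, std::uint32_t> regs_;
};

// Host-side broadcast: the same DMA configuration is applied to every GPU model.
void broadcast_packet(const std::vector<PcieConfigIf*>& gpus,
                      const std::vector<std::uint32_t>& host_mem,
                      std::size_t offset, std::size_t words) {
    for (PcieConfigIf* gpu : gpus) {
        gpu->configure_dma(host_mem, offset, words);
    }
}
```

Keeping the host code against `PcieConfigIf` rather than `PcieUnit` mirrors the idea that the host model only sees the interface each GPU model exposes, not its internals.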

The invention has the technical effects that:

1. The invention provides a SystemC-based multi-GPU transaction-level model device in which the models are connected and communicate by calling transaction-level interfaces.

2. The SystemC-based multi-GPU transaction-level model device solves the problem that a multi-GPU system architecture and a related graphic algorithm are difficult to verify at the early stage of a project, provides a virtual prototype of multi-GPU cooperative work for software development, provides a verified architecture, algorithm and reference design for RTL development, improves the speed of software and hardware cooperative verification, and accelerates the progress of project development.
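Connecting models through transaction-level interface calls, as described in effect 1, can be sketched as follows. In SystemC proper this role is played by classes derived from `sc_interface` and bound through `sc_port`; the sketch uses plain C++ so that it compiles without the SystemC library, and the class names and the one-word-per-pixel memory layout are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Transaction-level interface of the video memory model: callers exchange
// whole pixels through method calls, with no signal-level protocol.
class VideoMemoryIf {
public:
    virtual ~VideoMemoryIf() = default;
    virtual void write(std::size_t addr, std::uint32_t pixel) = 0;
    virtual std::uint32_t read(std::size_t addr) const = 0;
};

// Video memory model implementing the interface.
class VideoMemoryModel : public VideoMemoryIf {
public:
    explicit VideoMemoryModel(std::size_t pixels) : data_(pixels, 0) {}
    void write(std::size_t addr, std::uint32_t pixel) override {
        assert(addr < data_.size());
        data_[addr] = pixel;
    }
    std::uint32_t read(std::size_t addr) const override {
        assert(addr < data_.size());
        return data_[addr];
    }

private:
    std::vector<std::uint32_t> data_;
};

// GPU model: holds a "port" (here a pointer) that is bound to the interface.
class GpuModel {
public:
    void bind(VideoMemoryIf& mem) { mem_ = &mem; }
    void draw_pixel(std::size_t addr, std::uint32_t rgba) {
        assert(mem_ != nullptr);
        mem_->write(addr, rgba);  // one interface call is one transaction
    }

private:
    VideoMemoryIf* mem_ = nullptr;
};

// Display model: reads back through the same interface.
class DisplayModel {
public:
    void bind(const VideoMemoryIf& mem) { mem_ = &mem; }
    std::uint32_t scan_out(std::size_t addr) const {
        assert(mem_ != nullptr);
        return mem_->read(addr);
    }

private:
    const VideoMemoryIf* mem_ = nullptr;
};
```

Because the GPU and display models depend only on `VideoMemoryIf`, the video memory implementation can be swapped, for example for one that adds timing annotations, without touching its callers.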

Drawings

FIG. 1 is a block diagram of a multi-GPU transaction level model device architecture based on SystemC.

Detailed Description

The technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings and the specific embodiments. It is obvious that the described embodiments are only a part of the embodiments of the present invention, rather than the whole embodiments, and that all other embodiments, which can be derived by a person skilled in the art without inventive step based on the embodiments of the present invention, belong to the scope of protection of the present invention.

Referring to fig. 1, a SystemC-based multi-GPU transaction-level model device includes an OpenGL driving model, a host model, a multi-GPU TLM model, a video memory model, a display model, and a plurality of transaction-level interfaces for connecting each model component, which are connected in sequence;

the host model comprises a command packet broadcast transaction-level interface, an architecture register read-write transaction-level interface, and a transaction-level interface for acquiring and clearing the host interrupt state. After receiving the OpenGL command, the host model broadcasts it to the PCIe unit of each GPU model in the multi-GPU TLM model, and acquires and clears the host interrupt state of each GPU through the PCIe configuration interface of each GPU model;

the multi-GPU TLM model is formed by a plurality of GPU models which are interconnected, wherein each GPU model comprises a command processing unit, a PCIe unit, a geometric engine unit, a graphic processing subset and a fragment processing unit;

the display model is used for converting the image data in the RGBA format in the video memory model into the displayable BMP picture format.

Preferably, the multi-GPU model of the SystemC-based multi-GPU transaction-level model device is constructed in the object-oriented modeling language SystemC;

the GPU TLM model is connected with the host model through an interface provided by a PCIe unit, and an interrupt state read-write interface method and an architecture register access interface method are provided for the GPU TLM model;

the GPU model command processing unit determines whether to continue issuing the OpenGL command packet by determining the OpenGL command function code and the GPU ID.

The multi-GPU TLM model of the multi-GPU transaction-level model device based on SystemC divides OpenGL commands into broadcast full execution and broadcast partial execution;

the command processing unit of each GPU model decides whether to pass the OpenGL command packet on by examining the OpenGL command function code and the GPU ID. If the function code type is broadcast full execution, the command processing units of all GPU models send the command packet to the subsequent processing units; if the function code type is broadcast partial execution, only the command processing unit of the GPU model whose GPU ID is 0 sends the command packet to the subsequent processing unit, while the command processing unit of every GPU model whose GPU ID is not 0 discards the command packet directly without sending it to the subsequent unit.

The multi-GPU model of the multi-GPU transaction-level model device based on SystemC performs corresponding graph drawing and rendering according to the drawing area indicated by the GPU ID and writes the drawing area into the corresponding video memory model.

The operation method of the multi-GPU transaction-level model device based on SystemC comprises the following operation steps:

2) the host model implements a transaction-level interface;

Step 2) of the above SystemC-based multi-GPU transaction-level model device includes the following steps:

Step 3) of the above multi-GPU transaction level model apparatus based on SystemC comprises the following steps:

1) command packet broadcast

2) acquiring and clearing host interrupt status

3) architectural register read and write

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A multi-GPU transaction level model device based on SystemC is characterized in that:

2. The SystemC-based multi-GPU transaction level model device of claim 1, wherein:

3. The SystemC-based multi-GPU transaction level model device of claim 2, wherein:

4. The SystemC-based multi-GPU transaction level model device of claim 3, wherein: and the multiple GPU models perform corresponding graphic drawing and rendering according to the drawing areas indicated by the GPU IDs and write the graphics into the corresponding video memory models.

5. The method of operating a SystemC-based multi-GPU transaction-level model device according to claim 1, characterized in that: the method comprises the following operation steps:

2) the host model implements a transaction-level interface;

6. The operational steps of a SystemC-based multi-GPU transaction-level model device according to claim 5, characterized in that: the step 2) comprises the following steps:

6.1 assembling the OpenGL graphics command, according to its features, into one of two custom command packet types, CM0 or CM2;

6.2 calling a command packet broadcast transaction-level interface in the host model to broadcast and send the command packet to each GPU model;

6.3 calling the architecture register read-write transaction-level interface in the host model to configure the cooperation mode of the multi-GPU TLM model and the starting-point coordinates, width and height of the region that the multi-GPU TLM model is responsible for drawing;

6.4 calling the transaction level interface for obtaining and clearing the host interrupt state in the host model to obtain the frame drawing completion interrupt of each multi-GPU TLM model so as to synchronize the drawing action among the multi-GPU TLM models.

7. The method of operation of a SystemC-based multi-GPU transaction-level model device according to claim 6, wherein: the step 3) comprises the following steps:

7.1 Command packet broadcast

7.2 getting and clearing host interrupt State

7.3 architecture register read and write