CN110888737A - Ringbuffer implementation system and method supporting multiple GPUs - Google Patents


Info

Publication number
CN110888737A
Authority
CN
China
Prior art keywords
ringbuffer
page
gpu
management module
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911125585.9A
Other languages
Chinese (zh)
Inventor
马城城
聂曌
刘晖
张琛
张兴雷
王晨光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Aeronautics Computing Technique Research Institute of AVIC filed Critical Xian Aeronautics Computing Technique Research Institute of AVIC
Priority to CN201911125585.9A priority Critical patent/CN110888737A/en
Publication of CN110888737A publication Critical patent/CN110888737A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention belongs to the technical field of computer applications, and particularly relates to a Ringbuffer implementation system and method supporting multiple GPUs. The system comprises a Ringbuffer page table management module (1), a multi-GPU task management module (2), a data buffer module (3) and a GPU task buffer module (4). By constructing a mapping relation between the Ringbuffer page space and multiple GPUs, the invention provides a Ringbuffer implementation method supporting multiple GPUs.

Description

Ringbuffer implementation system and method supporting multiple GPUs
The invention belongs to the technical field of computer application, and particularly relates to a Ringbuffer implementation system and method supporting multiple GPUs.
Background
Since the advent of the GPU, its enormous computing power has given it an important role in fields requiring high-performance computing, such as image and video processing, physics, bioscience, chemistry, and artificial intelligence. To complete increasingly complex graphics processing and general computing tasks while reducing computation time, multiple GPU cards must often be used for computation simultaneously. A Ringbuffer is a technique that can effectively improve memory allocation and usage efficiency. Whether the Ringbuffer can be used flexibly and efficiently to manage and distribute computing tasks across multiple GPUs has a direct influence on task completion efficiency.
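As background context only (not part of the claimed invention), a minimal ring buffer illustrating the allocation pattern the passage refers to might look like the following Python sketch; all names here are invented for illustration:

```python
class RingBuffer:
    """A minimal fixed-capacity ring (circular) buffer.

    Illustrative sketch only; the patent's Ringbuffer adds page
    management and multi-GPU dispatch on top of this basic idea.
    """

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0  # next slot to read
        self.tail = 0  # next slot to write
        self.size = 0

    def put(self, item):
        if self.size == self.capacity:
            raise BufferError("ring buffer full")
        self.buf[self.tail] = item
        self.tail = (self.tail + 1) % self.capacity  # wrap around
        self.size += 1

    def get(self):
        if self.size == 0:
            raise BufferError("ring buffer empty")
        item = self.buf[self.head]
        self.head = (self.head + 1) % self.capacity  # wrap around
        self.size -= 1
        return item
```

Because the storage is reused in a circle, slots freed by `get` are immediately available to `put` without any further allocation, which is the efficiency property the passage describes.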
Disclosure of Invention
The purpose of the invention is as follows:
In order to solve the above problems, the invention provides a Ringbuffer implementation method supporting multiple GPUs. By constructing a mapping relation between the Ringbuffer page space and multiple GPUs, the data of multiple GPUs is managed uniformly by one Ringbuffer module. This saves memory space, enables flexible distribution of graphics or computing tasks across multiple GPUs, and improves task execution efficiency.
The technical scheme is as follows:
The invention provides a Ringbuffer implementation system supporting multiple GPUs (graphics processing units), which comprises a Ringbuffer page table management module (1), a multi-GPU task management module (2), a data buffer module (3) and a GPU task buffer module (4).
Further, the Ringbuffer page table management module (1) is used for page management of the data buffer module (3): it divides the data buffer module (3) of the Ringbuffer into a plurality of page spaces of equal size, and each page has its own internal attributes;
the internal attributes comprise a first address, a data input address, a use state and a target GPU;
when a user inputs data to the data buffer module (3), the write management function of the Ringbuffer page table management module (1) first checks the remaining space and the data input address of the current Ringbuffer page; when the Ringbuffer page space is insufficient or a page switching instruction is received, the synchronization management function switches the Ringbuffer write page;
meanwhile, in combination with the multi-GPU task management module (2), the target GPU of each Ringbuffer page is set: according to the multi-GPU task allocation information obtained from the GPU task buffer module (4), the Ringbuffer page table management module (1) records the GPU task buffers to which the data in the Ringbuffer page is to be sent;
the Ringbuffer page table management module (1) also receives the task completion signal fed back by each GPU task buffer and performs Ringbuffer page recycling: after the task completion signals returned by all target GPU task buffers have been received, the page in the sent state is recycled and initialized to the unused state.
Further, the multi-GPU task management module (2) is configured to receive the multi-GPU task configuration information set by the user, allocate the GPU on which each graphics or computing task generated from the Ringbuffer page table management module (1) is to be executed, and then return the task allocation information to the Ringbuffer page table management module (1).
Further, the data buffer module (3) is configured to receive and store incoming data according to the page division of the Ringbuffer page table management module (1), and to send the data in a page to the corresponding GPU task buffer under the control of the Ringbuffer page table management module (1).
Further, the GPU task buffer module (4) is configured to buffer the tasks of each GPU; each GPU task buffer independently receives task data from the data buffer module (3) and, after sending the data to its GPU, replies a task completion signal to the Ringbuffer page table management module (1) for the Ringbuffer page from which the data was sent.
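The division of labor between modules (2) and (4) might be sketched as follows. The patent does not specify an allocation policy, so the round-robin choice below is purely an assumption for illustration, as are all class and method names:

```python
from collections import deque

class MultiGpuTaskManager:
    """Sketch of the multi-GPU task management module (2).

    Assumes a simple round-robin policy over the GPUs named in the
    user's configuration; the actual policy is not specified.
    """

    def __init__(self, config_gpu_ids):
        self.gpu_ids = list(config_gpu_ids)  # user task configuration
        self.next = 0

    def allocate(self, task):
        gpu = self.gpu_ids[self.next % len(self.gpu_ids)]
        self.next += 1
        return gpu  # task allocation info, returned to module (1)

class GpuTaskBuffer:
    """Sketch of one buffer inside the GPU task buffer module (4)."""

    def __init__(self, gpu_id):
        self.gpu_id = gpu_id
        self.queue = deque()

    def submit(self, task):
        self.queue.append(task)  # receive task data from module (3)

    def drain(self, on_complete):
        # "send" each task to the GPU, then reply a completion signal
        while self.queue:
            task = self.queue.popleft()
            on_complete(task, self.gpu_id)
```

Here `on_complete` stands in for the task completion signal replied to the page table management module (1).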
Another object of the present invention is to provide an implementation method of a Ringbuffer implementation system supporting multiple GPUs, which includes the following steps:
① acquire the user's multi-GPU task configuration information and store it in the multi-GPU task management module (2);
② acquire the current page's data input address and remaining page space of the Ringbuffer data buffer module (3) through the write management function of the Ringbuffer page table management module (1);
③ write the command data;
④ repeat steps ② and ③ until the remaining space of the current Ringbuffer page is insufficient or a page synchronization management command is received; set the page state to to-be-configured, switch the write page to the next page in the unused state, and continue writing;
⑤ the Ringbuffer page table management module (1) packages the data content of each to-be-configured page into a graphics or computing task and sends it to the multi-GPU task management module (2);
⑥ the multi-GPU task management module (2) allocates the task to one or more GPU task buffers of the GPU task buffer module (4) according to the multi-GPU task configuration information input by the user;
⑦ the Ringbuffer page table management module (1) acquires the multi-GPU task allocation information from the GPU task buffer module (4) and records the GPU task buffers to which the data in the to-be-configured page is sent;
⑧ each GPU task buffer in the GPU task buffer module (4) sends its data to its GPU and replies a task completion signal to the Ringbuffer page table management module (1); when all GPU task buffers corresponding to a Ringbuffer page that has sent data have replied task completion signals, the page is recycled and initialized to the unused state.
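Steps ① through ⑧ can be condensed into the following control-flow sketch. This is an interpretive simplification (pages are modeled as lists, every configured GPU is treated as a target of every page, and all names are invented), not the patent's implementation:

```python
def run_ringbuffer_workflow(commands, gpu_ids, page_size=4):
    """Simulate steps 1-8 of the described method.

    commands  - command data written by the user (step 3)
    gpu_ids   - the user's multi-GPU task configuration (step 1)
    Returns the pages in recycling order; each page is recycled only
    after all its target GPU buffers reply completion (step 8).
    """
    recycled = []
    page = []                                   # current write page (step 2)
    for cmd in commands:
        page.append(cmd)                        # step 3: write command data
        if len(page) == page_size:              # step 4: page space exhausted
            recycled.append(_dispatch(page, gpu_ids))
            page = []                           # switch to an unused page
    if page:                                    # flush the final partial page
        recycled.append(_dispatch(page, gpu_ids))
    return recycled

def _dispatch(page, gpu_ids):
    # steps 5-8: the to-be-configured page becomes a task, goes to each
    # target GPU buffer, and is recycled once every buffer has replied
    pending = set(gpu_ids)
    for gpu in gpu_ids:                         # buffer sends data to its GPU
        pending.discard(gpu)                    # ...and replies completion
    assert not pending                          # all completion signals in
    return list(page)
```

The key ordering property shown is that recycling (step ⑧) is gated on the full set of completion signals, so no page is reused while any GPU still holds its data.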
Beneficial effects:
By constructing a mapping relation between the Ringbuffer page space and multiple GPUs, the invention manages the data of multiple GPUs uniformly with one Ringbuffer module, which saves memory space, enables flexible distribution of graphics or computing tasks across multiple GPUs, and improves task execution efficiency.
Drawings
FIG. 1 is a schematic diagram of a Ringbuffer implementation method supporting multiple GPUs according to the present invention;
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention will be further described with reference to the accompanying drawings in which:
as shown in fig. 1, the system for implementing Ringbuffer supporting multiple GPUs includes a Ringbuffer page table management module (1), a multiple-GPU task management module (2), a data buffering module (3), and a GPU task buffering module (4).
The Ringbuffer page table management module (1) is used for page management of the data buffer module (3): it divides the data buffer module (3) of the Ringbuffer into a plurality of page spaces of equal size, and each page has its own internal attributes;
the internal attributes comprise a first address, a data input address, a use state and a target GPU;
when a user inputs data to the data buffer module (3), the write management function of the Ringbuffer page table management module (1) first checks the remaining space and the data input address of the current Ringbuffer page; when the Ringbuffer page space is insufficient or a page switching instruction is received, the synchronization management function switches the Ringbuffer write page;
meanwhile, in combination with the multi-GPU task management module (2), the target GPU of each Ringbuffer page is set: according to the multi-GPU task allocation information obtained from the GPU task buffer module (4), the Ringbuffer page table management module (1) records the GPU task buffers to which the data in the Ringbuffer page is to be sent;
the Ringbuffer page table management module (1) also receives the task completion signal fed back by each GPU task buffer and performs Ringbuffer page recycling: after the task completion signals returned by all target GPU task buffers have been received, the page in the sent state is recycled and initialized to the unused state.
The multi-GPU task management module (2) is used for receiving the multi-GPU task configuration information set by the user, allocating the GPU on which each graphics or computing task generated from the Ringbuffer page table management module (1) is to be executed, and then returning the task allocation information to the Ringbuffer page table management module (1).
The data buffer module (3) is used for receiving and storing incoming data according to the page division of the Ringbuffer page table management module (1), and for sending the data in a page to the corresponding GPU task buffer under the control of the Ringbuffer page table management module (1).
The GPU task buffer module (4) is used for buffering the tasks of each GPU; each GPU task buffer independently receives task data from the data buffer module (3) and, after sending the data to its GPU, replies a task completion signal to the Ringbuffer page table management module (1) for the Ringbuffer page from which the data was sent.
The implementation method of the Ringbuffer implementation system supporting multiple GPUs comprises the following steps:
① acquire the user's multi-GPU task configuration information and store it in the multi-GPU task management module (2);
② acquire the current page's data input address and remaining page space of the Ringbuffer data buffer module (3) through the write management function of the Ringbuffer page table management module (1);
③ write the command data;
④ repeat steps ② and ③ until the remaining space of the current Ringbuffer page is insufficient or a page synchronization management command is received; set the page state to to-be-configured, switch the write page to the next page in the unused state, and continue writing;
⑤ the Ringbuffer page table management module (1) packages the data content of each to-be-configured page into a graphics or computing task and sends it to the multi-GPU task management module (2);
⑥ the multi-GPU task management module (2) allocates the task to one or more GPU task buffers of the GPU task buffer module (4) according to the multi-GPU task configuration information input by the user;
⑦ the Ringbuffer page table management module (1) acquires the multi-GPU task allocation information from the GPU task buffer module (4) and records the GPU task buffers to which the data in the to-be-configured page is sent;
⑧ each GPU task buffer in the GPU task buffer module (4) sends its data to its GPU and replies a task completion signal to the Ringbuffer page table management module (1); when all GPU task buffers corresponding to a Ringbuffer page that has sent data have replied task completion signals, the page is recycled and initialized to the unused state.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (6)

1. A Ringbuffer implementation system supporting multiple GPUs, characterized by comprising a Ringbuffer page table management module (1), a multi-GPU task management module (2), a data buffer module (3) and a GPU task buffer module (4).
2. The Ringbuffer implementation system supporting multiple GPUs according to claim 1, characterized in that the Ringbuffer page table management module (1) is used for page management of the data buffer module (3): it divides the data buffer module (3) of the Ringbuffer into a plurality of page spaces of equal size, and each page has its own internal attributes;
the internal attributes comprise a first address, a data input address, a use state and a target GPU;
when a user inputs data to the data buffer module (3), the write management function of the Ringbuffer page table management module (1) first checks the remaining space and the data input address of the current Ringbuffer page; when the Ringbuffer page space is insufficient or a page switching instruction is received, the synchronization management function switches the Ringbuffer write page;
meanwhile, in combination with the multi-GPU task management module (2), the target GPU of each Ringbuffer page is set: according to the multi-GPU task allocation information obtained from the GPU task buffer module (4), the Ringbuffer page table management module (1) records the GPU task buffers to which the data in the Ringbuffer page is to be sent;
the Ringbuffer page table management module (1) also receives the task completion signal fed back by each GPU task buffer and performs Ringbuffer page recycling: after the task completion signals returned by all target GPU task buffers have been received, the page in the sent state is recycled and initialized to the unused state.
3. The Ringbuffer implementation system supporting multiple GPUs according to claim 1, characterized in that the multi-GPU task management module (2) is configured to receive the multi-GPU task configuration information set by the user, allocate the GPU on which each graphics or computing task generated from the Ringbuffer page table management module (1) is to be executed, and then return the task allocation information to the Ringbuffer page table management module (1).
4. The Ringbuffer implementation system supporting multiple GPUs according to claim 1, characterized in that the data buffer module (3) is configured to receive and store incoming data according to the page division of the Ringbuffer page table management module (1), and to send the data in a page to the corresponding GPU task buffer under the control of the Ringbuffer page table management module (1).
5. The Ringbuffer implementation system supporting multiple GPUs according to claim 1, characterized in that the GPU task buffer module (4) is used for buffering the tasks of each GPU; each GPU task buffer independently receives task data from the data buffer module (3) and, after sending the data to its GPU, replies a task completion signal to the Ringbuffer page table management module (1) for the Ringbuffer page from which the data was sent.
6. A method for implementing the Ringbuffer implementation system supporting multiple GPUs according to any one of claims 1 to 5, characterized by comprising the following steps:
① acquire the user's multi-GPU task configuration information and store it in the multi-GPU task management module (2);
② acquire the current page's data input address and remaining page space of the Ringbuffer data buffer module (3) through the write management function of the Ringbuffer page table management module (1);
③ write the command data;
④ repeat steps ② and ③ until the remaining space of the current Ringbuffer page is insufficient or a page synchronization management command is received; set the page state to to-be-configured, switch the write page to the next page in the unused state, and continue writing;
⑤ the Ringbuffer page table management module (1) packages the data content of each to-be-configured page into a graphics or computing task and sends it to the multi-GPU task management module (2);
⑥ the multi-GPU task management module (2) allocates the task to one or more GPU task buffers of the GPU task buffer module (4) according to the multi-GPU task configuration information input by the user;
⑦ the Ringbuffer page table management module (1) acquires the multi-GPU task allocation information from the GPU task buffer module (4) and records the GPU task buffers to which the data in the to-be-configured page is sent;
⑧ each GPU task buffer in the GPU task buffer module (4) sends its data to its GPU and replies a task completion signal to the Ringbuffer page table management module (1); when all GPU task buffers corresponding to a Ringbuffer page that has sent data have replied task completion signals, the page is recycled and initialized to the unused state.
CN201911125585.9A 2019-11-18 2019-11-18 Ringbuffer implementation system and method supporting multiple GPUs Pending CN110888737A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911125585.9A CN110888737A (en) 2019-11-18 2019-11-18 Ringbuffer implementation system and method supporting multiple GPUs


Publications (1)

Publication Number Publication Date
CN110888737A true CN110888737A (en) 2020-03-17

Family

ID=69747733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911125585.9A Pending CN110888737A (en) 2019-11-18 2019-11-18 Ringbuffer implementation system and method supporting multiple GPUs

Country Status (1)

Country Link
CN (1) CN110888737A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113808001A (en) * 2021-11-19 2021-12-17 Method and system for a single system to simultaneously support the operation of multiple GPUs (graphics processing units)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120001905A1 (en) * 2010-06-30 2012-01-05 Ati Technologies, Ulc Seamless Integration of Multi-GPU Rendering
CN102597950A (en) * 2009-09-03 2012-07-18 Advanced Micro Devices, Inc. Hardware-based scheduling of GPU work
CN106708601A (en) * 2016-12-12 2017-05-24 中国航空工业集团公司西安航空计算技术研究所 GPU-oriented virtual IO ringbuffer realization method
CN107124286A (en) * 2016-02-24 2017-09-01 System and method for high-speed processing and interaction of mass data
CN107908428A (en) * 2017-11-24 2018-04-13 Frame- and page-synchronized GPU graphics command buffer synchronization method



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination