CN111880916A - Multi-drawing task processing method, device, terminal, medium and host in GPU - Google Patents


Info

Publication number
CN111880916A
CN111880916A
Authority
CN
China
Prior art keywords
gpu
command
host
task
register
Prior art date
Legal status
Granted
Application number
CN202010730288.3A
Other languages
Chinese (zh)
Other versions
CN111880916B (en)
Inventor
单晋奎
Current Assignee
Changsha Jingmei Integrated Circuit Design Co ltd
Changsha Jingjia Microelectronics Co ltd
Original Assignee
Changsha Jingmei Integrated Circuit Design Co ltd
Changsha Jingjia Microelectronics Co ltd
Priority date
Filing date
Publication date
Application filed by Changsha Jingmei Integrated Circuit Design Co ltd, Changsha Jingjia Microelectronics Co ltd filed Critical Changsha Jingmei Integrated Circuit Design Co ltd
Priority to CN202010730288.3A priority Critical patent/CN111880916B/en
Publication of CN111880916A publication Critical patent/CN111880916A/en
Application granted granted Critical
Publication of CN111880916B publication Critical patent/CN111880916B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4812Task transfer initiation or dispatching by interrupt, e.g. masked
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Advance Control (AREA)
  • Image Generation (AREA)

Abstract

Embodiments of the present application provide a method, an apparatus, a terminal, a medium, and a host for processing multiple drawing tasks in a GPU, relating to GPU chip design and addressing a prior-art problem: while the host monitors the GPU's execution state, the frequent interaction between host and GPU degrades the execution efficiency of both. The method comprises the following steps: the GPU obtains a first command generated by the host, the first command comprising a register configuration command; the GPU writes the value configured in the first command into a corresponding address of the RAM and stores that address in the register corresponding to the current drawing task; the GPU obtains a parameter configuration command generated by the host for the current drawing task; when executing the parameter configuration command, the GPU reads a value from the RAM according to the address stored in the register; and when the value read from the RAM equals the value the host configured in the parameter configuration command, the GPU confirms that the current drawing task is complete, generates the corresponding interrupt information, and sends it to the host.

Description

Multi-drawing task processing method, device, terminal, medium and host in GPU
Technical Field
The present disclosure relates to GPU chip design technology, and in particular to a method, an apparatus, a terminal, a medium, and a host for processing multiple rendering tasks in a GPU.
Background
A GPU (Graphics Processing Unit) is a microprocessor used in terminals such as mobile devices and personal computers to execute drawing tasks according to drawing-task commands sent by a host.
In the related art, a GPU that executes multiple rendering tasks concurrently generally does so by time-division multiplexing. Specifically, the host sends one batch of rendering-task commands to the GPU and monitors the GPU's execution state; only after observing that the GPU has finished that batch does it send the next batch and resume monitoring. During this monitoring, however, the host must send state queries to the GPU and receive the GPU's replies at high frequency, which results in frequent interaction between the host and the GPU.
Disclosure of Invention
Embodiments of the present application provide a method, an apparatus, a terminal, a medium, and a host for processing multiple drawing tasks in a GPU, and aim to solve the prior-art problem that frequent interaction between the host and the GPU, while the host monitors the GPU's execution state, degrades the execution efficiency of both.
A first aspect of the embodiments of the present application provides a method for processing multiple rendering tasks in a GPU, including:
the GPU obtains a first command generated by the host for the current drawing task; the first command comprises a register configuration command;
the GPU writes the value configured in the first command into a corresponding address of a Random Access Memory (RAM), and stores the address into a register corresponding to the current drawing task;
the GPU acquires a second command generated by the host for the current drawing task, wherein the second command comprises a parameter configuration command;
when the GPU executes the parameter configuration command, reading a value from the RAM according to the address stored by the register; and when the value read from the RAM is equal to the value configured by the host in the parameter configuration command, the GPU confirms that the current drawing task is completed, generates interrupt information corresponding to the current drawing task and sends the interrupt information to the host.
A second aspect of the present disclosure provides an apparatus for processing multiple rendering tasks in a GPU, including:
the system comprises a first acquisition module, a second acquisition module and a processing module, wherein the first acquisition module is used for acquiring a first command generated by a host for a current drawing task, and the first command comprises a register configuration command;
a first processing module, configured to write the value configured in the first command into a corresponding address of a RAM, and to store that address in a register corresponding to the current drawing task;
a second obtaining module, configured to obtain a second command generated by the host for the current drawing task, where the second command includes a parameter configuration command;
a second processing module, configured to read a value from the RAM according to the address stored in the register when the parameter configuration command is executed; and further configured to confirm, when the value read from the RAM equals the value the host configured in the parameter configuration command, that the current drawing task is complete, to generate interrupt information corresponding to the current drawing task, and to send the interrupt information to the host.
A third aspect of the embodiments of the present application provides a terminal, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement a method as claimed in any preceding claim.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium having a computer program stored thereon; the computer program is executed by a processor to implement a method as claimed in any preceding claim.
A fifth aspect of the embodiments of the present application provides a host, comprising a motherboard and a graphics card communicatively connected to the motherboard, where the graphics card includes the apparatus according to any one of the foregoing.
Embodiments of the present application provide a method, an apparatus, a terminal, a medium, and a host for processing multiple drawing tasks in a GPU. The GPU compares the value in an acquired parameter configuration command with a value pre-written into the RAM; when the value it reads from the RAM equals the value in the parameter configuration command, the GPU confirms that the current drawing task is finished and generates interrupt information to send to the host, which helps the host schedule subsequent tasks. This reduces the number of communications between the host and the GPU and helps preserve the execution efficiency of both.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flowchart illustrating a method for processing multiple rendering tasks in a GPU according to an exemplary embodiment;
FIG. 2 is a schematic diagram illustrating an application of a method for processing multiple rendering tasks in a GPU according to an exemplary embodiment;
FIG. 3 is a block diagram of a multi-rendering task processing device in a GPU according to an exemplary embodiment.
Detailed Description
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments are described in further detail below with reference to the accompanying drawings. Clearly, the described embodiments are only some of the embodiments of the present application, not all of them. It should be noted that, where no conflict arises, the embodiments and the features of the embodiments may be combined with one another.
In the related art, a GPU (Graphics Processing Unit) that executes multiple rendering tasks concurrently generally does so by time-division multiplexing. Specifically, the host sends one batch of rendering-task commands to the GPU and monitors the GPU's execution state; only after observing that the GPU has finished that batch does it send the next batch and resume monitoring. During this monitoring, however, the host must send state queries to the GPU and receive the GPU's replies at high frequency, causing frequent host-GPU interaction that degrades the execution efficiency of both.
To overcome the above problems, embodiments of the present application provide a method, an apparatus, a terminal, a medium, and a host for processing multiple rendering tasks in a GPU. The GPU compares the value in an acquired parameter configuration command with a value pre-written into the RAM; when the value it reads from the RAM equals the value in the parameter configuration command, the GPU determines that the current rendering task is complete and generates interrupt information to send to the host, which helps the host schedule subsequent tasks. This reduces the number of communications between the host and the GPU and helps preserve the execution efficiency of both.
Fig. 1 is a flowchart illustrating a method for processing multiple rendering tasks in a GPU according to an exemplary embodiment.
The following describes the functions and implementation processes of the method provided by this embodiment with reference to fig. 1.
The method for processing multiple rendering tasks in the GPU provided by this embodiment includes:
s101, a GPU acquires a first command generated by a host for a current drawing task; the first command comprises a register configuration command;
s102, the GPU writes the value configured in the first command into a corresponding address of the RAM, and stores the corresponding address into a register corresponding to the current drawing task;
s103, the GPU acquires a second command generated by the host for the current drawing task, wherein the second command comprises a parameter configuration command;
s104, when the GPU executes the parameter configuration command, reading a value from the RAM according to a corresponding address stored in the register; and when the value read from the RAM is equal to the value configured by the host in the parameter configuration command, the GPU confirms that the current drawing task is completed, generates interrupt information corresponding to the current drawing task and sends the interrupt information to the host.
In this embodiment, the host can configure one register for each rendering task; each register is written by the host and read by the GPU. When the GPU has multiple registers and allows multiple rendering tasks to run simultaneously, the host configures one register for each task. When the GPU allows only one rendering task, the host configures a single register for it.
In step S101, when the GPU allows only one rendering task to execute, the host generates a corresponding first command for that task and writes it into the GPU. When the GPU allows multiple drawing tasks to execute simultaneously, the host generates a first command for each task and writes them into the GPU. The first command may include management-related parameter configuration commands such as a register configuration command, and it carries the value to be written for the register corresponding to the drawing task. When the GPU allows multiple rendering tasks to execute simultaneously, the registers corresponding to different tasks hold different values.
Illustratively, when the GPU allows N rendering tasks to execute simultaneously, the GPU provides N registers, the one for the nth task being denoted R[n], where N is an integer greater than or equal to 1 and 1 ≤ n ≤ N. For example, when N = 1, the GPU provides a single register R[1]; when N = 2, the GPU provides two registers R[1] and R[2], whose values differ.
In particular implementations, the GPU may obtain the first command in a variety of ways; that is, the host may write the first command into the GPU in a variety of ways.
In step S102, the GPU writes the value configured for the corresponding register in the acquired first command into a corresponding address of a Random Access Memory (RAM). When there are multiple first commands, the RAM has multiple addresses, one per first command. In a specific implementation, the GPU may first write the value configured for the corresponding register into a RAM address and then store that address in the register; alternatively, the GPU may allocate a RAM address for the register's value in advance, store the address in the register, and then write the register's value into the RAM according to the stored address.
Optionally, before the GPU writes the values configured in the first command into the corresponding RAM addresses, the RAM is initialized and the contents of the addresses used to store register values are set to 0; illustratively, the contents of all RAM addresses are set to 0. The initialization may use existing techniques in the field, and this embodiment does not limit it.
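The initialization step reads naturally as a zero-fill of the addresses reserved for register values; the sketch below is one straightforward interpretation of that step, with the function name an illustrative assumption.

```python
def init_ram(size):
    """Model the optional RAM initialization: clear every word to 0
    (the text's example clears all addresses, not only the ones that
    will hold register values)."""
    return [0] * size

ram = init_ram(8)
assert all(word == 0 for word in ram)
```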
In step S103, when the GPU allows only one rendering task to be executed, the host can generate a corresponding second command for the rendering task and write the second command into the GPU. When the GPU allows a plurality of drawing tasks to be executed simultaneously, the host can respectively generate a second command for each drawing task and write the second command into the GPU. Wherein the second command comprises a parameter configuration command; the parameter configuration command has written therein a value configured for the corresponding register. It should be noted that the execution sequence of step S103, step S101, and step S102 is not specifically limited here.
In step S104, when the GPU executes the parameter configuration command, it reads a value from the RAM according to the address stored in the register. When the value read from the RAM is not equal to the value the host configured for the corresponding register in the parameter configuration command, the GPU concludes that the current task is not yet complete and, for the time being, sends no execution-state information to the host. When the two values are equal, the GPU confirms that the current drawing task is complete.
Illustratively, the RAM address corresponding to register R[N] is AddrN; that is, AddrN is the RAM address at which the value of R[N] is stored. When the GPU executes the parameter configuration command, it compares the value configured in that command with the value it reads from address AddrN (obtained from the address stored in R[N]); when the two values are equal, it confirms that the current drawing task is complete.
When the GPU confirms that the current drawing task is finished, it triggers the parameter analysis unit to generate an interrupt, produces interrupt information corresponding to the current drawing task, and sends it to the host. The interrupt information may carry the value of the corresponding register or an identifier of the corresponding rendering task, so that the host can quickly determine which drawing task generated the interrupt.
After receiving the interrupt information generated by the GPU, the host determines the currently interrupted drawing task according to the interrupt information, determines that the drawing task is completed, and executes the subsequent tasks. When the GPU allows a plurality of rendering tasks to be executed simultaneously, the GPU may send the interrupt information of each rendering task to the host, that is, the GPU may generate corresponding interrupt information and send the interrupt information to the host when it is determined that one of the rendering tasks is completed. The host may send the related commands of the next drawing task to the GPU after receiving the interrupt information of one of the drawing tasks, or the host may send the related commands of the next drawing task to the GPU after receiving the interrupt information of each of the drawing tasks of the batch.
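The host-side handling described above might look like the following sketch. Every name and the batch-queue structure are hypothetical, since the disclosure leaves the host implementation open; the point is only that the host acts on interrupts instead of polling.

```python
def handle_interrupt(interrupt, pending_batches, submitted):
    """On receiving an interrupt, identify the finished task from the
    payload and submit that task's next batch of commands, if any."""
    kind, task_id = interrupt
    if kind == "task_done" and pending_batches.get(task_id):
        # pop the next queued batch for this task and record its submission
        submitted.append((task_id, pending_batches[task_id].pop(0)))

pending = {0: ["batch_1", "batch_2"]}
submitted = []
handle_interrupt(("task_done", 0), pending, submitted)
assert submitted == [(0, "batch_1")]
```

A host could equally wait for the interrupts of a whole batch of tasks before submitting anything, as the text notes; the sketch shows only the per-task variant.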
According to the method for processing multiple drawing tasks in a GPU provided by this embodiment, after the host sends the drawing-task commands to the GPU, the GPU itself monitors the execution of the drawing task, generates interrupt information when it confirms that the task is complete, and reports the completed state to the host, so that the host can schedule subsequent tasks. This reduces the communication frequency between the host and the GPU and preserves the execution efficiency of both.
In one possible implementation, the GPU obtains the first command, which the host writes through the register configuration bus. The GPU also obtains a command word generated by the host for the current drawing task; the command word corresponds to the second command, which comprises drawing commands and the parameter configuration command.
The GPU acquiring a second command corresponding to a command word generated by the host for the current rendering task may specifically include: the GPU obtains command words generated by the host for the current drawing task; and the GPU performs command analysis on the obtained command words, and separates the parameter configuration command from the drawing command to obtain the parameter configuration command and the drawing command. And after the GPU performs command analysis on the obtained command words, the GPU sends drawing commands to the functional units.
In a specific implementation, the host writes the parameter configuration command into the command word generated for the current rendering task; that is, the host inserts the parameter configuration command among the drawing commands. After the host writes the command word into the GPU, the GPU parses the drawing commands with the inserted parameter configuration command and separates the parameter configuration command from the drawing commands. The GPU then parses the parameter configuration command to obtain the task-management parameters, which include the value the host configured for the register, and the drawing parameters; the drawing parameters are sent to the functional units.
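The command-word separation can be sketched as a small parser. The tagged-tuple stream format below is an assumption made purely for illustration; the patent does not specify the actual command encoding.

```python
def parse_command_words(command_words):
    """Separate parameter configuration commands (task management) from
    drawing commands (forwarded to the functional units) in one stream."""
    draw_cmds, param_cfg_cmds = [], []
    for kind, payload in command_words:
        if kind == "param_cfg":
            param_cfg_cmds.append(payload)  # task-management parameters
        else:
            draw_cmds.append(payload)       # would go to the functional units
    return draw_cmds, param_cfg_cmds

# a parameter configuration command inserted among drawing commands
stream = [("draw", "triangle"), ("param_cfg", {"reg_value": 7}), ("draw", "line")]
draws, cfgs = parse_command_words(stream)
assert draws == ["triangle", "line"]
assert cfgs == [{"reg_value": 7}]
```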
And after the GPU analyzes the obtained command words of the current drawing task, a command sequence of the current drawing task can be obtained, and the GPU executes corresponding commands according to the sequence in the command sequence. Taking the example that the GPU executes one of the rendering tasks: the GPU executes corresponding commands according to the command sequence of the drawing task; when the GPU executes the parameter configuration command, the GPU compares the value configured for the register in the parameter configuration command with the value read from the RAM by the GPU according to the address stored by the register corresponding to the drawing task, when the value configured for the register in the parameter configuration command is equal to the value read from the RAM by the GPU according to the address stored by the register corresponding to the drawing task, the drawing task is confirmed to be completed, the parameter analysis unit is interrupted, and interrupt information is generated and sent to the host. And the host receives the interrupt information, inquires the interrupt register according to the interrupt information to obtain the current drawing task generating the interrupt, and further determines the execution of the subsequent drawing task.
For example, as shown in FIG. 2, assume the GPU allows n rendering tasks to be performed simultaneously. The host sets up registers R[1], R[2], ..., R[n] for the n drawing tasks task 0, task 1, ..., task n-1, and configures a corresponding value for each. Each register is written by the host and read by the GPU.
The host generates a first command for each of the n drawing tasks task 0, task 1, ..., task n-1, and writes these first commands into the GPU through the register configuration bus. The first command may be a configuration command for task-management parameters; it includes a register configuration command, and the task-management parameters may include the value configured for the corresponding register.
The host also generates a command word for each of the n drawing tasks task 0, task 1, ..., task n-1; a parameter configuration command is written into each command word, which also contains the drawing commands. The host writes the command word of each drawing task into the GPU.
After the GPU acquires the first command of each drawing task through the register configuration bus, it writes the value configured for the corresponding register in each first command into the corresponding RAM address. Illustratively, the register for drawing task 0 is R[1], whose value is stored at RAM address Addr1; the register for drawing task 1 is R[2], whose value is stored at Addr2; ...; the register for task n-1 is R[n], whose value is stored at Addrn.
After the GPU obtains the command words, it parses them into separated drawing commands and parameter configuration commands, then parses each parameter configuration command to obtain the value configured for the corresponding register and the drawing parameters. Taking task n-1 as an example: the GPU executes the commands of task n-1 in their command-sequence order. When it reaches the parameter configuration command, it compares the value configured for the corresponding register in that command with the value it reads from the RAM at the address stored in register R[n]. While the two values are unequal, the parameter analysis unit generates no interrupt; once they are equal, the GPU confirms that task n-1 is complete, the parameter analysis unit raises an interrupt, and interrupt information is generated and sent to the host. The host receives the interrupt information, queries the interrupt register to identify the drawing task that generated the interrupt, and then schedules subsequent drawing tasks.
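Putting the pieces of the FIG. 2 example together, a toy end-to-end simulation might run as follows. The address map R[n] -> Addrn, the register values, and the strictly in-order execution model are illustrative assumptions, not details fixed by the disclosure.

```python
def simulate(num_tasks):
    """Simulate n tasks, each with register R[n] mapped to RAM address
    Addrn; return the task indices reported as done via interrupts."""
    ram = [0] * (num_tasks + 1)
    reg_addr = {n: n for n in range(1, num_tasks + 1)}  # R[n] -> Addrn
    interrupts = []
    for n in range(1, num_tasks + 1):
        value = 0x10 + n              # a distinct value per register (assumed)
        ram[reg_addr[n]] = value      # first command: write the value into RAM
        # ... the drawing commands of task n-1 would execute here, in order ...
        # parameter configuration command: compare and interrupt on a match
        if ram[reg_addr[n]] == value:
            interrupts.append(n - 1)  # report task n-1 as complete
    return interrupts

assert simulate(3) == [0, 1, 2]
```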
In this process, the GPU generates and sends interrupt information only when it observes that the corresponding drawing task has completed, so it need not interact frequently with the host. This reduces the communication frequency between the host and the GPU, facilitates their parallel operation, helps preserve their execution efficiency, and also reduces the frequency of system task switches, which helps preserve the execution efficiency of the system.
The present embodiment provides an apparatus for processing multiple rendering tasks in a GPU, which can serve as the execution body of the foregoing method embodiments. The functions and implementation of the apparatus are the same as or similar to those of the previous embodiments, and details are not repeated here.
FIG. 3 is a block diagram of a multi-rendering task processing device in a GPU according to an exemplary embodiment.
As shown in fig. 3, the multi-rendering task processing apparatus in the GPU of the present embodiment includes:
a first obtaining module 31, configured to obtain a first command generated by a host for a current drawing task;
the first processing module 32 is configured to write the value configured in the first command into a corresponding address of the RAM, and store the corresponding address into a register corresponding to the current drawing task;
a second obtaining module 33, configured to obtain a second command generated by the host for the current drawing task, where the second command includes a parameter configuration command;
the second processing module 34 is used for reading values from the RAM according to corresponding addresses stored in the register when the parameter configuration command is executed; and the device is also used for confirming that the current drawing task is completed when the value read from the RAM is equal to the value configured in the parameter configuration command by the host, generating interrupt information corresponding to the current drawing task and sending the interrupt information to the host.
In one possible implementation manner, the first obtaining module 31 is specifically configured to:
a first command written by the host through the register configuration bus is obtained.
In one possible implementation manner, the second obtaining module 33 is specifically configured to:
acquiring command words generated by a host for a current drawing task;
and the GPU performs command analysis on the obtained command words and separates the parameter configuration command from the drawing command.
The second obtaining module 33 is further configured to: and sending the drawing command obtained by analyzing the command of the acquired command word to the functional unit.
In one possible implementation manner, the first obtaining module 31 is specifically configured to:
when the GPU allows a plurality of drawing tasks to be executed simultaneously, acquiring the first commands respectively generated by the host for the plurality of current drawing tasks.
In one possible implementation manner, the second obtaining module 33 is specifically configured to:
when the GPU allows a plurality of drawing tasks to be executed simultaneously, acquiring the second commands respectively generated by the host for the plurality of current drawing tasks.
In one possible implementation manner, the first processing module 32 is further configured to: before writing the value configured in the first command into the corresponding address of the RAM, initialize the RAM and set the content at the corresponding address of the RAM to 0.
In the apparatus provided by this embodiment, after the host sends the commands of a rendering task to the GPU, the GPU monitors the execution state of the current rendering task through the value of the register configured for that task, and, upon determining that the task is completed, generates interrupt information and sends it to the host, which helps the host decide when to execute subsequent tasks. This reduces the communication frequency between the host and the GPU, helps maintain the execution efficiency of both, reduces the frequency of system task switching, and thus helps ensure the execution efficiency of the system.
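The write-then-compare handshake described above can be sketched in software. All class, field, and method names below (`Gpu`, `first_command`, `parameter_config_command`) are illustrative assumptions, not structures defined in the patent.

```python
# Minimal software model of the completion-check mechanism: the first
# command stores a value at a RAM address and records that address in a
# per-task register; the parameter configuration command later reads the
# RAM back and raises an interrupt for the host when the value matches.

class Gpu:
    def __init__(self, ram_size=256):
        self.ram = [0] * ram_size   # RAM, initialized to 0
        self.task_register = {}     # drawing task id -> RAM address
        self.interrupts = []        # interrupt info reported to the host

    def first_command(self, task_id, address, value):
        """Register configuration command: write the configured value
        into RAM and remember the address for this drawing task."""
        self.ram[address] = value
        self.task_register[task_id] = address

    def parameter_config_command(self, task_id, expected_value):
        """Compare the value at the task's recorded address with the one
        the host configured; on a match, the task is complete and an
        interrupt is generated."""
        address = self.task_register[task_id]
        if self.ram[address] == expected_value:
            self.interrupts.append(("done", task_id))
            return True
        return False

gpu = Gpu()
gpu.first_command(task_id=0, address=16, value=0x2A)
done = gpu.parameter_config_command(task_id=0, expected_value=0x2A)
```

Because each task has its own register and RAM address, several tasks can be tracked independently, which is what allows the interrupt-per-task behavior without extra host polling.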
The present embodiment further provides a terminal, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement a method as in any of the preceding examples.
The memory is used for storing the computer program; after receiving an execution instruction, the processor executes the computer program, and the methods performed by the apparatus defined by the flows disclosed in the foregoing embodiments may be applied to, or implemented by, the processor.
The Memory may comprise a Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. A communication connection between the system network element and at least one other network element is implemented through at least one communication interface (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the method disclosed in the first embodiment may be implemented by hardware integrated logic circuits in the processor or by instructions in the form of software. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps, and logical blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The steps of the method disclosed in connection with the embodiments of the present invention may be embodied directly in a hardware decoding processor, or in a combination of hardware and software elements in a decoding processor. The software elements may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
The present embodiment also provides a computer-readable storage medium having a computer program stored thereon; the computer program is executed by a processor to implement a method as in any of the preceding examples.
The present embodiment further provides a host, including a motherboard and a graphics card, where the graphics card is in communication with the motherboard and carries the apparatus of any of the above examples. The motherboard is used for transmitting signals to the graphics card and for transmitting the output signals of the graphics card to other components.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (11)

1. A method for processing multiple rendering tasks in a Graphics Processing Unit (GPU), comprising:
the GPU obtains a first command generated by the host for the current drawing task; the first command comprises a register configuration command;
the GPU writes the value configured in the first command into a corresponding address of a Random Access Memory (RAM), and stores the address into a register corresponding to the current drawing task;
the GPU acquires a second command generated by the host for the current drawing task, wherein the second command comprises a parameter configuration command;
when executing the parameter configuration command, the GPU reads a value from the RAM according to the address stored in the register; and when the value read from the RAM is equal to the value configured by the host in the parameter configuration command, the GPU confirms that the current drawing task is completed, generates interrupt information corresponding to the current drawing task, and sends the interrupt information to the host.
2. The method of claim 1, wherein the GPU obtaining the first command generated by the host for the current rendering task comprises:
the GPU obtains a first command written by the host through the register configuration bus.
3. The method of claim 1, wherein the GPU obtaining the second command generated by the host for the current rendering task comprises:
the GPU obtains command words generated by the host for the current drawing task;
the GPU parses the obtained command words and separates the parameter configuration command from the drawing command.
4. The method of claim 3, further comprising: after the GPU parses the obtained command words, sending the drawing command to a functional unit.
5. The method of claim 1, wherein the GPU obtaining the first command generated by the host for the current rendering task comprises:
when the GPU allows a plurality of drawing tasks to be executed simultaneously, the GPU acquires first commands respectively generated by the host computer for the plurality of current drawing tasks.
6. The method of claim 1, wherein the GPU obtaining the second command generated by the host for the current rendering task comprises:
when the GPU allows a plurality of drawing tasks to be executed simultaneously, the GPU acquires second commands respectively generated by the host computer for the plurality of current drawing tasks.
7. The method of claim 1, further comprising, before the GPU writes the value configured in the first command to the corresponding address of the RAM:
initializing the RAM, and setting the content of the corresponding address of the RAM to 0.
8. A device for processing multiple rendering tasks in a Graphics Processing Unit (GPU), comprising:
the system comprises a first acquisition module, a second acquisition module and a processing module, wherein the first acquisition module is used for acquiring a first command generated by a host for a current drawing task, and the first command comprises a register configuration command;
the first processing module is used for writing the value configured in the first command into a corresponding address of an RAM (random access memory), and storing the corresponding address into a register corresponding to the current drawing task;
a second obtaining module, configured to obtain a second command generated by the host for the current drawing task, where the second command includes a parameter configuration command;
the second processing module is configured to read the value from the RAM, according to the corresponding address stored in the register, when the parameter configuration command is executed; and is further configured to confirm that the current drawing task is completed when the value read from the RAM is equal to the value configured by the host in the parameter configuration command, to generate interrupt information corresponding to the current drawing task, and to send the interrupt information to the host.
9. A terminal, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program; the computer program is executed by a processor to implement the method of any one of claims 1-7.
11. A host computer comprising a motherboard and a graphics card communicatively coupled to the motherboard, the graphics card having the apparatus of claim 8.
CN202010730288.3A 2020-07-27 2020-07-27 Method, device, terminal, medium and host for processing multiple drawing tasks in GPU Active CN111880916B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010730288.3A CN111880916B (en) 2020-07-27 2020-07-27 Method, device, terminal, medium and host for processing multiple drawing tasks in GPU


Publications (2)

Publication Number Publication Date
CN111880916A true CN111880916A (en) 2020-11-03
CN111880916B CN111880916B (en) 2024-08-16

Family

ID=73200655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010730288.3A Active CN111880916B (en) 2020-07-27 2020-07-27 Method, device, terminal, medium and host for processing multiple drawing tasks in GPU

Country Status (1)

Country Link
CN (1) CN111880916B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100110083A1 (en) * 2008-11-06 2010-05-06 Via Technologies, Inc. Metaprocessor for GPU Control and Synchronization in a Multiprocessor Environment
CN108520489A (en) * 2018-04-12 2018-09-11 长沙景美集成电路设计有限公司 It is a kind of in GPU to realize that command analysis and vertex obtain parallel device and method
CN109840878A (en) * 2018-12-12 2019-06-04 中国航空工业集团公司西安航空计算技术研究所 It is a kind of based on SystemC towards GPU parameter management method
CN110415161A (en) * 2019-07-19 2019-11-05 龙芯中科技术有限公司 Graphic processing method, device, equipment and storage medium
CN111158875A (en) * 2019-12-25 2020-05-15 眸芯科技(上海)有限公司 Multi-module-based multi-task processing method, device and system
CN111221476A (en) * 2020-01-08 2020-06-02 深圳忆联信息系统有限公司 Front-end command processing method and device for improving SSD performance, computer equipment and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
雷元武; 陈小文; 彭元喜: "An Energy-Efficient FFT Accelerator in a DSP Chip", 计算机研究与发展 (Journal of Computer Research and Development), no. 07
鲍云峰; 曾张帆; 唐文龙; 田茂: "Research on the Sobel Algorithm Based on the OpenCL and FPGA Heterogeneous Mode", 计算机测量与控制 (Computer Measurement & Control), no. 01

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559102A (en) * 2020-12-21 2021-03-26 交控科技股份有限公司 Task operation time sequence display method and device, electronic equipment and storage medium
CN115878521A (en) * 2023-01-17 2023-03-31 北京象帝先计算技术有限公司 Command processing system, electronic device and electronic equipment
CN116188247A (en) * 2023-02-06 2023-05-30 格兰菲智能科技有限公司 Register information processing method, device, computer equipment and storage medium
CN116188247B (en) * 2023-02-06 2024-04-12 格兰菲智能科技有限公司 Register information processing method, device, computer equipment and storage medium
CN116339944A (en) * 2023-03-14 2023-06-27 海光信息技术股份有限公司 Task processing method, chip, multi-chip module, electronic device and storage medium
CN116339944B (en) * 2023-03-14 2024-05-17 海光信息技术股份有限公司 Task processing method, chip, multi-chip module, electronic device and storage medium
CN118035163A (en) * 2024-04-10 2024-05-14 深圳中微电科技有限公司 Method, system and storage medium for processing data in real time by GPU

Also Published As

Publication number Publication date
CN111880916B (en) 2024-08-16

Similar Documents

Publication Publication Date Title
CN111880916B (en) Method, device, terminal, medium and host for processing multiple drawing tasks in GPU
CN112181522B (en) Data processing method and device and electronic equipment
CN104461698A (en) Dynamic virtual disk mounting method, virtual disk management device and distributed storage system
CN113407414A (en) Program operation monitoring method, device, terminal and storage medium
CN112395093A (en) Multithreading data processing method and device, electronic equipment and readable storage medium
CN113127314A (en) Method and device for detecting program performance bottleneck and computer equipment
CN115794317A (en) Processing method, device, equipment and medium based on virtual machine
CN109408208B (en) Multitasking method, device and system of navigation chip and storage medium
CN112988458A (en) Data backup method and device, electronic equipment and storage medium
CN113918233A (en) AI chip control method, electronic equipment and AI chip
CN117896351A (en) Slave address updating method and related device
CN111930651A (en) Instruction execution method, device, equipment and readable storage medium
CN111915475B (en) Processing method of drawing command, GPU, host, terminal and medium
CN116301775A (en) Code generation method, device, equipment and medium based on reset tree prototype graph
CN115358331A (en) Device type identification method and device, computer readable storage medium and terminal
US20150323602A1 (en) Monitoring method, monitoring apparatus, and electronic device
CN109976778B (en) Software updating method and system of vehicle electronic product, upper computer and storage medium
CN112579305A (en) Task processing method and device, nonvolatile storage medium and equipment
CN112817534B (en) Method, device, computer equipment and storage medium for improving SSD read-write performance
CN112580086A (en) Access protection method, device, equipment and storage medium for configuration file
CN117215966B (en) Test method and test device for chip SDK interface and electronic equipment
CN118245231B (en) Distribution method, device, equipment and storage medium of server PCIe (peripheral component interconnect express) resources
CN114297135B (en) Method, device and storage medium for dynamically adjusting allocation of high-speed input/output channels
US9336011B2 (en) Server and booting method
CN114490044A (en) Component sharing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant