CN111880916A - Multi-drawing task processing method, device, terminal, medium and host in GPU - Google Patents
- Publication number
- CN111880916A CN111880916A CN202010730288.3A CN202010730288A CN111880916A CN 111880916 A CN111880916 A CN 111880916A CN 202010730288 A CN202010730288 A CN 202010730288A CN 111880916 A CN111880916 A CN 111880916A
- Authority
- CN
- China
- Prior art keywords
- gpu
- command
- host
- task
- register
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4812—Task transfer initiation or dispatching by interrupt, e.g. masked
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
Abstract
Embodiments of the application provide a method, an apparatus, a terminal, a medium, and a host for processing multiple drawing tasks in a GPU, relating to GPU chip design, and intended to solve the prior-art problem that frequent interaction between the host and the GPU, while the host monitors the GPU's execution state, degrades the execution efficiency of both. The method comprises the following steps: the GPU obtains a first command generated by the host, the first command comprising a register configuration command; the GPU writes the value configured in the first command into a corresponding address of a RAM and stores the address into a register corresponding to the current drawing task; the GPU obtains a parameter configuration command generated by the host for the current drawing task; when the GPU executes the parameter configuration command, it reads a value from the RAM according to the address stored in the register; when the value read from the RAM equals the value configured by the host in the parameter configuration command, the GPU confirms that the current drawing task is completed, generates corresponding interrupt information, and sends it to the host.
Description
Technical Field
The present disclosure relates to a GPU chip design technology, and in particular, to a method, an apparatus, a terminal, a medium, and a host for processing multiple rendering tasks in a GPU.
Background
A GPU (Graphics Processing Unit) is a microprocessor used in terminals such as mobile devices and personal computers to execute drawing tasks according to drawing-task commands sent by a host.
In the related art, the GPU generally employs a time-division multiplexing method when executing a plurality of rendering tasks simultaneously. Specifically, the host sends a batch of commands of the rendering tasks to the GPU, monitors the execution state of the GPU, and sends the next batch of commands of the rendering tasks to the GPU and monitors the execution state of the GPU after monitoring that the GPU has executed the batch of rendering tasks. However, in the process of monitoring the execution state of the GPU by the host, the host needs to send information for acquiring the execution state of the rendering task to the GPU and receive GPU return information at a high frequency, which results in frequent interactions between the host and the GPU.
Disclosure of Invention
Embodiments of the application provide a method, an apparatus, a terminal, a medium, and a host for processing multiple drawing tasks in a GPU, aiming to solve the prior-art problem that frequent interaction between the host and the GPU, while the host monitors the GPU's execution state, degrades the execution efficiency of both.
A first aspect of the embodiments of the present application provides a method for processing multiple rendering tasks in a GPU, including:
the GPU obtains a first command generated by the host for the current drawing task; the first command comprises a register configuration command;
the GPU writes the value configured in the first command into a corresponding address of a Random Access Memory (RAM), and stores the address into a register corresponding to the current drawing task;
the GPU acquires a second command generated by the host for the current drawing task, wherein the second command comprises a parameter configuration command;
when the GPU executes the parameter configuration command, reading a value from the RAM according to the address stored by the register; and when the value read from the RAM is equal to the value configured by the host in the parameter configuration command, the GPU confirms that the current drawing task is completed, generates interrupt information corresponding to the current drawing task and sends the interrupt information to the host.
A second aspect of the present disclosure provides an apparatus for processing multiple rendering tasks in a graphics processor (GPU), including:
a first obtaining module, configured to obtain a first command generated by a host for a current drawing task, the first command comprising a register configuration command;
a first processing module, configured to write the value configured in the first command into a corresponding address of a RAM (random access memory), and store the corresponding address into a register corresponding to the current drawing task;
a second obtaining module, configured to obtain a second command generated by the host for the current drawing task, where the second command includes a parameter configuration command;
a second processing module, configured to read a value from the RAM according to the corresponding address stored in the register when the parameter configuration command is executed; and further configured to confirm, when the value read from the RAM equals the value configured by the host in the parameter configuration command, that the current drawing task is completed, generate interrupt information corresponding to the current drawing task, and send the interrupt information to the host.
A third aspect of the embodiments of the present application provides a terminal, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement a method as claimed in any preceding claim.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium having a computer program stored thereon; the computer program is executed by a processor to implement a method as claimed in any preceding claim.
A fifth aspect of the embodiments of the present application provides a host, comprising a motherboard and a graphics card in communication connection with the motherboard, the graphics card having the apparatus according to any one of the foregoing.
The embodiment of the application provides a method, a device, a terminal, a medium and a host for processing multiple drawing tasks in a GPU (graphics processing Unit), wherein the GPU compares a value in an acquired parameter configuration command with a value pre-written into a RAM (random access memory), and determines that a current drawing task is finished and generates interrupt information to be sent to the host when the value read from the RAM by the GPU is equal to the value in the parameter configuration command according to a comparison result, so that the host is facilitated to determine the execution of a subsequent task. Therefore, the communication times between the host and the GPU are reduced, and the execution efficiency of the host and the GPU is favorably ensured.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flowchart illustrating a method for processing multiple rendering tasks in a GPU according to an exemplary embodiment;
FIG. 2 is a schematic diagram illustrating an application of a method for processing multiple rendering tasks in a GPU according to an exemplary embodiment;
FIG. 3 is a block diagram of a multi-rendering task processing device in a GPU according to an exemplary embodiment.
Detailed Description
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. Evidently, the described embodiments are only some of the embodiments of the present application, not an exhaustive list of all embodiments. It should be noted that, in case of no conflict, the embodiments and the features in the embodiments of the present application may be combined with each other.
In the related art, a GPU (Graphics Processing Unit) generally employs a time-division multiplexing method when simultaneously executing a plurality of rendering tasks. Specifically, the host sends a batch of commands of the rendering tasks to the GPU, monitors the execution state of the GPU, and sends the next batch of commands of the rendering tasks to the GPU and monitors the execution state of the GPU after monitoring that the GPU has executed the batch of rendering tasks. However, in the process of monitoring the execution state of the GPU by the host, the host needs to send information for acquiring the execution state of the rendering task to the GPU and receive GPU return information at a high frequency, which causes frequent interaction between the host and the GPU and affects the execution efficiency of the host and the GPU.
In order to overcome the above problems, embodiments of the present application provide a method, an apparatus, a terminal, a medium, and a host for processing multiple rendering tasks in a GPU, where the GPU compares a value in an acquired parameter configuration command with a value pre-written into a RAM, and determines that a current rendering task is completed and generates interrupt information to be sent to the host when the value read from the RAM by the GPU is equal to the value in the parameter configuration command according to a comparison result, thereby facilitating the host to determine execution of a subsequent task. Therefore, the communication times between the host and the GPU are reduced, and the execution efficiency of the host and the GPU is favorably ensured.
Fig. 1 is a flowchart illustrating a method for processing multiple rendering tasks in a GPU according to an exemplary embodiment.
The following describes the functions and implementation processes of the method provided by this embodiment with reference to fig. 1.
The method for processing multiple rendering tasks in the GPU provided by this embodiment includes:
s101, a GPU acquires a first command generated by a host for a current drawing task; the first command comprises a register configuration command;
s102, the GPU writes the value configured in the first command into a corresponding address of the RAM, and stores the corresponding address into a register corresponding to the current drawing task;
s103, the GPU acquires a second command generated by the host for the current drawing task, wherein the second command comprises a parameter configuration command;
s104, when the GPU executes the parameter configuration command, reading a value from the RAM according to a corresponding address stored in the register; and when the value read from the RAM is equal to the value configured by the host in the parameter configuration command, the GPU confirms that the current drawing task is completed, generates interrupt information corresponding to the current drawing task and sends the interrupt information to the host.
In this embodiment, the host can configure one register for each rendering task. The configuration of the register is written by the host and read by the GPU. When the GPU has multiple registers and the GPU allows multiple rendering tasks to be performed simultaneously, the host can configure one register for each rendering task, respectively. When the GPU allows only one rendering task to be performed, the host can configure one register for the rendering task.
In step S101, when the GPU allows only one rendering task to be executed, the host can generate a corresponding first command for that rendering task and write it into the GPU. When the GPU allows a plurality of drawing tasks to be executed simultaneously, the host can generate a first command for each drawing task and write each into the GPU. The first command may include a management-related parameter configuration command, such as a register configuration command, and configures the value of the register corresponding to the drawing task. When the GPU allows a plurality of rendering tasks to be executed simultaneously, the values of the registers corresponding to different rendering tasks are different.
Illustratively, when the GPU allows N rendering tasks to be executed simultaneously, the GPU sets N registers, the register for the nth rendering task being denoted R[n], where N is an integer greater than or equal to 1 and 1 ≤ n ≤ N. For example, when N = 1, the GPU sets one register, R[1]; when N = 2, the GPU sets two registers, R[1] and R[2], whose values are different.
In particular implementations, the GPU may obtain the first command through a variety of ways, that is, the host may write the first command to the GPU through a variety of ways.
In step S102, the GPU writes the value configured for the corresponding register in the acquired first command into a corresponding address of a Random Access Memory (RAM). When there are multiple first commands, the RAM has multiple addresses, one corresponding to each first command. In a specific implementation, the GPU may write the value configured for the corresponding register in the first command into an address of the RAM and then store that address in the register; alternatively, the GPU may allocate a RAM address for the value of the corresponding register in advance, store the address in the register, and write the value of the register into the RAM according to the address stored in the register.
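The two address-binding strategies just described can be contrasted in a short sketch. Everything here is assumed for illustration: the bump allocator, the dictionaries, and the function names do not come from the patent.

```python
ram = [0] * 8        # shared RAM, initialized to 0
registers = {}       # task id -> RAM address bound to that task's register
next_free = [1]      # naive bump allocator over RAM addresses (illustrative)

def strategy_write_then_store(task_id, value):
    # Strategy 1: write the configured value into a RAM address first,
    # then record that address in the task's register.
    addr = next_free[0]; next_free[0] += 1
    ram[addr] = value
    registers[task_id] = addr

def strategy_preallocate(task_id, value):
    # Strategy 2: allocate the RAM address in advance, store it in the
    # register, then write the value through the stored address.
    addr = next_free[0]; next_free[0] += 1
    registers[task_id] = addr
    ram[registers[task_id]] = value

strategy_write_then_store(0, 7)
strategy_preallocate(1, 9)
assert ram[registers[0]] == 7 and ram[registers[1]] == 9
```

Either way the invariant that step S104 relies on holds: after the first command is processed, reading the RAM through the task's register yields the configured value.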
Optionally, before the GPU writes the values configured in the first command into the corresponding addresses of the RAM, initializing the RAM, and setting the content of the address for storing the values of the registers in the RAM to 0; illustratively, the contents of all addresses of the RAM are set to 0. The process of initializing the RAM may use the existing technology in the field, and the embodiment is not limited in this respect.
In step S103, when the GPU allows only one rendering task to be executed, the host can generate a corresponding second command for the rendering task and write the second command into the GPU. When the GPU allows a plurality of drawing tasks to be executed simultaneously, the host can respectively generate a second command for each drawing task and write the second command into the GPU. Wherein the second command comprises a parameter configuration command; the parameter configuration command has written therein a value configured for the corresponding register. It should be noted that the execution sequence of step S103, step S101, and step S102 is not specifically limited here.
In step S104, when the GPU executes the parameter configuration command, reading a value from the RAM according to a corresponding address stored in the register; and when the value read from the RAM is not equal to the value configured for the corresponding register by the host in the parameter configuration command, the GPU confirms that the current task is not completed, and the GPU temporarily does not send information related to the execution state of the drawing task to the host. And when the value read from the RAM is equal to the value configured for the corresponding register in the parameter configuration command by the host, the GPU confirms that the current drawing task is completed.
Illustratively, the address in the RAM corresponding to register R[N] is AddrN; that is, AddrN is the address in the RAM where the value of register R[N] is stored. When the GPU executes the parameter configuration command, it compares the value configured for the corresponding register in the parameter configuration command with the value it reads from address AddrN of the RAM according to the address stored in register R[N]; when the two values are equal, the GPU confirms that the current drawing task is completed.
When the GPU confirms that the current drawing task is completed, it triggers the parameter analysis unit to generate an interrupt, producing interrupt information corresponding to the current drawing task and sending it to the host. The interrupt information may carry the value of the corresponding register or information such as an identifier of the corresponding rendering task, so that the host can quickly determine the currently interrupted rendering task from the interrupt information.
After receiving the interrupt information generated by the GPU, the host determines the currently interrupted drawing task according to the interrupt information, determines that the drawing task is completed, and executes the subsequent tasks. When the GPU allows a plurality of rendering tasks to be executed simultaneously, the GPU may send the interrupt information of each rendering task to the host, that is, the GPU may generate corresponding interrupt information and send the interrupt information to the host when it is determined that one of the rendering tasks is completed. The host may send the related commands of the next drawing task to the GPU after receiving the interrupt information of one of the drawing tasks, or the host may send the related commands of the next drawing task to the GPU after receiving the interrupt information of each of the drawing tasks of the batch.
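The host-side batching choice described above can be sketched as follows. The interrupt delivery mechanism is assumed here as a simple queue of task identifiers; the function name and queue are illustrative, not part of the patent.

```python
from collections import deque

def host_wait_for_batch(interrupt_queue, batch_tasks):
    """Sketch: block until every task in the current batch has raised its
    completion interrupt, then allow the next batch to be dispatched."""
    pending = set(batch_tasks)
    while pending:
        task_id = interrupt_queue.popleft()  # receive one interrupt from the GPU
        pending.discard(task_id)             # mark that drawing task as done
    return True  # safe to send the next batch of drawing commands

# Interrupts may arrive in any order across concurrently executed tasks.
q = deque([0, 2, 1])
assert host_wait_for_batch(q, batch_tasks=[0, 1, 2])
```

The alternative policy in the paragraph above, dispatching the next task as soon as any single interrupt arrives, would simply call the same receive step once instead of draining the whole batch.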
According to the multi-drawing task processing method in the GPU, after the host sends the command of the drawing task to the GPU, the GPU monitors the execution condition of the drawing task, generates the interrupt information when the drawing task is confirmed to be completed, and sends the state of completion of the execution task to the host, so that the host can determine the execution of the subsequent task. Therefore, the communication frequency between the host and the GPU is reduced, and the execution efficiency of the host and the GPU is ensured.
In one possible implementation, the GPU is able to obtain a first command that the host writes through the register configuration bus. The GPU can also acquire a command word generated by the host for the current drawing task, wherein the command word corresponds to a second command; the second commands include drawing commands and parameter configuration commands.
The GPU acquiring a second command corresponding to a command word generated by the host for the current rendering task may specifically include: the GPU obtains command words generated by the host for the current drawing task; and the GPU performs command analysis on the obtained command words, and separates the parameter configuration command from the drawing command to obtain the parameter configuration command and the drawing command. And after the GPU performs command analysis on the obtained command words, the GPU sends drawing commands to the functional units.
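The command analysis step, separating parameter configuration commands from drawing commands within one command word, might look like the following. The `(kind, payload)` encoding of a command word is an assumption made for illustration; the patent does not specify a wire format.

```python
def parse_command_words(words):
    """Sketch of GPU command analysis: split a command word's entries into
    drawing commands (forwarded to the functional units) and parameter
    configuration commands (kept for the completion check)."""
    draw_cmds, param_cmds = [], []
    for kind, payload in words:   # hypothetical (kind, payload) encoding
        if kind == "param":
            param_cmds.append(payload)
        else:
            draw_cmds.append(payload)
    return draw_cmds, param_cmds

# The host inserts a parameter configuration command among the drawing commands.
draws, params = parse_command_words([("draw", "triangle"), ("param", 7), ("draw", "quad")])
assert draws == ["triangle", "quad"]
assert params == [7]
```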
In specific implementation, the host writes a parameter configuration command in a command word generated for the current rendering task. That is, the host inserts parameter configuration commands in the drawing commands generated for the current drawing task. After the host writes the command words into the GPU, the GPU performs command analysis on the drawing commands inserted with the parameter configuration commands, and separates the parameter configuration commands from the drawing commands; the GPU carries out parameter analysis on the parameter configuration command to obtain task management related parameters and drawing related parameters; the task management related parameters comprise values configured for the register by the host; and sending the related drawing parameters to the functional unit.
And after the GPU analyzes the obtained command words of the current drawing task, a command sequence of the current drawing task can be obtained, and the GPU executes corresponding commands according to the sequence in the command sequence. Taking the example that the GPU executes one of the rendering tasks: the GPU executes corresponding commands according to the command sequence of the drawing task; when the GPU executes the parameter configuration command, the GPU compares the value configured for the register in the parameter configuration command with the value read from the RAM by the GPU according to the address stored by the register corresponding to the drawing task, when the value configured for the register in the parameter configuration command is equal to the value read from the RAM by the GPU according to the address stored by the register corresponding to the drawing task, the drawing task is confirmed to be completed, the parameter analysis unit is interrupted, and interrupt information is generated and sent to the host. And the host receives the interrupt information, inquires the interrupt register according to the interrupt information to obtain the current drawing task generating the interrupt, and further determines the execution of the subsequent drawing task.
For example, as shown in FIG. 2, assume that the GPU allows n rendering tasks to be executed simultaneously. The host sets registers R[1], R[2], …, R[n] for the n drawing tasks task 0, task 1, …, task n-1, and configures a corresponding value for each of R[1], R[2], …, R[n]. Each register is written by the host and read by the GPU.
The host generates a first command for each of the n drawing tasks task 0, task 1, …, task n-1, and writes the first command of each drawing task into the GPU through the register configuration bus. The first command may be a configuration command for task-management-related parameters, including a register configuration command; the task-management-related parameters may include the value configured for the corresponding register.
The host generates a command word for each of the n drawing tasks task 0, task 1, …, task n-1; a parameter configuration command is written into each command word, which also includes drawing commands. The host writes the command word corresponding to each drawing task into the GPU.
After the GPU acquires the first command of each drawing task through the register configuration bus, it writes the value configured for the corresponding register in each first command into the corresponding address of the RAM. Illustratively, the register corresponding to drawing task 0 is R[1], and the value of register R[1] is stored at address Addr1 of the RAM; the register corresponding to drawing task 1 is R[2], and the value of register R[2] is stored at address Addr2; …; the register corresponding to task n-1 is R[n], and the value of register R[n] is stored at address Addrn of the RAM.
After the GPU obtains the command words, it parses them to obtain the separated drawing commands and parameter configuration commands, and parses each parameter configuration command to obtain the value configured for the corresponding register and the drawing-related parameters. Taking the GPU executing rendering task n-1 as an example: the GPU executes commands in the order of the command sequence of task n-1. When the GPU executes the parameter configuration command, it compares the value configured for the corresponding register in the parameter configuration command with the value it reads from the RAM according to the address stored in register R[n]. When the two values are not equal, the parameter analysis unit does not generate an interrupt; when they are equal, the GPU confirms that task n-1 is completed, and the parameter analysis unit generates interrupt information and sends it to the host. On receiving the interrupt information, the host queries the interrupt register to determine which drawing task generated the interrupt, and then determines the execution of subsequent drawing tasks.
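The FIG. 2 walkthrough, n concurrent tasks, each with its own register and RAM address, can be condensed into an end-to-end sketch. The address scheme (task k bound to RAM address k+1) and the chosen values are assumptions for illustration only.

```python
n = 3                                              # number of concurrent drawing tasks
ram = [0] * (n + 1)                                # RAM initialized to 0
reg_addr = {task: task + 1 for task in range(n)}   # task k -> Addr(k+1), illustrative
reg_value = {task: 10 + task for task in range(n)} # value the host configures per task

# First commands: the GPU stores each task's configured value at its RAM address.
for task in range(n):
    ram[reg_addr[task]] = reg_value[task]

# Parameter configuration commands: for each task, compare the command's value
# against the value read back through the task's register; equal means done.
interrupts = [task for task in range(n)
              if ram[reg_addr[task]] == reg_value[task]]
assert interrupts == [0, 1, 2]   # every task raises exactly one interrupt
```

One interrupt per task is the only GPU-to-host traffic in this scheme, which is the source of the communication savings claimed in the next paragraph.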
In the process, the GPU only needs to generate interrupt information and send the interrupt information to the host when monitoring that the corresponding drawing task is completed; therefore, the GPU does not need to frequently interact with the host, so that the communication frequency between the host and the GPU is reduced, the parallel operation of the host and the GPU is facilitated, the execution efficiency of the host and the GPU is facilitated to be ensured, the frequency of system switching tasks is facilitated to be reduced, and the execution efficiency of the system is facilitated to be ensured.
The present embodiment provides a multi-rendering task processing apparatus in a GPU, which can be used as an execution subject of the aforementioned method embodiments. The function and implementation process of the device may be the same as or similar to those of the previous embodiments, and the description of this embodiment is omitted.
FIG. 3 is a block diagram of a multi-rendering task processing device in a GPU according to an exemplary embodiment.
As shown in fig. 3, the multi-rendering task processing apparatus in the GPU of the present embodiment includes:
a first obtaining module 31, configured to obtain a first command generated by a host for a current drawing task;
the first processing module 32 is configured to write the value configured in the first command into a corresponding address of the RAM, and store the corresponding address into a register corresponding to the current drawing task;
a second obtaining module 33, configured to obtain a second command generated by the host for the current drawing task, where the second command includes a parameter configuration command;
the second processing module 34 is used for reading values from the RAM according to corresponding addresses stored in the register when the parameter configuration command is executed; and the device is also used for confirming that the current drawing task is completed when the value read from the RAM is equal to the value configured in the parameter configuration command by the host, generating interrupt information corresponding to the current drawing task and sending the interrupt information to the host.
In one possible implementation manner, the first obtaining module 31 is specifically configured to:
a first command written by the host through the register configuration bus is obtained.
In one possible implementation manner, the second obtaining module 33 is specifically configured to:
acquire command words generated by the host for the current drawing task; and
perform command analysis on the obtained command words to separate the parameter configuration command from the drawing command.
The second obtaining module 33 is further configured to: and sending the drawing command obtained by analyzing the command of the acquired command word to the functional unit.
In one possible implementation manner, the first obtaining module 31 is specifically configured to:
when the GPU allows a plurality of drawing tasks to be executed simultaneously, acquiring first commands respectively generated by the host for the plurality of current drawing tasks.
In one possible implementation manner, the second obtaining module 33 is specifically configured to:
when the GPU allows multiple drawing tasks to execute simultaneously, acquire the second commands generated by the host for each of the current drawing tasks.
In one possible implementation manner, the first processing module 32 is further configured to: before writing the value configured in the first command into the corresponding address of the RAM, initialize the RAM and set the content of the corresponding address of the RAM to 0.
In the apparatus provided by this embodiment, after the host sends the commands of a rendering task to the GPU, the GPU monitors the execution state of that task through the value held in the register configured for it, and generates interrupt information for the host once it determines the task is complete, which helps the host schedule subsequent tasks. This reduces the communication frequency between the host and the GPU, helps preserve the execution efficiency of both, and reduces how often the system switches tasks, thereby helping to maintain overall system efficiency.
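The host's side of this interaction amounts to assembling one command stream per task and then continuing with other work until the interrupt arrives. The sketch below models only the ordering of that stream; the opcode names and stream structure are illustrative assumptions:

```c
enum op { OP_REG_CONFIG, OP_DRAW, OP_PARAM_CONFIG };

/* Hypothetical command stream the host assembles for one drawing task. */
typedef struct { enum op ops[16]; int n; } cmd_stream_t;

static void emit(cmd_stream_t *s, enum op o) { s->ops[s->n++] = o; }

/* Host-side sequence: the register configuration (first command) comes
 * first, then the drawing commands, then the parameter configuration
 * command (second command) whose execution triggers the GPU-side
 * completion check and the interrupt back to the host. */
void host_build_stream(cmd_stream_t *s, int num_draws)
{
    s->n = 0;
    emit(s, OP_REG_CONFIG);
    for (int i = 0; i < num_draws; i++)
        emit(s, OP_DRAW);
    emit(s, OP_PARAM_CONFIG);
}
```

Since completion is signalled by interrupt rather than by the host polling the GPU, the host does not need to query task state between submissions, which is the communication saving the paragraph above describes.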
The present embodiment further provides a terminal, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement a method as in any of the preceding examples.
The memory stores a computer program; after receiving an execution instruction, the processor executes that program. The method performed by the apparatus, as defined by the flows disclosed in the foregoing embodiments, may be applied to or implemented by the processor.
The memory may comprise Random Access Memory (RAM) and may also include non-volatile memory, such as at least one disk storage device. Through at least one communication interface (wired or wireless), the memory enables a communication connection between the system network element and at least one other network element, over the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
The processor may be an integrated circuit chip with signal processing capability. In implementation, the steps of the method disclosed in the first embodiment may be completed by hardware integrated logic circuits in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. Such a processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or any conventional processor.
The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software elements within the decoding processor. The software elements may reside in RAM, flash memory, ROM, PROM, or EPROM, in registers, or in other storage media well known in the art. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
The present embodiment also provides a computer-readable storage medium having a computer program stored thereon; the computer program is executed by a processor to implement a method as in any of the preceding examples.
The present embodiment further provides a host, including a motherboard and a graphics card in communication with the motherboard, the graphics card carrying the apparatus of any of the above examples. The motherboard transmits signals to the graphics card and transmits the output signals of the graphics card to other components.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (11)
1. A method for processing multiple rendering tasks in a Graphics Processing Unit (GPU), comprising:
the GPU obtains a first command generated by the host for the current drawing task; the first command comprises a register configuration command;
the GPU writes the value configured in the first command into a corresponding address of a Random Access Memory (RAM), and stores the address into a register corresponding to the current drawing task;
the GPU acquires a second command generated by the host for the current drawing task, wherein the second command comprises a parameter configuration command;
when the GPU executes the parameter configuration command, the GPU reads a value from the RAM according to the address stored in the register; and when the value read from the RAM is equal to the value configured by the host in the parameter configuration command, the GPU confirms that the current drawing task is completed, generates interrupt information corresponding to the current drawing task, and sends the interrupt information to the host.
2. The method of claim 1, wherein the GPU obtaining the first command generated by the host for the current rendering task comprises:
the GPU obtains a first command written by the host through the register configuration bus.
3. The method of claim 1, wherein the GPU obtaining the second command generated by the host for the current rendering task comprises:
the GPU obtains command words generated by the host for the current drawing task;
and the GPU performs command analysis on the obtained command words and separates the parameter configuration command from the drawing command.
4. The method of claim 3, further comprising: after the GPU performs command analysis on the obtained command words, sending the drawing command to a functional unit.
5. The method of claim 1, wherein the GPU obtaining the first command generated by the host for the current rendering task comprises:
when the GPU allows a plurality of drawing tasks to be executed simultaneously, the GPU acquires first commands respectively generated by the host computer for the plurality of current drawing tasks.
6. The method of claim 1, wherein the GPU obtaining the second command generated by the host for the current rendering task comprises:
and when the GPU allows the simultaneous execution of a plurality of drawing tasks, the GPU acquires second commands respectively generated by the host computer for the plurality of current drawing tasks.
7. The method of claim 1, further comprising, before the GPU writes the values configured in the first command to respective addresses of RAM:
initializing the RAM, and setting the content of the corresponding address of the RAM to be 0.
8. A device for processing multiple rendering tasks in a Graphics Processing Unit (GPU), comprising:
the system comprises a first acquisition module, a second acquisition module and a processing module, wherein the first acquisition module is used for acquiring a first command generated by a host for a current drawing task, and the first command comprises a register configuration command;
the first processing module is used for writing the value configured in the first command into a corresponding address of an RAM (random access memory), and storing the corresponding address into a register corresponding to the current drawing task;
a second obtaining module, configured to obtain a second command generated by the host for the current drawing task, where the second command includes a parameter configuration command;
the second processing module, configured to read the value from the RAM according to the corresponding address stored in the register when the parameter configuration command is executed; and further configured to confirm that the current drawing task is complete when the value read from the RAM equals the value configured by the host in the parameter configuration command, generate interrupt information corresponding to the current drawing task, and send the interrupt information to the host.
9. A terminal, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program; the computer program is executed by a processor to implement the method of any one of claims 1-7.
11. A host computer comprising a motherboard and a graphics card communicatively coupled to the motherboard, the graphics card having the apparatus of claim 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010730288.3A CN111880916B (en) | 2020-07-27 | 2020-07-27 | Method, device, terminal, medium and host for processing multiple drawing tasks in GPU |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111880916A true CN111880916A (en) | 2020-11-03 |
CN111880916B CN111880916B (en) | 2024-08-16 |
Family
ID=73200655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010730288.3A Active CN111880916B (en) | 2020-07-27 | 2020-07-27 | Method, device, terminal, medium and host for processing multiple drawing tasks in GPU |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111880916B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112559102A (en) * | 2020-12-21 | 2021-03-26 | 交控科技股份有限公司 | Task operation time sequence display method and device, electronic equipment and storage medium |
CN115878521A (en) * | 2023-01-17 | 2023-03-31 | 北京象帝先计算技术有限公司 | Command processing system, electronic device and electronic equipment |
CN116188247A (en) * | 2023-02-06 | 2023-05-30 | 格兰菲智能科技有限公司 | Register information processing method, device, computer equipment and storage medium |
CN116339944A (en) * | 2023-03-14 | 2023-06-27 | 海光信息技术股份有限公司 | Task processing method, chip, multi-chip module, electronic device and storage medium |
CN118035163A (en) * | 2024-04-10 | 2024-05-14 | 深圳中微电科技有限公司 | Method, system and storage medium for processing data in real time by GPU |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100110083A1 (en) * | 2008-11-06 | 2010-05-06 | Via Technologies, Inc. | Metaprocessor for GPU Control and Synchronization in a Multiprocessor Environment |
CN108520489A (en) * | 2018-04-12 | 2018-09-11 | 长沙景美集成电路设计有限公司 | It is a kind of in GPU to realize that command analysis and vertex obtain parallel device and method |
CN109840878A (en) * | 2018-12-12 | 2019-06-04 | 中国航空工业集团公司西安航空计算技术研究所 | It is a kind of based on SystemC towards GPU parameter management method |
CN110415161A (en) * | 2019-07-19 | 2019-11-05 | 龙芯中科技术有限公司 | Graphic processing method, device, equipment and storage medium |
CN111158875A (en) * | 2019-12-25 | 2020-05-15 | 眸芯科技(上海)有限公司 | Multi-module-based multi-task processing method, device and system |
CN111221476A (en) * | 2020-01-08 | 2020-06-02 | 深圳忆联信息系统有限公司 | Front-end command processing method and device for improving SSD performance, computer equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
雷元武; 陈小文; 彭元喜: "Energy-Efficient FFT Accelerator in DSP Chips", Journal of Computer Research and Development, no. 07 *
鲍云峰; 曾张帆; 唐文龙; 田茂: "Research on the Sobel Algorithm Based on an OpenCL and FPGA Heterogeneous Mode", Computer Measurement & Control, no. 01 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||