CN116521606B - Task processing method, device, computing equipment and storage medium - Google Patents


Info

Publication number: CN116521606B
Application number: CN202310762527.7A
Authority: CN (China)
Prior art keywords: task, coprocessor, queue, description information, tail
Legal status: Active (assumed status; not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN116521606A
Inventors: 周伟, 李宇轩, 冯健德, 彭文涛, 孙子坤
Current assignee: Taichu Wuxi Electronic Technology Co ltd (listed assignee may be inaccurate)
Application filed by Taichu Wuxi Electronic Technology Co ltd
Priority to CN202310762527.7A
Publication of CN116521606A; application granted and published as CN116521606B

Classifications

    • G06F15/163 Interprocessor communication (under G06F15/16, combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register)
    • G06F9/546 Message passing systems or structures, e.g. queues (under G06F9/54 Interprogram communication)
    • G06F2209/548 Queue (indexing scheme relating to G06F9/54)
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the invention disclose a task processing method, a task processing device, a computing device, and a storage medium. The method comprises the following steps: acquiring a coprocessor task and updating the head of a task issuing queue according to the coprocessor task, wherein the task issuing queue is used for storing task description information corresponding to coprocessor tasks and its tail is updated by the coprocessor; determining, according to a task completion queue, a target coprocessor task whose execution has completed, wherein the task completion queue is used for storing task end description information corresponding to coprocessor tasks and its head is updated by the coprocessor; and updating the tail of the task completion queue according to the target coprocessor task and recycling the target coprocessor task. Embodiments of the invention enable concurrent execution of the main processor and the coprocessor, greatly improving the utilization of both processors and thus the performance of the device side.

Description

Task processing method, device, computing equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a task processing method, a task processing device, a computing device, and a storage medium.
Background
To achieve both high performance and low power consumption, heterogeneous many-core systems adopt an architecture that pairs a main processor with a coprocessor. The main processor handles complex logic-control tasks, while the coprocessor handles large-scale data-parallel tasks with high computational density and simple logic branches.
The execution flow of a traditional heterogeneous many-core system comprises the following steps: 1) the host side copies the coprocessor task to the device side; 2) the main processor on the device side issues the coprocessor tasks in sequence to complete the data computation; 3) the coprocessor on the device side returns the computation result to the main processor, which copies it back to the host side.
In this conventional scheme, however, the coprocessor is idle while the main processor issues tasks to it, and the main processor is idle while the coprocessor computes, so a large share of the device side's performance is wasted.
Disclosure of Invention
The invention provides a task processing method, a task processing device, a computing device, and a storage medium that realize concurrent execution of the main processor and the coprocessor and can improve device-side performance.
According to an aspect of the present invention, there is provided a task processing method, applied to a main processor, comprising:
acquiring a coprocessor task, and updating the head of a task issuing queue according to the coprocessor task, wherein the task issuing queue is used for storing task description information corresponding to the coprocessor task, and the tail of the task issuing queue is updated by a coprocessor;
determining, according to a task completion queue, a target coprocessor task whose execution has completed, wherein the task completion queue is used for storing task end description information corresponding to the coprocessor task, and the head of the task completion queue is updated by the coprocessor;
and updating the tail of the task completion queue according to the target coprocessor task so as to recycle the target coprocessor task.
According to another aspect of the present invention, there is provided another task processing method, applied to a coprocessor, comprising:
in the case that task issuing is detected, reading target task description information at the tail of a task issuing queue, and updating the tail of the task issuing queue, wherein the task issuing queue is used for storing task description information corresponding to a coprocessor task, and the head of the task issuing queue is updated by a main processor based on the coprocessor task;
determining target code according to the target task description information, and executing the coprocessor task by running the target code;
and, for a target coprocessor task whose execution has completed, updating the head of a task completion queue according to task end description information corresponding to the target coprocessor task, the tail of the task completion queue being updated by the main processor according to the target coprocessor task so as to recycle the target coprocessor task.
According to another aspect of the present invention, there is provided a task processing device comprising:
a task issuing queue updating module, used for acquiring a coprocessor task and updating the head of a task issuing queue according to the coprocessor task, wherein the task issuing queue is used for storing task description information corresponding to the coprocessor task, and the tail of the task issuing queue is updated by the coprocessor;
a task determining module, used for determining, according to a task completion queue, a target coprocessor task whose execution has completed, wherein the task completion queue is used for storing task end description information corresponding to the coprocessor task, and the head of the task completion queue is updated by the coprocessor;
and a task recycling module, used for updating the tail of the task completion queue according to the target coprocessor task so as to recycle the target coprocessor task.
According to another aspect of the present invention, there is provided another task processing device comprising:
an information reading module, used for reading target task description information at the tail of a task issuing queue and updating the tail of the task issuing queue in the case that task issuing is detected, wherein the task issuing queue is used for storing task description information corresponding to a coprocessor task, and the head of the task issuing queue is updated by the main processor based on the coprocessor task;
a code execution module, used for determining target code according to the target task description information and executing the coprocessor task by running the target code;
and a task completion queue updating module, used for, for a target coprocessor task whose execution has completed, updating the head of a task completion queue according to task end description information corresponding to the target coprocessor task, the tail of the task completion queue being updated by the main processor according to the target coprocessor task so as to recycle the target coprocessor task.
According to another aspect of the present invention, there is provided a computing device comprising:
at least one main processor;
at least one coprocessor; and a memory communicatively coupled to the at least one host processor and/or the at least one co-processor, the memory storing computer program instructions; wherein the computer program instructions, when executed by the at least one host processor, enable the at least one host processor to perform the task processing method according to any one of the embodiments of the present invention;
the computer program instructions, when executed by the at least one coprocessor, enable the at least one coprocessor to perform the task processing method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a task processing method according to any one of the embodiments of the present invention.
An embodiment of the invention provides a task processing method in which a task issuing queue and a task completion queue are shared between the main processor and the coprocessor. The task issuing queue stores task description information corresponding to coprocessor tasks, and the task completion queue stores task end description information corresponding to coprocessor tasks; the main processor updates the head of the task issuing queue and the tail of the task completion queue, while the coprocessor updates the tail of the task issuing queue and the head of the task completion queue. Through these two queues, the main processor can issue tasks to the coprocessor while tasks are executing, and the coprocessor can report task completion while the main processor is issuing tasks, so both processors are always working. This greatly improves the utilization of the main processor and the coprocessor, and thus the performance of the device side.
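The head/tail ownership discipline summarized above can be sketched as a pair of ring buffers; the following is a minimal Python model whose class and field names are illustrative assumptions of this sketch, not taken from the patent:

```python
class RingQueue:
    """Fixed-capacity ring buffer addressed by monotonically increasing
    head/tail indices. One side owns the head (producer), the other
    side owns the tail (consumer), so each index has a single writer."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.head = 0   # next slot to write, advanced by the producer
        self.tail = 0   # next slot to read, advanced by the consumer

    def push(self, item):          # producer side
        self.slots[self.head % self.capacity] = item
        self.head += 1

    def pop(self):                 # consumer side
        item = self.slots[self.tail % self.capacity]
        self.tail += 1
        return item

    def pending(self):             # unequal positions mean queued work
        return self.head != self.tail


# Two shared queues with opposite ownership:
issue_queue = RingQueue(8)       # main processor pushes, coprocessor pops
completion_queue = RingQueue(8)  # coprocessor pushes, main processor pops

# Main processor issues two tasks.
issue_queue.push({"task_id": 1})
issue_queue.push({"task_id": 2})

# Coprocessor drains the issue queue and reports each completion.
while issue_queue.pending():
    task = issue_queue.pop()
    completion_queue.push({"task_id": task["task_id"]})

# Main processor reclaims finished tasks from the completion queue.
finished = []
while completion_queue.pending():
    finished.append(completion_queue.pop()["task_id"])
```

In a real system the two loops run concurrently on the two processors; because each index has exactly one writer, neither side blocks the other.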
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for describing the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention; a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a task processing flow provided in the prior art.
Fig. 2 is a flowchart of a task processing method according to an embodiment of the present invention.
Fig. 3 is a flowchart of another task processing method according to an embodiment of the present invention.
Fig. 4 is a flowchart of another task processing method according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of a task issue queue according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of a task completion queue according to an embodiment of the present invention.
Fig. 7 is a schematic diagram of a task processing flow according to an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a task processing device according to an embodiment of the present invention.
Fig. 9 is a schematic structural diagram of another task processing device according to an embodiment of the present invention.
FIG. 10 illustrates a schematic diagram of a computing device that may be used to implement an embodiment of the invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, system, article, or apparatus.
Fig. 1 is a schematic diagram of a task processing flow in the prior art. As shown in Fig. 1, a heterogeneous many-core architecture implements a work queue between the host side and the device side, so that task delivery on the host side and task execution on the device side can proceed asynchronously. Specifically, the host CPU 110 continuously submits tasks to the task queue 120; the main processor 130 on the device side fetches tasks from the task queue 120 and submits them in sequence to the coprocessor 140 on the device side; the coprocessor 140 executes the tasks and returns the computation results to the main processor.
In the above flow, however, the coprocessor is idle while the main processor issues tasks to it, and the main processor is idle while the coprocessor computes, which seriously wastes computing power.
To solve this technical problem, the invention provides a task processing method that realizes concurrent execution between the main processor and the coprocessor and can greatly improve device-side performance.
Fig. 2 is a flowchart of a task processing method according to an embodiment of the present invention. The method may be performed by a task processing device, which may be implemented in hardware and/or software and configured in a main processor; it applies to the case where the main processor and the coprocessor cooperatively perform computation. As shown in Fig. 2, the method includes:
S210, acquiring a coprocessor task, and updating the head of a task issuing queue according to the coprocessor task.
A coprocessor task is a task to be performed by the coprocessor. Under a heterogeneous many-core architecture, the coprocessor is responsible for large-scale data-parallel tasks with high computational density and simple logic branches; such tasks are coprocessor tasks.
The task issuing queue is used for storing task description information corresponding to coprocessor tasks; its tail is updated by the coprocessor, and its head is updated by the main processor. The task description information describes the address information related to task execution: based on it, the coprocessor can obtain the association between the memory mapping and the instruction-set code, and thereby fetch the instruction-set code through the memory mapping. For example, the task description information may include the task ID, the instruction address, and the parameter address.
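One entry of task description information, carrying the fields named above, can be sketched as follows; the field names and example addresses are assumptions of this sketch, since the patent does not fix a concrete layout:

```python
from dataclasses import dataclass


@dataclass
class TaskDescriptor:
    """Hypothetical layout of one task-issuing-queue entry: the task
    ID identifies the task, the instruction address locates the kernel
    code, and the parameter address locates the kernel's arguments."""
    task_id: int
    instruction_address: int
    parameter_address: int


desc = TaskDescriptor(task_id=7,
                      instruction_address=0x4000,
                      parameter_address=0x8000)
```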
In some embodiments, acquiring the coprocessor task and updating the head of the task issuing queue according to the coprocessor task includes: the CPU on the host side continuously submits tasks to the device side; the main processor on the device side acquires a coprocessor task and parses it to obtain alternative description information; task description information corresponding to the coprocessor task is generated from the target information in the alternative description information; and the task description information is added to the head of the task issuing queue. Because task description information is added at the head, the positions of the head and the tail of the task issuing queue become unequal.
Specifically, the main processor parses the coprocessor task to obtain alternative description information and extracts the target information from it, the target information including the task ID, the instruction address, the parameter address, and the like. The target information serves as the task description information corresponding to the coprocessor task and is added to the head of the task issuing queue. The coprocessor then acquires the task description information from the task issuing queue, jumps to the target code corresponding to the kernel function according to the instruction address, parameter address, and so on in the task description information, and runs that code to execute the coprocessor task.
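The parse-and-enqueue step on the main-processor side can be sketched as follows; the function names, field names, and the extra "priority" field are illustrative assumptions, not details from the patent:

```python
def parse_coprocessor_task(raw_task):
    """Sketch of the parse step: extract the target information (task
    ID, instruction address, parameter address) from the alternative
    description information and discard the rest."""
    return {k: raw_task[k] for k in ("task_id", "instr_addr", "param_addr")}


def enqueue_at_head(queue, head, descriptor):
    """Main processor writes the descriptor into the head slot and
    advances the head; the tail is left to the coprocessor."""
    queue[head % len(queue)] = descriptor
    return head + 1


issue_queue = [None] * 4
head = 0
raw = {"task_id": 3, "instr_addr": 0x1000, "param_addr": 0x2000,
       "priority": 0}  # "priority" stands in for fields not carried over
head = enqueue_at_head(issue_queue, head, parse_coprocessor_task(raw))
```

After the call, the head has moved past the new entry, so the head and tail positions differ and the coprocessor can detect the issued task.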
S220, determining to execute the completed target coprocessor task according to the task completion queue.
The task completion queue is used for storing task end description information corresponding to coprocessor tasks; its head is updated by the coprocessor, and its tail is updated by the main processor based on the task end description information corresponding to the target coprocessor task. The target coprocessor task is a task the coprocessor has finished executing. After completing a task, the coprocessor jumps out of the target code corresponding to the kernel function and fills the task end description information of the completed task into the task completion queue. For example, after completing a task, the coprocessor generates task end description information from the task identifier of the completed task and adds it to the head of the task completion queue; because the information is filled in at the head, the positions of the head and the tail of the task completion queue become unequal.
In some embodiments, determining the target coprocessor task whose execution has completed according to the task completion queue includes: judging whether the positions of the head and the tail of the task completion queue are equal; and, when they are unequal, determining the completed target coprocessor task according to the task end description information at the tail of the task completion queue. For example, if the head of the task completion queue is not equal to the tail, the main processor knows that some task has completed, reads the task end description information at the tail, and determines the completed target coprocessor task from it.
Specifically, determining the completed target coprocessor task according to the task end description information at the tail of the task completion queue includes: reading the task end description information at the tail of the task completion queue and parsing it to obtain a task identifier; then matching the task identifier against the task description information in the task issuing queue and determining the completed target coprocessor task from the matching result. For example, the main processor reads the task end description information at the tail of the task completion queue and parses it to obtain a task ID; it matches this task ID against the task IDs in the task description information in the task issuing queue, and the coprocessor task whose ID matches is determined to be the target coprocessor task whose execution has completed.
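The ID-matching step described above can be sketched as follows; the dictionary field names are assumptions of this sketch:

```python
def find_completed_task(end_info, issue_entries):
    """Match the task ID from the task end description information
    against the descriptors in the task issuing queue; the matching
    descriptor is the target coprocessor task that has completed."""
    for desc in issue_entries:
        if desc is not None and desc["task_id"] == end_info["task_id"]:
            return desc
    return None


issue_entries = [{"task_id": 1, "instr_addr": 0x100},
                 {"task_id": 2, "instr_addr": 0x200}]
target = find_completed_task({"task_id": 2}, issue_entries)
missing = find_completed_task({"task_id": 9}, issue_entries)  # no match
```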
S230, updating the tail of the task completion queue according to the target coprocessor task so as to recycle the target coprocessor task.
Illustratively, a dequeue operation is performed on the task end description information at the tail of the task completion queue, and the tail is updated to point at the task end description information of the next coprocessor task. For example, the main processor dequeues the task end description information of the target coprocessor task at the tail of the task completion queue and updates the tail to the task end description information of the coprocessor task following the target coprocessor task. It then judges whether the positions of the updated tail and the head of the task completion queue are equal: if they are equal, no further completed tasks remain to be processed; if they are unequal, some completed tasks have not yet been dequeued.
The main processor then performs a recycle operation on the target coprocessor task, preventing the task from continuing to occupy the main processor. The main processor parses the task end description information to obtain the task ID, matches it against the task IDs in the task description information in the task issuing queue, and destroys the coprocessor task whose ID matches.
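The dequeue-and-recycle step on the main-processor side can be sketched as follows; the function name and field names are assumptions of this sketch:

```python
def recycle_at_tail(completion_queue, tail, head, issue_entries):
    """Main processor: dequeue the task end entry at the completion
    queue tail, destroy the matching descriptor in the task issuing
    queue, and report whether completed tasks remain (tail != head)."""
    end_info = completion_queue[tail % len(completion_queue)]
    tail += 1                      # tail now points at the next entry
    issue_entries[:] = [d for d in issue_entries
                        if d["task_id"] != end_info["task_id"]]
    return tail, tail != head


completion_queue = [{"task_id": 5}, None, None, None]
issue_entries = [{"task_id": 5, "instr_addr": 0x100}]
tail, more_completed = recycle_at_tail(completion_queue, tail=0, head=1,
                                       issue_entries=issue_entries)
```

Here one completed task is reclaimed, after which the tail catches up with the head and no completed tasks remain.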
In this embodiment of the invention, after the coprocessor completes a task, it adds the task end description information of the completed task to the task completion queue, so the coprocessor continuously adds task end description information at the head of the task completion queue. After the main processor reads the task end description information of a target coprocessor task, it points the tail of the task completion queue at the task end description information corresponding to the next coprocessor task, so the main processor continuously removes the task end description information of completed tasks from the tail.
According to the technical scheme of this embodiment, a task issuing queue and a task completion queue are shared between the main processor and the coprocessor: the task issuing queue stores task description information corresponding to coprocessor tasks, the task completion queue stores task end description information corresponding to coprocessor tasks, the main processor updates the head of the task issuing queue and the tail of the task completion queue, and the coprocessor updates the tail of the task issuing queue and the head of the task completion queue. Through these two queues, the main processor can issue tasks to the coprocessor while tasks are executing, and the coprocessor can report task completion while the main processor is issuing tasks, so both processors are always working. This greatly improves the utilization of the main processor and the coprocessor, and greatly improves device-side performance.
Fig. 3 is a flowchart of another task processing method according to an embodiment of the present invention. The method may be performed by a task processing device, which may be implemented in hardware and/or software and configured in a coprocessor; it applies to the case where the coprocessor and the main processor cooperatively perform computation. As shown in Fig. 3, the method includes:
S310, in the case that task issuing is detected, reading the target task description information at the tail of the task issuing queue and updating the tail of the task issuing queue.
In this embodiment of the invention, the coprocessor can judge whether a task has been issued from the positions of the head and the tail of the task issuing queue. For example, the coprocessor determines that a task has been issued when the positions of the head and the tail of the task issuing queue are unequal.
Specifically, detecting whether a task has been issued includes: acquiring the positions of the head and the tail of the task issuing queue; if the positions of the head and the tail are unequal, determining that a task has been issued; and if they are equal, determining that no task has been issued.
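The coprocessor-side check and tail update described above can be sketched as follows; the function name and field names are assumptions of this sketch:

```python
def read_and_advance_tail(queue, head, tail):
    """Coprocessor: if the head and tail positions are unequal, a task
    has been issued; read the descriptor at the tail and advance the
    tail past it. Equal positions mean no task has been issued."""
    if head == tail:
        return None, tail
    desc = queue[tail % len(queue)]
    return desc, tail + 1


queue = [{"task_id": 9}, None, None, None]
desc, tail = read_and_advance_tail(queue, head=1, tail=0)          # pending
none_desc, same_tail = read_and_advance_tail(queue, head=1, tail=1)  # empty
```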
The task issuing queue is used for storing task description information corresponding to coprocessor tasks; its head is updated by the main processor based on the coprocessor task, and its tail is updated by the coprocessor. The task description information describes the address information related to task execution.
The target task description information is the task description information at the tail of the task issuing queue.
Illustratively, the coprocessor reads the target task description information at the tail of the task issuing queue and then updates the tail to point at the task description information of the next coprocessor task.
S320, determining the target code according to the target task description information, and executing the coprocessor task by running the target code.
The target code is the code executed by the coprocessor; it can be determined from the target task description information.
Illustratively, determining the target code from the target task description information includes: parsing the target task description information to obtain an instruction address and a parameter address; and determining the target code from the instruction address and the parameter address.
Specifically, the instruction address at which the instruction-set code is stored can be obtained from the target task description information, and the instruction-set code is then fetched through that instruction address. For example, the coprocessor obtains the target code corresponding to the kernel function according to the target task ID, target instruction address, target parameter address, and other information, jumps to that target code, and runs it to execute the coprocessor task.
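The jump from instruction address and parameter address to running kernel code can be sketched with two stand-in lookup tables; the tables, function name, and addresses below are illustrative assumptions, since the patent describes real address spaces rather than dictionaries:

```python
# Hypothetical "memory map": instruction addresses to kernel code, and
# parameter addresses to argument buffers.
KERNELS = {0x1000: lambda params: sum(params)}


def run_target_code(desc, param_memory):
    """Use the instruction address in the task description to locate
    the target code of the kernel function, fetch its arguments via
    the parameter address, and run the kernel."""
    kernel = KERNELS[desc["instr_addr"]]
    params = param_memory[desc["param_addr"]]
    return kernel(params)


param_memory = {0x2000: [1, 2, 3]}
result = run_target_code({"task_id": 4, "instr_addr": 0x1000,
                          "param_addr": 0x2000}, param_memory)
```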
S330, for a target coprocessor task whose execution has completed, updating the head of the task completion queue according to the task end description information corresponding to the target coprocessor task, the tail of the task completion queue being updated by the main processor according to the target coprocessor task so as to recycle the target coprocessor task.
The task end description information is used to identify a coprocessor task whose execution has completed; for example, it is generated from the task identifier of the completed task.
Illustratively, for a target coprocessor task whose execution has completed, the coprocessor adds its task ID to the head of the task completion queue. If coprocessor tasks keep completing, the coprocessor continuously adds task end description information at the head of the task completion queue, so the positions of the head and the tail of the task completion queue become unequal. When the positions are unequal, the main processor judges that some target coprocessor task has completed: it reads the task end description information of the target coprocessor task at the tail of the task completion queue, parses it to obtain a task identifier, matches the task identifier against the task description information in the task issuing queue, and determines the completed target coprocessor task from the matching result. After reading the task end description information at the tail, the main processor points the tail of the task completion queue at the next coprocessor task, so that it continuously removes task end description information from the tail.
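The coprocessor-side completion report described above can be sketched as follows; the function name and the choice to carry only the task ID in the entry are assumptions of this sketch:

```python
def report_completion(completion_queue, head, task_id):
    """Coprocessor: after a kernel finishes, write a task end entry
    (here just the task ID) into the completion-queue head slot and
    advance the head; the main processor later consumes the entry at
    the tail."""
    completion_queue[head % len(completion_queue)] = {"task_id": task_id}
    return head + 1


completion_queue = [None] * 4
head = 0
head = report_completion(completion_queue, head, task_id=9)
head = report_completion(completion_queue, head, task_id=10)
```

After two completions the head has advanced by two slots, so the main processor, whose tail still sits at the first entry, sees unequal positions and starts recycling.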
According to this technical scheme, the task issuing queue and the task completion queue are shared between the main processor and the coprocessor: task description information corresponding to a coprocessor task is stored in the task issuing queue, task end description information corresponding to the coprocessor task is stored in the task completion queue, the head of the task issuing queue and the tail of the task completion queue are updated by the main processor, and the tail of the task issuing queue and the head of the task completion queue are updated by the coprocessor. Through these two queues, the main processor can issue tasks to the coprocessor while tasks are executing, and the coprocessor can submit task-end information while the main processor is issuing tasks, so that both processors remain in a working state. This greatly improves the utilization of the main processor and the coprocessor, and thus the performance of the device side.
Fig. 4 is a flowchart of another task processing method according to an embodiment of the present invention, where a specific task processing flow is provided in the embodiment of the present invention. As shown in fig. 4, the method includes:
S410, the main processor receives the coprocessor task.
Illustratively, the host processor receives coprocessor tasks issued by the user.
S420, the main processor updates the coprocessor task to the task issuing queue.
The main processor parses the coprocessor task to obtain alternative task description information, extracts target information from the alternative task description information, generates task description information corresponding to the coprocessor task according to the target information, and adds the task description information to the head of the task issuing queue. The task description information can instruct the coprocessor to jump to the target code.
S430, under the condition that the task is issued, the coprocessor reads the task issuing queue.
The main processor adds the task description information of the coprocessor task to the head of the task issuing queue; the coprocessor detects that the positions of the head and the tail of the task issuing queue are inconsistent, determines that a task has been issued, and reads the task description information at the tail of the task issuing queue.
In addition, after the coprocessor reads the task description information at the tail of the task issuing queue, the tail of the task issuing queue is pointed to the task description information corresponding to the next coprocessor task.
Fig. 5 is a schematic diagram of a task issue queue according to an embodiment of the present disclosure. As shown in fig. 5, the head 511 of the task issue queue 510 is maintained by the main processor, and the tail 512 of the task issue queue 510 is maintained by the coprocessor. Specifically, the main processor adds task description information to the head of the task issue queue to update the position of the head; after the coprocessor reads the task description information at the tail, the tail is pointed to the next adjacent task description information to update the position of the tail.
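The single-writer discipline described above (the main processor only advances the head, the coprocessor only advances the tail, and equal positions mean the queue is empty) can be sketched as a ring buffer. The class and field names below are illustrative, not taken from the patent; a real implementation on shared memory would additionally need memory barriers between the two processors.

```python
class SPSCQueue:
    """Single-producer/single-consumer ring buffer: the producer (main
    processor) only writes `head`; the consumer (coprocessor) only writes
    `tail`. head == tail means the queue is empty."""
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0   # next write slot, owned by the producer
        self.tail = 0   # next read slot, owned by the consumer

    def push(self, desc):               # producer side (main processor)
        nxt = (self.head + 1) % self.capacity
        if nxt == self.tail:
            return False                # queue full, one slot kept free
        self.buf[self.head] = desc
        self.head = nxt                 # publish by advancing the head
        return True

    def pop(self):                      # consumer side (coprocessor)
        if self.tail == self.head:
            return None                 # head == tail: no task issued
        desc = self.buf[self.tail]
        self.tail = (self.tail + 1) % self.capacity  # point tail at next entry
        return desc

q = SPSCQueue(4)
q.push({"task_id": 1})
q.push({"task_id": 2})
print(q.pop())   # {'task_id': 1}
```

Because each index has exactly one writer, neither processor ever needs to lock the queue: inconsistency between head and tail is precisely the "task issued / task completed" signal described in S430 and S460.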
S440, the coprocessor executes the coprocessor task.
Illustratively, the coprocessor parses the task description information to obtain a target task ID, a target instruction address and a target parameter address; the target task ID uniquely identifies the coprocessor task, and the coprocessor jumps to the target code through the target instruction address and the target parameter address and runs the code to execute the coprocessor task.
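The parsing step can be illustrated with a packed binary descriptor. The patent does not specify a wire format, so the fixed little-endian layout below (u32 task ID, u64 instruction address, u64 parameter address) is purely an assumption for the sketch:

```python
import struct

# Hypothetical descriptor layout, assumed for illustration only:
# u32 task ID, u64 instruction address, u64 parameter address, little-endian.
DESC_FMT = "<IQQ"

def pack_desc(task_id, instr_addr, param_addr):
    """Main processor side: build the task description information."""
    return struct.pack(DESC_FMT, task_id, instr_addr, param_addr)

def parse_desc(raw):
    """Coprocessor side: recover task ID, instruction address, parameter address."""
    task_id, instr_addr, param_addr = struct.unpack(DESC_FMT, raw)
    return {"task_id": task_id, "instr_addr": instr_addr, "param_addr": param_addr}

desc = pack_desc(42, 0x8000_1000, 0x8000_2000)
print(parse_desc(desc))
```

A fixed-size descriptor like this is convenient for a shared ring buffer, since every queue slot then has the same length.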
S450, under the condition that the task execution is completed, the coprocessor updates a task completion queue.
Illustratively, when the coprocessor completes a coprocessor task, it jumps out of the target code and adds the task end description information of the coprocessor task to the head of the task completion queue.
S460, the main processor checks the task completion queue.
The main processor detects that the positions of the head and the tail of the task completion queue are inconsistent, determines that a task has been completed, reads the task end description information at the tail of the task completion queue, and points the tail at the next adjacent task end description information so as to update the tail of the task completion queue.
Fig. 6 is a schematic diagram of a task completion queue provided in an embodiment of the present disclosure. As shown in fig. 6, the head 611 of the task completion queue 610 is maintained by the coprocessor, and the tail 612 of the task completion queue 610 is maintained by the main processor. Specifically, the coprocessor adds task end description information to the head of the task completion queue to update the position of the head; after the main processor reads the task end description information at the tail, the tail is pointed to the next adjacent task end description information to update the position of the tail.
S470, in the case that the completed task exists, the main processor reclaims the completed target coprocessor task.
Fig. 7 is a schematic diagram of a task processing flow according to an embodiment of the present invention. As shown in fig. 7, the host-side CPU 710 submits tasks to the device-side task queue 720. The main processor 730 on the device side obtains the coprocessor task from the task queue 720 and updates the head of the task issue queue 740 according to the coprocessor task. The tail of the task issue queue 740 is updated by the device-side coprocessor 750. After the device-side coprocessor 750 completes the target coprocessor task, the corresponding task end description information is added to the head of the task completion queue 760, and the tail of the task completion queue 760 is updated by the device-side main processor 730.
It should be noted that, when the user continuously issues coprocessor tasks, each coprocessor task is executed according to the above flow; that is, the coprocessor tasks issued by the user are executed by cycling through the above steps.
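The S410–S470 loop can be simulated end to end with two in-memory queues. The function and variable names below are illustrative only; the "run the target code" step is reduced to a comment, since the actual jump to a kernel function is hardware-specific.

```python
from collections import deque

issue_q, completion_q = deque(), deque()
issued = {}   # task_id -> task description info, kept until the task is reclaimed

def main_issue(task_id):                    # S410-S420: main processor issues a task
    desc = {"task_id": task_id}
    issued[task_id] = desc
    issue_q.append(desc)                    # update the issue-queue head

def coprocessor_step():                     # S430-S450: read tail, run, report completion
    if not issue_q:                         # head == tail: nothing issued
        return
    desc = issue_q.popleft()                # read and advance the issue-queue tail
    # ... jump to the target code via the instruction/parameter addresses and run it ...
    completion_q.append({"task_id": desc["task_id"]})  # update the completion-queue head

def main_reclaim():                         # S460-S470: main processor drains completions
    done = []
    while completion_q:
        end_info = completion_q.popleft()   # advance the completion-queue tail
        done.append(issued.pop(end_info["task_id"]))
    return done

for tid in (1, 2, 3):
    main_issue(tid)
while issue_q:
    coprocessor_step()
print([d["task_id"] for d in main_reclaim()])  # [1, 2, 3]
```

In the real scheme the two sides run concurrently, so issuing, executing and reclaiming overlap rather than alternating as in this single-threaded sketch.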
According to this technical scheme, the task issuing queue and the task completion queue are shared between the main processor and the coprocessor, so that the main processor can issue tasks while the coprocessor is executing, and the coprocessor can submit task-end information while the main processor is issuing tasks; the two processors therefore execute simultaneously without waiting for each other. By embedding scheduling code on the coprocessor, the coprocessor gains task scheduling capabilities such as reading the task issuing queue, executing tasks and updating the task completion queue. With this technical scheme, when the user continuously issues coprocessor tasks, the main processor and the coprocessor are always in a working state, which improves the utilization of both processors and greatly improves the performance of the device side.
Fig. 8 is a schematic structural diagram of a task processing device according to an embodiment of the present invention. The task processing device can be implemented in hardware and/or software, and the task processing device can be configured in a main processor. As shown in fig. 8, the apparatus includes: a task issue queue update module 810, a task determination module 820, and a task reclamation module 830.
A task issuing queue updating module 810, configured to acquire a coprocessor task, update a head of a task issuing queue according to the coprocessor task, where the task issuing queue is configured to store task description information corresponding to the coprocessor task, and update a tail of the task issuing queue through a coprocessor;
a task determining module 820, configured to determine, according to a task completion queue, a target coprocessor task that is performed, where the task completion queue is configured to store task end description information corresponding to the coprocessor task, and update a head of the task completion queue through the coprocessor;
and a task recycling module 830, configured to update the tail of the task completion queue according to the target coprocessor task, so as to recycle the target coprocessor task.
Optionally, the task issuing queue updating module 810 is specifically configured to:
analyzing the coprocessor task to obtain alternative description information;
generating task description information corresponding to the coprocessor task according to the target information in the alternative description information;
and adding the task description information to the head of the task issuing queue.
Optionally, the task determination module 820 specifically includes:
a judging unit, configured to judge whether the positions of the head and the tail of the task completion queue are equal;
and a task determining unit, configured to determine, when the positions of the head and the tail are unequal, the completed target coprocessor task according to the task end description information at the tail of the task completion queue.
Further, the task determination unit is specifically configured to:
reading task end description information of the tail part of the task completion queue, and analyzing the read task end description information to obtain a task identifier;
and matching task description information in the task issuing queue according to the task identification, and determining to execute the completed target coprocessor task according to a matching result.
Optionally, the task reclaiming module 830 is specifically configured to:
executing a dequeuing operation on the task end description information at the tail of the task completion queue, and updating the tail of the task completion queue to the task end description information of the next coprocessor task.
The task processing device provided by the embodiment of the invention can execute the task processing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Fig. 9 is a schematic structural diagram of another task processing device according to an embodiment of the present invention. The task processing device may be implemented in hardware and/or software, and the task processing device may be configured in a coprocessor. As shown in fig. 9, the apparatus includes: an information reading module 910, a code execution module 920, and a task completion queue update module 930.
The information reading module 910 is configured to read target task description information at the tail of a task issuing queue and update the tail of the task issuing queue when it is detected that task issuing exists, where the task issuing queue is used to store task description information corresponding to a coprocessor task and update, by a main processor, the head of the task issuing queue based on the coprocessor task;
A code execution module 920, configured to determine an object code according to the object task description information, and execute the coprocessor task by running the object code;
and the task completion queue updating module 930 is configured to update, for a target coprocessor task that is executed and completed, a head of the task completion queue according to task end description information corresponding to the target coprocessor task, and update, by the main processor, a tail of the task completion queue according to the target coprocessor task, so as to recycle the target coprocessor task.
Optionally, the apparatus further includes a task detection module, configured to:
acquiring the positions of the head and tail of the task issuing queue;
if the positions of the head part and the tail part of the task issuing queue are unequal, determining that task issuing exists;
and if the positions of the head part and the tail part of the task issuing queue are equal, determining that no task issuing exists.
Optionally, the code execution module 920 is specifically configured to:
analyzing the target task description information to obtain an instruction address and a parameter address;
and determining an object code according to the instruction address and the parameter address.
The task processing device provided by the embodiment of the invention can execute the task processing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
FIG. 10 illustrates a schematic diagram of a computing device that may be used to implement an embodiment of the invention. Computing devices are intended to represent various forms of heterogeneous many-core architecture digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The computing device includes a host processor and a coprocessor. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 10, the computing device 10 includes at least one main processor 11, at least one coprocessor 16, and a memory communicatively coupled to the at least one main processor 11 and/or the coprocessor 16, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13. The memory stores computer program instructions executable by the at least one processor, and the main processor 11 can perform various suitable actions and processes according to the computer program instructions stored in the ROM 12 or loaded from the storage unit 17 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the computing device 10. The main processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
The main processor 11 includes a central processing unit (CPU). The coprocessor 16 includes a graphics processor (GPU), a data computing unit (DPU), an application-specific integrated circuit (ASIC), a system on chip (SOC), a field programmable gate array (FPGA), an embedded neural network processor (NPU), a hardware computing engine (HW AE), a hardware acceleration controller (HAC), and a CPU. The main processor 11 and the coprocessor 16 perform the respective methods and processes described above, such as the task processing method.
In some embodiments, the task processing method may be implemented as a computer program, which is tangibly embodied on a computer-readable storage medium, such as the storage unit 17. In some embodiments, some or all of the computer program may be loaded and/or installed onto computing device 10 via ROM 12 and/or communication unit 18. One or more of the steps of the task processing method described above may be performed when the computer program is loaded into RAM 13 and executed by host processor 11 and/or co-processor 16. Alternatively, in other embodiments, host processor 11 and/or coprocessor 16 may be configured to perform the task processing methods in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here can be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of high management difficulty and weak service expansibility in traditional physical host and VPS services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (8)

1. A method for processing tasks, applied to a host processor, comprising:
acquiring a coprocessor task, and analyzing the coprocessor task to obtain alternative description information; generating task description information corresponding to the coprocessor task according to the target information in the alternative description information; adding the task description information to the head of a task issuing queue, wherein the task issuing queue is used for storing the task description information corresponding to the coprocessor task, and updating the tail of the task issuing queue through a coprocessor;
under the condition that the positions of the head part and the tail part of the task completion queue are unequal, determining a target coprocessor task for executing completion according to task end description information of the tail part of the task completion queue, wherein the task completion queue is used for storing task end description information corresponding to the coprocessor task, and updating the head part of the task completion queue through the coprocessor;
and executing dequeuing operation on task end description information corresponding to the tail of the task completion queue, and updating the tail of the task completion queue to task end description information of a next coprocessor task of the target coprocessor task so as to recycle the target coprocessor task.
2. The method of claim 1, wherein determining to execute the completed target coprocessor task based on task end description information of a tail of the task completion queue comprises:
reading task end description information of the tail part of the task completion queue, and analyzing the read task end description information to obtain a task identifier;
and matching task description information in the task issuing queue according to the task identification, and determining to execute the completed target coprocessor task according to a matching result.
3. A method of task processing, applied to a coprocessor, comprising:
acquiring the positions of the head and tail of a task issuing queue; if the positions of the head and the tail of the task issuing queue are unequal, determining that task issuing exists, reading target task description information of the tail of the task issuing queue, and updating the tail of the task issuing queue, wherein the task issuing queue is used for storing task description information corresponding to a coprocessor task, and updating the head of the task issuing queue based on the coprocessor task through a main processor;
analyzing the target task description information to obtain an instruction address and a parameter address; determining an object code according to the instruction address and the parameter address, and executing the coprocessor task by running the object code;
And for executing the completed target coprocessor task, updating the head of a task completion queue according to task end description information corresponding to the target coprocessor task, and updating the tail of the task completion queue according to the target coprocessor task by the main processor so as to recycle the target coprocessor task.
4. A method according to claim 3, further comprising:
and if the positions of the head part and the tail part of the task issuing queue are equal, determining that no task issuing exists.
5. A task processing device, applied to a main processor, comprising:
the task issuing queue updating module is used for analyzing the coprocessor task to obtain alternative description information; generating task description information corresponding to the coprocessor task according to the target information in the alternative description information; the task description information is added to the head of the task issuing queue, wherein the task issuing queue is used for storing the task description information corresponding to the coprocessor task, and the tail of the task issuing queue is updated through the coprocessor;
the task determining module is used for determining a target coprocessor task for executing completion according to task end description information of the tail part of the task completion queue under the condition that the positions of the head part and the tail part of the task completion queue are unequal, wherein the task completion queue is used for storing the task end description information corresponding to the coprocessor task, and the head part of the task completion queue is updated through the coprocessor;
And the task recycling module is used for executing dequeuing operation on task end description information corresponding to the tail of the task completion queue, and updating the tail of the task completion queue to task end description information of a next coprocessor task of the target coprocessor task so as to recycle the target coprocessor task.
6. A task processing device, for use in a coprocessor, comprising:
the information reading module is used for acquiring the positions of the head part and the tail part of the task issuing queue; if the positions of the head and the tail of the task issuing queue are unequal, determining that task issuing exists, reading target task description information of the tail of the task issuing queue, and updating the tail of the task issuing queue, wherein the task issuing queue is used for storing task description information corresponding to a coprocessor task, and updating the head of the task issuing queue based on the coprocessor task through a main processor;
the code execution module is used for analyzing the target task description information to obtain an instruction address and a parameter address; determining an object code according to the instruction address and the parameter address, and executing the coprocessor task by running the object code;
And the task completion queue updating module is used for updating the head of the task completion queue according to task end description information corresponding to the target coprocessor task for executing the completed target coprocessor task, and updating the tail of the task completion queue according to the target coprocessor task by the main processor so as to recycle the target coprocessor task.
7. A computing device, the computing device comprising:
at least one main processor;
at least one coprocessor; and a memory communicatively coupled to the at least one host processor and/or the at least one co-processor, the memory storing computer program instructions; wherein the computer program instructions, when executed by the at least one main processor, enable the at least one main processor to perform the task processing method of any one of claims 1-2;
the computer program instructions, when executed by the at least one coprocessor, enable the at least one coprocessor to perform the task processing method of any one of claims 3-4.
8. A computer readable storage medium storing computer instructions for causing a computing device to perform the task processing method of any one of claims 1-4 when executed.
CN202310762527.7A 2023-06-27 2023-06-27 Task processing method, device, computing equipment and storage medium Active CN116521606B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310762527.7A CN116521606B (en) 2023-06-27 2023-06-27 Task processing method, device, computing equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310762527.7A CN116521606B (en) 2023-06-27 2023-06-27 Task processing method, device, computing equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116521606A CN116521606A (en) 2023-08-01
CN116521606B true CN116521606B (en) 2023-09-05

Family

ID=87397912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310762527.7A Active CN116521606B (en) 2023-06-27 2023-06-27 Task processing method, device, computing equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116521606B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102385558A (en) * 2010-08-31 2012-03-21 国际商业机器公司 Request control device, request control method and relevant processor
CN103946803A (en) * 2011-10-17 2014-07-23 凯为公司 Processor with efficient work queuing
CN110209493A (en) * 2019-04-11 2019-09-06 腾讯科技(深圳)有限公司 EMS memory management process, device, electronic equipment and storage medium
CN114880102A (en) * 2022-07-04 2022-08-09 北京智芯半导体科技有限公司 Security chip, multitask scheduling method and device thereof, and storage medium
CN115981833A (en) * 2021-10-15 2023-04-18 华为技术有限公司 Task processing method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9448846B2 (en) * 2011-12-13 2016-09-20 International Business Machines Corporation Dynamically configurable hardware queues for dispatching jobs to a plurality of hardware acceleration engines
KR102184280B1 (en) * 2015-12-17 2020-11-30 아브 이니티오 테크놀로지 엘엘시 Data processing using dynamic partitioning


Also Published As

Publication number Publication date
CN116521606A (en) 2023-08-01

Similar Documents

Publication Publication Date Title
EP2652600B1 (en) Virtual machine branching and parallel execution
CN101726256B (en) Computer system and method for searching inflection point from image contour
CN105468588A (en) Character string matching method and apparatus
CN116126346B (en) Code compiling method and device of AI model, computer equipment and storage medium
CN116521606B (en) Task processing method, device, computing equipment and storage medium
CN116662039B (en) Industrial information parallel detection method, device and medium based on shared memory
CN116382658A (en) Compiling method and device of AI model, computer equipment and storage medium
CN116300946A (en) Path planning method, device, equipment and medium of automatic loader
CN113537392A (en) Similar image identification method and device, computing equipment and computer storage medium
CN102929392B (en) Based on the user operation recognition methods of multisensor and the equipment of use the method
CN110889677A (en) Road examination item judgment method and device and electronic equipment
CN117492822B (en) Change contrast method, device, electronic equipment and storage medium
CN116579914B (en) Execution method and device of graphic processor engine, electronic equipment and storage medium
CN110018877B (en) Method and device for quickly instantiating VNF according to affinity principle
CN113656549B (en) Content searching method of electronic book, electronic device and computer storage medium
CN115904899A (en) Operation record generation method, operation record acquisition method, operation record generation device, operation record acquisition device and operation record acquisition medium
CN108255518A (en) Processor and cyclic program branch prediction method
CN117194018A (en) Processing method and device of system temperature control algorithm in multi-core and multi-chip environment
CN116737430A (en) BMC control method and device, electronic equipment and storage medium
CN116166454A (en) Message processing method and device, electronic equipment and storage medium
CN107608765B (en) Virtual machine migration method and device
CN117150229A (en) Data processing method, device, equipment and medium
CN117608779A (en) Scheduling period determining method, device, equipment and medium
CN112363847A (en) Automatic identification method and system for license document
CN115061842A (en) Data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant