CN112905317A - Task scheduling method and system under rapid reconfigurable signal processing heterogeneous platform - Google Patents

Info

Publication number
CN112905317A
Authority
CN
China
Prior art keywords
task
signal processing
scheduling
tasks
heterogeneous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110152746.4A
Other languages
Chinese (zh)
Other versions
CN112905317B (en)
Inventor
李静磊
陈仕豪
黄柏林
杨清海
张帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202110152746.4A
Publication of CN112905317A
Application granted
Publication of CN112905317B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Complex Calculations (AREA)

Abstract

The invention belongs to the technical field of signal processing and discloses a task scheduling method and a task scheduling system under a rapidly reconfigurable signal processing heterogeneous platform. In the task selection stage, tasks with high communication overhead are screened out using a statistical criterion, and the task to schedule is selected using a priority criterion; in the processing resource selection stage, the processing resource is selected so as to minimize the earliest execution completion time of the task or bound task. The task scheduling module comprises a task selection module and a processing resource selection module. The invention uses a large amount of heterogeneous hardware resources to accelerate signal processing applications and, together with the upper-layer integrated development software, achieves rapid reconfiguration and rapid iteration of signal processing application deployment. The algorithm of the invention improves on the HEFT and CEFT algorithms: by combining task replication with task binding, it makes full use of the idle time of processing resources and offsets larger communication overhead with a certain amount of computational overhead, thereby optimizing the time performance of the whole application and the utilization of system resources.

Description

Task scheduling method and system under rapid reconfigurable signal processing heterogeneous platform
Technical Field
The invention belongs to the technical field of signal processing, and particularly relates to a task scheduling method and system under a rapidly reconfigurable signal processing heterogeneous platform.
Background
At present, a rapidly reconfigurable signal processing heterogeneous platform builds a large number of signal processing boards from heterogeneous processing resources such as CPUs, DSPs, GPUs and FPGAs, integrates all the boards into one chassis, exchanges data between processing resources over the bus on the chassis backplane and the chassis switch board, and stores the data and files required during processing in memory. The integrated software development environment is the software part of the platform and comprises a visual development interface, a task scheduling module, a resource monitoring module, a configuration control module, an encapsulation module, a communication control module and other functional modules. The task scheduling algorithm is the core of the task scheduling module and determines the performance and efficiency of the signal processing application. A signal processing application is usually completed by multiple cooperating tasks with complex dependencies, which are usually represented by a DAG (directed acyclic graph). In a directed acyclic graph, nodes represent tasks and directed edges represent data dependencies between tasks. The task scheduling algorithm assigns the tasks in the task graph to appropriate computing resources according to the specified optimization objective and determines their execution order on each computing resource while preserving the communication dependencies among tasks. In our scenario, the main optimization objective is the overall execution time of the signal processing tasks, i.e. the time difference between the start of the earliest task and the completion of the latest task.
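As an illustration of the DAG representation described above, the following minimal Python sketch stores a task graph as successor and predecessor lists and finds its entry and exit tasks. The five-task graph and all variable names are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical application: edge (i, k) means task k needs data produced by task i.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]

succ = defaultdict(list)  # immediate successors of each task
pred = defaultdict(list)  # immediate predecessors of each task
for i, k in edges:
    succ[i].append(k)
    pred[k].append(i)

tasks = sorted({t for edge in edges for t in edge})
entry_tasks = [t for t in tasks if not pred[t]]  # no incoming edges
exit_tasks = [t for t in tasks if not succ[t]]   # no outgoing edges

print(entry_tasks, exit_tasks)
```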
Task scheduling problems can be divided into static and dynamic task scheduling according to whether the characteristics of the tasks (the dependencies between tasks, the computational overhead of each task on each processor, and the communication overhead between tasks) are given in advance. Our platform adopts static scheduling: because the hardware resources are usually fixed and the signal processing tasks are provided by a task component library, we estimate these characteristics with statistical methods and provide the estimates to the scheduling algorithm. Static scheduling in heterogeneous systems can be mainly divided into list scheduling, cluster-based scheduling, task-replication-based scheduling and guided random search scheduling. Common heuristic scheduling algorithms such as HEFT, CPOP, Lookahead and CEFT are widely used in practice.
Typically, such task scheduling is divided into two phases: a task selection phase and a processing resource selection phase. The HEFT algorithm computes the RankU value of every task node, starting from the exit task node, based on the communication overhead and computational overhead of the tasks. It uses the RankU values as the priority ordering, sorts the tasks in descending order of RankU, and schedules each task in turn onto the processing resource that minimizes the task's earliest execution completion time (EFT). The CPOP algorithm schedules the tasks on the critical path onto a specific processing resource; the rest of its scheduling process is similar to that of HEFT. The Lookahead method differs from HEFT in that, in the processing resource selection phase, it uses the earliest execution completion time of the task's immediate successors as the criterion. The CEFT algorithm is a scheduling method that combines list scheduling with task replication.
Because data dependencies exist among tasks, a successor task can be executed only after all of its direct predecessor tasks have finished and the data it depends on has been transmitted to the processing resource where it is located. When the data volume is large, the communication bandwidth is narrow and the communication overhead is high, successor tasks wait a long time, processing resources sit idle and resource utilization is low; the defects of the HEFT algorithm are especially prominent in this respect. To make full use of this idle time, some predecessor tasks are executed again during the idle time of a processing resource, so that computational overhead offsets the larger communication overhead and the execution time of the whole application is reduced; this is the task replication method. However, the performance of task replication alone still needs improvement in scenarios that contain tasks with high communication overhead. To address these defects, the invention proposes high-communication-overhead task binding and, combined with task replication, improves the time performance of signal processing applications under the rapidly reconfigurable signal processing heterogeneous platform. Task replication copies some predecessor tasks onto the hardware processing resource of a task and executes them there, in order to save the communication overhead from those predecessors to the task.
High-communication-overhead task binding binds a task pair together; once the data from the predecessors of both tasks is ready, the bound pair is allocated to the same hardware processing resource according to a certain rule, which eliminates the high communication overhead directly between the two tasks.
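The trade-off behind task replication can be illustrated with a small worked example. The sketch below is only an illustration under stated assumptions: the numbers are invented, the two helper functions are hypothetical, and the replicated predecessor is assumed to be an entry task, so its copy can start as soon as the target resource is free.

```python
def eft_without_copy(avail, aft_pred, comm, w_task):
    """Task waits for its remote predecessor's data, then runs."""
    est = max(avail, aft_pred + comm)
    return est + w_task

def eft_with_copy(avail, w_pred_local, w_task):
    """Predecessor is re-executed locally (assumed to need no inputs), so no transfer."""
    copy_finish = avail + w_pred_local
    return copy_finish + w_task

# Hypothetical numbers: the predecessor finished at t=10 on another resource,
# the transfer costs 8, the local resource is idle from t=0, a local copy of
# the predecessor costs 5, and the task itself costs 4.
print(eft_without_copy(avail=0, aft_pred=10, comm=8, w_task=4))  # waits for the transfer
print(eft_with_copy(avail=0, w_pred_local=5, w_task=4))          # uses the idle time
```

In this example the extra computation of 5 time units replaces a wait of 18, which is the sense in which computational overhead offsets communication overhead.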
Signal processing must meet real-time requirements in more and more scenarios, and developers need to iterate on and reconfigure deployed signal processing tasks quickly. A rapidly reconfigurable signal processing heterogeneous platform can meet these needs by combining a heterogeneous hardware accelerator (a large number of heterogeneous computing resources such as CPUs, GPUs, DSPs (digital signal processors) and FPGAs (field programmable gate arrays)) with an integrated development environment that provides visual development, resource virtualization management and real-time task scheduling. Developers deploy signal processing applications in the visualization software; such an application usually consists of several tasks with data dependencies and is represented as a DAG (directed acyclic graph). To accelerate signal processing and meet the real-time requirement, a scheduling algorithm that allocates appropriate hardware computing resources to each task, according to the characteristics of the application and of the hardware in the platform, is crucial. Unlike other scenarios, hardware resources are abundant in this heterogeneous platform, so the main concern of the scheduling algorithm is time performance.
Through the above analysis, the problems and defects of the prior art are as follows: when the data volume is large, the communication bandwidth is narrow and the communication overhead is high, successor tasks wait a long time and processing resources sit idle, so resource utilization is low.
The difficulty in solving the above problems and defects is as follows: rapid reconfiguration and rapid deployment of signal processing applications involve both software and hardware and therefore require visualization software and heterogeneous hardware to be combined, so building such a rapidly reconfigurable signal processing platform is itself difficult. In addition, in scenarios with large data volume, narrow communication bandwidth and high communication overhead, it is difficult to schedule tasks so that the real-time performance of the signal processing application is improved as much as possible.
The significance of solving the above problems and defects is as follows: rapid iteration and rapid reconfiguration of signal processing applications are accelerated and the application deployment cycle is shortened. The task scheduling algorithm provided by the invention further reduces the overall running time of the signal processing application and improves its real-time performance. It also improves the utilization of hardware resources under the heterogeneous platform.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a task scheduling method and system under a rapid reconfigurable signal processing heterogeneous platform.
The invention is realized as follows: in the task selection stage, the task scheduling method under the rapidly reconfigurable signal processing heterogeneous platform screens out tasks with high communication overhead using a statistical criterion and binds them, and selects the task to schedule using a priority criterion; in the processing resource selection stage, it selects the processing resource so as to minimize the earliest execution completion time of the task or bound task.
Let $w_{ij}$ denote the computational overhead of task $i$ on processing resource $j$, and $c_{ik}$ the communication overhead between task $i$ and task $k$. For a given DAG signal processing application comprising $N$ tasks, on a platform that provides $P$ heterogeneous processing resources, there is a corresponding computational overhead matrix $W_{N \times P}$ and a communication overhead matrix $C_{N \times N}$. The entry task is denoted EntryTask and the exit task is denoted ExitTask.
Further, the priority criterion used in the task selection stage is
$$\mathrm{RankU}(i) = \overline{w_i} + \max_{k \in succ(i)} \left( \overline{c_{ik}} + \mathrm{RankU}(k) \right)$$
A larger RankU value means a higher priority. For the exit task of the application,
$$\mathrm{RankU}(\mathrm{ExitTask}) = \overline{w_{\mathrm{ExitTask}}}$$
where $\overline{w_i}$ is the mean computational overhead of task $i$ over the processing resources, $\overline{c_{ik}}$ is the average communication overhead from task $i$ to task $k$, and $succ(i)$ is the set of immediate successors of task $i$.
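The RankU priority can be sketched in a few lines of Python. The sketch assumes the standard HEFT upward-rank recursion (mean computational overhead plus the largest "communication plus successor rank" term, with the exit task's rank equal to its mean computational overhead); the overhead values and the names `W`, `C`, `succ`, `mean_w` and `rank_u` are hypothetical.

```python
from functools import lru_cache

# Hypothetical overheads: W[i][j] is the cost of task i on resource j,
# C[(i, k)] is the average communication cost from task i to task k.
W = [[14, 16], [13, 19], [11, 13], [7, 9]]
C = {(0, 1): 18, (0, 2): 12, (1, 3): 9, (2, 3): 11}
succ = {0: [1, 2], 1: [3], 2: [3], 3: []}

def mean_w(i):
    """Mean computational overhead of task i over all processing resources."""
    return sum(W[i]) / len(W[i])

@lru_cache(maxsize=None)
def rank_u(i):
    """Upward rank; an exit task's rank is its mean computational overhead."""
    if not succ[i]:
        return mean_w(i)
    return mean_w(i) + max(C[(i, k)] + rank_u(k) for k in succ[i])

# Larger RankU means higher priority, so sort in descending order.
order = sorted(succ, key=rank_u, reverse=True)
print(order, [rank_u(i) for i in order])
```

Because a task's rank always exceeds the rank of each of its successors when overheads are positive, sorting by descending RankU never places a task before one of its predecessors.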
Further, the criterion in the processing resource selection stage is to select the processing resource that minimizes the earliest execution completion time (EFT) of the task, calculated as $EFT(j) = w_{ij} + EST(j)$, where $EFT(j)$ is the earliest execution completion time of task $i$ on processing resource $j$ and $EST(j)$ is the earliest execution start time of task $i$ on processing resource $j$. $EST(j)$ is given by $EST(j) = \max\{avail(j), \max_{m \in pred(i)}(AFT(m) + c_{mi})\}$, where $avail(j)$ is the time at which processing resource $j$ becomes available, $AFT(m)$ is the actual finish time of task $m$ and equals its minimum $EFT$, and $pred(i)$ is the set of immediate predecessors of task $i$. For the entry task EntryTask, $EST(j) = 0$.
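The EST and EFT formulas translate directly into code. The following minimal sketch uses hypothetical data and helper names (`est`, `best_resource`, `placed_on`); it also makes one assumption that the formula above does not state explicitly, namely that the communication overhead is zero when a predecessor ran on the same processing resource.

```python
def est(i, j, avail, aft, placed_on, pred, C):
    """Earliest start time of task i on resource j."""
    data_ready = 0.0
    for m in pred[i]:
        # Assumption: no transfer cost if predecessor m ran on resource j.
        comm = 0.0 if placed_on[m] == j else C[(m, i)]
        data_ready = max(data_ready, aft[m] + comm)
    return max(avail[j], data_ready)

def best_resource(i, W, avail, aft, placed_on, pred, C):
    """Return (resource, EFT) minimising EFT(j) = w_ij + EST(j)."""
    efts = [W[i][j] + est(i, j, avail, aft, placed_on, pred, C)
            for j in range(len(avail))]
    j = min(range(len(efts)), key=efts.__getitem__)
    return j, efts[j]

# Hypothetical state: task 0 ran on resource 0 (finish 14), task 1 on resource 1 (finish 19).
W = [[14, 16], [13, 19], [11, 13]]
pred = {0: [], 1: [], 2: [0, 1]}
C = {(0, 2): 12, (1, 2): 5}
aft = {0: 14, 1: 19}
placed_on = {0: 0, 1: 1}
avail = [14, 19]

print(best_resource(2, W, avail, aft, placed_on, pred, C))
```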
Further, the task scheduling method under the rapidly reconfigurable signal processing heterogeneous platform makes preparations before scheduling, specifically: based on the hardware processing resources provided by the platform and the signal processing tasks provided by the system task library, the computational overhead of each task on the different hardware processing resources is estimated using statistical and empirical methods, where $w_{ij}$ denotes the computational overhead of task $i$ on processing resource $j$; the dependencies of the DAG application are parsed from the signal processing application that the developer deploys on the visual interface, and the communication overhead between dependent tasks is estimated by combining the parameters that the developer sets for the tasks with the underlying hardware resource topology information table, where $c_{ik}$ denotes the communication overhead between task $i$ and task $k$; assuming that the signal processing application graph consists of $N$ tasks and the platform provides $P$ heterogeneous processing resources, the corresponding computational overhead matrix is $W_{N \times P}$ and the communication overhead matrix is $C_{N \times N}$.
Further, the task scheduling method under the rapidly reconfigurable heterogeneous platform performs scheduling, specifically:
(1) parse the DAG application graph, find the entry task EntryTask and the exit task ExitTask, and add EntryTask to the ready queue ReadyQueue;
(2) collect the communication overheads between the tasks of the signal processing application and calculate their mean $\mu$ and standard deviation $\sigma$; using the $3\sigma$ principle of the Gaussian distribution model, screen out the task pairs whose communication overhead falls outside $2\sigma$, i.e. the task pairs $(i, k)$ satisfying $c_{ik} \geq \mu + 2\sigma$; bind these pairs as high-communication-overhead pairs and add the bound task pairs to the binding list;
(3) calculate the RankU values of all tasks and use them as the priority criterion for task selection, where a larger RankU value means a higher priority;
(4) when the ready queue ReadyQueue is not empty, execute steps (5) and (6); when the ready queue ReadyQueue is empty, execute step (7);
(5) check whether a task pair in the binding list is present in the ready queue ReadyQueue; if so, schedule the pair by assigning it to the processing resource that minimizes the earliest execution completion time (EFT) of the successor task of the pair, delete the pair from the ready queue ReadyQueue and from the binding list, add the newly ready tasks to the ready queue ReadyQueue, and return to step (4); if no task pair in the binding list is present in the ready queue ReadyQueue, execute step (6);
(6) select the task with the largest RankU in the ready queue ReadyQueue for scheduling; consider two cases, without and with replication of a predecessor task, and execute the task on the processing resource that gives the smallest earliest execution completion time (EFT) over both cases; delete the task from the ready queue ReadyQueue, add the newly ready tasks to the ready queue ReadyQueue, and return to step (4);
(7) obtain the complete task scheduling scheme and finish scheduling.
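The statistical screening in step (2) can be sketched as follows. This is a minimal illustration with hypothetical communication overheads; it assumes the population standard deviation (`statistics.pstdev`), since the text does not say whether the population or sample form is intended.

```python
from statistics import mean, pstdev

# Hypothetical communication overheads c_ik keyed by task pair (i, k).
C = {(0, 1): 10, (0, 2): 12, (1, 3): 11, (2, 3): 9,
     (1, 4): 10, (2, 4): 13, (4, 5): 8, (3, 5): 60}

costs = list(C.values())
mu = mean(costs)
sigma = pstdev(costs)        # assumption: population standard deviation
threshold = mu + 2 * sigma

# Pairs at or above mu + 2*sigma are treated as high-communication-overhead pairs.
binding_list = [pair for pair, c in C.items() if c >= threshold]
print(round(threshold, 2), binding_list)
```

Note that with very few edges no pair can reach the threshold: a single outlier among n values lies at most sqrt(n - 1) population standard deviations above the mean, so at least five edges are needed before any pair can be bound.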
Another object of the invention is to provide a rapidly reconfigurable signal processing heterogeneous platform system for implementing the task scheduling method under the rapidly reconfigurable heterogeneous platform. The system builds a large number of signal processing boards from heterogeneous processing resources such as CPUs, DSPs, GPUs and FPGAs, integrates all the boards into one chassis, exchanges data between processing resources over the bus on the chassis backplane and the chassis switch board, and stores the data and files required during processing in memory. The integrated software development environment is the software part of the platform and comprises a visual development interface, a task scheduling module, a resource monitoring module, a configuration control module, an encapsulation module, a communication control module and other functional modules.
Combining all the above technical solutions, the advantages and positive effects of the invention are as follows: the invention combines task replication with task binding, makes full use of the idle time of processing resources, offsets larger communication overhead with a certain amount of computational overhead, accelerates the execution of the whole application, and meets real-time requirements well. Compared with common scheduling algorithms of the same type, the proposed algorithm has better time performance.
The system of the invention uses a large amount of heterogeneous hardware resources to accelerate signal processing applications and, together with the upper-layer integrated software, achieves rapid reconfiguration and rapid iteration of application deployment. The algorithm of the invention improves on the HEFT and CEFT algorithms: by combining task replication with task binding, it makes full use of the idle time of processing resources and offsets larger communication overhead with a certain amount of computational overhead, thereby optimizing the time performance of the whole application and the utilization of system resources.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained from the drawings without creative efforts.
FIG. 1 is a block diagram of the task scheduling module provided by an embodiment of the present invention; in the figure: 1, task selection module; 2, processing resource selection module.
FIG. 2 is a schematic structural diagram of the rapidly reconfigurable signal processing heterogeneous platform system provided by an embodiment of the present invention.
FIG. 3 is a flowchart of the task scheduling method under the rapidly reconfigurable signal processing heterogeneous platform provided by an embodiment of the present invention.
FIG. 4 is a schematic diagram of the simulation results provided by an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides a task scheduling method, a task scheduling system and computer equipment under a fast reconfigurable heterogeneous platform, and the invention is described in detail below with reference to the attached drawings.
Persons of ordinary skill in the art may also implement the task scheduling method under the rapidly reconfigurable heterogeneous platform provided by the invention using other steps.
As shown in fig. 1, the task scheduling module under the fast reconfigurable heterogeneous platform provided by the present invention includes:
the task selection module 1 is used for selecting a task or binding the task according to the priority standard used in the task selection stage;
a processing resource selection module 2, configured to select a processing resource in the processing resource selection stage so as to minimize the earliest completion time of execution of the task or the bound task.
As shown in FIG. 2, the rapidly reconfigurable heterogeneous signal processing platform provided by the present invention consists of a hardware part and a software part. The hardware part consists of heterogeneous hardware such as FPGAs, DSPs, GPUs and CPUs. The software part virtualizes the underlying hardware into uniform logical resources, and the upper-layer software includes a real-time scheduling module, a configuration control module, a resource management module, a state monitoring module, a visual development interface and a display control interface. Developers can rapidly deploy and reconfigure signal processing applications in a graphical manner.
The technical solution of the present invention is further described with reference to FIG. 3.
The task scheduling algorithm accelerates signal processing applications in directed acyclic graph form on the rapidly reconfigurable heterogeneous signal processing platform. The scheduling process is divided into a task selection stage and a processing resource selection stage. Let $w_{ij}$ denote the computational overhead of task $i$ on processing resource $j$, and $c_{ik}$ the communication overhead between task $i$ and task $k$. For a given DAG signal processing application (assumed to consist of $N$ tasks, with the system providing $P$ heterogeneous processing resources), there is a corresponding computational overhead matrix $W_{N \times P}$ and a communication overhead matrix $C_{N \times N}$. The entry task is denoted EntryTask and the exit task is denoted ExitTask.
The priority criterion used in the task selection stage is
$$\mathrm{RankU}(i) = \overline{w_i} + \max_{k \in succ(i)} \left( \overline{c_{ik}} + \mathrm{RankU}(k) \right)$$
The larger the RankU value, the higher the priority. For the exit task of the application,
$$\mathrm{RankU}(\mathrm{ExitTask}) = \overline{w_{\mathrm{ExitTask}}}$$
where $\overline{w_i}$ is the mean computational overhead of task $i$ over the processing resources, $\overline{c_{ik}}$ is the average communication overhead from task $i$ to task $k$, and $succ(i)$ is the set of immediate successors of task $i$.
The criterion in the processing resource selection stage is to select the processing resource that minimizes the earliest execution completion time (EFT) of the task, calculated as $EFT(j) = w_{ij} + EST(j)$, where $EFT(j)$ is the earliest execution completion time of task $i$ on processing resource $j$ and $EST(j)$ is the earliest execution start time of task $i$ on processing resource $j$. $EST(j)$ is given by $EST(j) = \max\{avail(j), \max_{m \in pred(i)}(AFT(m) + c_{mi})\}$, where $avail(j)$ is the time at which processing resource $j$ becomes available, $AFT(m)$ is the actual finish time of task $m$ and equals its minimum $EFT$, and $pred(i)$ is the set of immediate predecessors of task $i$. For the entry task EntryTask, $EST(j) = 0$.
The scheduling algorithm of the invention comprises the following steps:
Step one, preparation before scheduling, specifically: based on the hardware processing resources provided by the platform and the signal processing tasks provided by the system task library, the computational overhead of each task on the different hardware processing resources is estimated using statistical and empirical methods, where $w_{ij}$ denotes the computational overhead of task $i$ on processing resource $j$; the dependencies of the DAG application are parsed from the signal processing application (in DAG form) that the developer deploys on the visual interface, and the communication overhead between dependent tasks is estimated by combining the parameters that the developer sets for the tasks with the underlying hardware resource topology information table, where $c_{ik}$ denotes the communication overhead between task $i$ and task $k$; this yields the computational overhead matrix $W_{N \times P}$ and the communication overhead matrix $C_{N \times N}$ for the signal processing application graph (assumed to consist of $N$ tasks, with the system providing $P$ heterogeneous processing resources).
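The patent does not specify how the estimates are computed, so the following Python sketch is only one plausible reading of the preparation step. It assumes that $w_{ij}$ is the mean of measured run times and that $c_{ik}$ is the transferred data volume divided by a link bandwidth taken from the topology table. All task names, samples, sizes and the bandwidth value are hypothetical.

```python
from statistics import mean

tasks = ["fft", "filter"]        # hypothetical task library entries
resources = ["CPU", "FPGA"]      # hypothetical processing resources

# Hypothetical measured run times in milliseconds per (task, resource).
samples = {
    ("fft", "CPU"): [7.8, 8.2, 8.0],
    ("fft", "FPGA"): [1.9, 2.1, 2.0],
    ("filter", "CPU"): [3.1, 2.9, 3.0],
    ("filter", "FPGA"): [1.0, 1.2, 1.1],
}

# W[i][j]: estimated computational overhead of task i on resource j (N x P).
W = [[mean(samples[(t, r)]) for r in resources] for t in tasks]

# Assumption: communication overhead = data volume / link bandwidth.
data_bytes = {("fft", "filter"): 4_000_000}   # set by the developer (hypothetical)
bandwidth_bytes_per_ms = 1_000_000            # from the topology table (hypothetical)

# C[i][k]: estimated communication overhead from task i to task k (N x N).
n = len(tasks)
C = [[0.0] * n for _ in range(n)]
for (src, dst), size in data_bytes.items():
    C[tasks.index(src)][tasks.index(dst)] = size / bandwidth_bytes_per_ms

print(W)
print(C)
```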
Step two, scheduling is executed, specifically:
(1) Parse the DAG application graph, find the entry task EntryTask and the exit task ExitTask, and add EntryTask to the ready queue ReadyQueue.
(2) Collect the communication overheads between the tasks of the signal processing application and calculate their mean $\mu$ and standard deviation $\sigma$. Using the $3\sigma$ principle of the Gaussian distribution model, screen out the task pairs whose communication overhead falls outside $2\sigma$, i.e. the task pairs $(i, k)$ satisfying $c_{ik} \geq \mu + 2\sigma$; bind these pairs as high-communication-overhead pairs and add them to the binding list.
(3) Calculate the RankU values of all tasks and use them as the priority criterion for task selection, where a larger RankU value means a higher priority.
(4) When the ready queue ReadyQueue is not empty, execute steps (5) and (6); when the ready queue ReadyQueue is empty, execute step (7).
(5) Check whether a task pair in the binding list is present in the ready queue ReadyQueue. If so, schedule the pair by assigning it to the processing resource that minimizes the earliest execution completion time (EFT) of the successor task of the pair, delete the pair from the ready queue ReadyQueue and from the binding list, add the newly ready tasks to the ready queue ReadyQueue, and return to step (4). If no task pair in the binding list is present in the ready queue ReadyQueue, execute step (6).
(6) Select the task with the largest RankU in the ready queue ReadyQueue for scheduling. Consider two cases, without and with replication of a predecessor task, and execute the task on the processing resource that gives the smallest earliest execution completion time (EFT) over both cases. Delete the task from the ready queue ReadyQueue, add the newly ready tasks to the ready queue ReadyQueue, and return to step (4).
(7) Obtain the complete task scheduling scheme and finish scheduling.
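Steps (1) to (7) can be sketched end to end in Python. This is a simplified illustration under several assumptions, not the patented implementation: tasks are appended after the last task on a resource (no insertion into idle gaps); communication is free between tasks on the same resource; a bound pair is treated as ready when its first task is ready and every other predecessor of its second task is already scheduled; at most one predecessor (the one whose data arrives last) is considered for replication; and the population standard deviation is used. The function name `schedule` and the example data are hypothetical.

```python
from statistics import mean, pstdev

def schedule(W, C):
    """W[i][j]: cost of task i on resource j; C[(i, k)]: cost of edge i -> k."""
    n, p = len(W), len(W[0])
    succ = {i: [] for i in range(n)}
    pred = {i: [] for i in range(n)}
    for i, k in C:
        succ[i].append(k)
        pred[k].append(i)

    rank = {}
    def rank_u(i):  # step (3): upward rank, memoised
        if i not in rank:
            tail = max((C[i, k] + rank_u(k) for k in succ[i]), default=0)
            rank[i] = mean(W[i]) + tail
        return rank[i]

    costs = list(C.values())  # step (2): bind pairs with c_ik >= mu + 2*sigma
    limit = mean(costs) + 2 * pstdev(costs)
    binding = [e for e, c in C.items() if c >= limit]

    avail = [0.0] * p
    copies = {i: [] for i in range(n)}  # task -> [(resource, start, finish)]

    def arrival(m, i, j):  # earliest time m's output is usable by i on j
        return min(f + (0 if r == j else C[m, i]) for r, _, f in copies[m])

    def est(i, j, skip=None):
        return max([avail[j]] + [arrival(m, i, j) for m in pred[i] if m != skip])

    def place(i, j):
        start = est(i, j)
        copies[i].append((j, start, start + W[i][j]))
        avail[j] = start + W[i][j]

    done, ready = set(), {i for i in range(n) if not pred[i]}  # step (1)
    while ready:  # step (4)
        pair = next(((i, k) for i, k in binding if i in ready
                     and all(m in done for m in pred[k] if m != i)), None)
        if pair:  # step (5): put both tasks on the resource minimising EFT of k
            i, k = pair
            def pair_eft(j):
                fin_i = est(i, j) + W[i][j]
                others = [arrival(m, k, j) for m in pred[k] if m != i]
                return max([fin_i] + others) + W[k][j]
            j = min(range(p), key=pair_eft)
            place(i, j)
            place(k, j)
            binding.remove(pair)
            newly = [i, k]
        else:  # step (6): highest RankU, with and without copying a predecessor
            i = max(ready, key=rank_u)
            options = []  # (EFT, resource, predecessor to copy or None)
            for j in range(p):
                options.append((est(i, j) + W[i][j], j, None))
                if pred[i]:
                    m = max(pred[i], key=lambda q: arrival(q, i, j))
                    copy_fin = est(m, j) + W[m][j]
                    start = max(copy_fin, est(i, j, skip=m))
                    options.append((start + W[i][j], j, m))
            _, j, m = min(options, key=lambda o: o[0])
            if m is not None:
                place(m, j)
            place(i, j)
            newly = [i]
        done.update(newly)
        ready -= set(newly)
        ready |= {s for t in newly for s in succ[t]
                  if s not in done and all(q in done for q in pred[s])}
    makespan = max(f for i in range(n) if not succ[i] for _, _, f in copies[i])
    return copies, makespan  # step (7)

# Hypothetical 6-task, 2-resource application; edge (3, 5) is the expensive one.
W = [[4, 6], [5, 3], [6, 4], [3, 5], [4, 4], [2, 3]]
C = {(0, 1): 3, (0, 2): 4, (1, 3): 3, (2, 3): 4, (1, 4): 3, (2, 4): 4, (3, 5): 30}
copies, makespan = schedule(W, C)
print(makespan, copies)
```

In this example the pair (3, 5) is bound and lands on one resource, and tasks 0 and 2 each end up with a second copy because re-executing them locally is cheaper than waiting for the transfer. One consequence of the simplified readiness rule is that a pair whose first task is selected by step (6) before the pair becomes ready is simply scheduled as two ordinary tasks.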
The technical effects of the present invention will be described in detail with reference to simulations.
To evaluate the task scheduling algorithm of the invention against common algorithms, the simulation compares it with the classic HEFT algorithm and with the task-replication-based CEFT algorithm. The first performance parameter in the simulation is the execution time of the whole application, the makespan, defined as:
makespan=max{AFT(ExitTask)}
The second is the scheduling length ratio (SLR), defined as:
Figure BDA0002933009630000101
In our simulation, the average scheduling length ratio is used as the performance index for comparing the three scheduling algorithms. The simulation result graph shows that, for the same number of tasks, the proposed algorithm achieves a better average scheduling length ratio. Compared with the classic HEFT algorithm, its performance is significantly better, because the combined effect of task replication and task binding greatly reduces the communication overhead between tasks. Compared with the CEFT algorithm, which uses task replication, its performance is improved to a certain extent, because binding the high-communication-overhead task pairs eliminates the communication overhead between those pairs.
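The two metrics can be computed as in the sketch below. The SLR formula appears only as an image in the source, so this sketch assumes the definition commonly used in the HEFT literature: the makespan divided by the length of the longest path through the DAG when every task runs on its fastest resource and communication is ignored. The function name `slr` and the example data are hypothetical.

```python
def slr(makespan, W, succ):
    """Assumed SLR: makespan / longest path using each task's minimum cost."""
    w_min = [min(row) for row in W]
    memo = {}

    def longest_from(i):
        # Length of the longest minimum-cost path starting at task i.
        if i not in memo:
            memo[i] = w_min[i] + max((longest_from(k) for k in succ[i]), default=0)
        return memo[i]

    lower_bound = max(longest_from(i) for i in range(len(W)))
    return makespan / lower_bound

# Hypothetical data: 6 tasks on 2 resources and a makespan of 17 time units.
W = [[4, 6], [5, 3], [6, 4], [3, 5], [4, 4], [2, 3]]
succ = {0: [1, 2], 1: [3, 4], 2: [3, 4], 3: [5], 4: [], 5: []}
print(slr(17, W, succ))
```

Under this assumed definition the denominator is a lower bound on any feasible makespan, so SLR is at least 1 and lower values are better.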
It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a disk, CD- or DVD-ROM, on programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, by semiconductors such as logic chips and transistors, by programmable hardware devices such as field programmable gate arrays and programmable logic devices, by software executed by various types of processors, or by a combination of hardware circuits and software, e.g., firmware.
The above description only illustrates the present invention and is not to be construed as limiting its scope, which is intended to cover all modifications, equivalents and improvements that fall within the spirit and scope of the invention as defined by the appended claims.

Claims (8)

1. A task scheduling method under a rapid reconfigurable signal processing heterogeneous platform, characterized in that the method uses a statistical standard to screen and bind tasks with high communication overhead in a task selection stage, and uses a given priority standard to select tasks for scheduling; and selects a processing resource in a processing resource selection stage so as to minimize the earliest execution completion time of the task or bound task;
where w_ij represents the computational overhead of task i on processing resource j and c_ik represents the communication overhead between task i and task k; for a given DAG signal processing application comprising N tasks, on a system providing P heterogeneous processing resources, there are a corresponding calculation overhead matrix W_{N×P} and a communication overhead matrix C_{N×N}, the entry task being EntryTask and the exit task being ExitTask.
2. The task scheduling method under the rapid reconfigurable signal processing heterogeneous platform according to claim 1, wherein the 3σ principle of the Gaussian distribution model is used to select the task pairs whose communication overhead falls in the upper region beyond 2σ, namely the task pairs (i, k) satisfying c_ik ≥ μ + 2σ, and these task pairs are bound as high-communication-overhead pairs; the priority criterion used in the task selection phase is
Figure FDA0002933009620000011
where a larger RankU value means a higher priority; for the exit task of the application,
Figure FDA0002933009620000012
wherein
Figure FDA0002933009620000013
represents the mean of the computational overhead of task i over the processing resources,
Figure FDA0002933009620000014
represents the average communication overhead from task i to task k, and succ(i) represents the set of immediate successors of task i.
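The RankU formulas of this claim are only available as images here. The following Python sketch therefore assumes the upward rank as it is usually defined for HEFT, which matches the quantities named in the claim (mean computation cost, average communication cost, immediate successors): the rank of an exit task is its mean computation cost, and the rank of any other task is its mean computation cost plus the largest value of communication cost plus successor rank. The function name `upward_ranks` is hypothetical, and the entries of C are used directly as the average communication overheads.

```python
from functools import lru_cache


def upward_ranks(w, c, succ):
    """Assumed HEFT-style upward rank for every task of a DAG."""
    mean_w = [sum(row) / len(row) for row in w]   # mean cost of task i over all resources

    @lru_cache(maxsize=None)
    def rank(i):
        if not succ[i]:                            # exit task: mean cost only
            return mean_w[i]
        return mean_w[i] + max(c[i][k] + rank(k) for k in succ[i])

    return [rank(i) for i in range(len(w))]


# Illustrative three-task graph: task 0 feeds tasks 1 and 2.
w = [[2, 4], [3, 5], [4, 2]]
c = [[0, 1, 5], [0, 0, 0], [0, 0, 0]]
succ = [[1, 2], [], []]
print(upward_ranks(w, c, succ))  # [11.0, 4.0, 3.0]
```

Sorting the ready tasks by this value in descending order gives the selection order described in the claim.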
3. The task scheduling method under the rapid reconfigurable signal processing heterogeneous platform according to claim 1, characterized in that the criterion in the processing resource selection phase is to select the processing resource that minimizes the earliest execution completion time EFT of the task or bound task, the EFT being calculated by the formula EFT(j) = w_ij + EST(j), where EFT(j) represents the earliest completion time of task i on processing resource j and EST(j) represents the earliest start time of task i on processing resource j; EST(j) is given by the formula EST(j) = max{avail(j), max_{m∈pred(i)}(AFT(m) + c_mi)}, where avail(j) is the time at which processing resource j becomes available, AFT(m) is equal to the minimum EST(m), EST(j) = 0 for the entry task EntryTask, and pred(i) represents the set of immediate predecessors of task i.
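The two formulas of this claim translate almost directly into Python. The sketch below is a minimal illustration with hypothetical function names (`est`, `eft`, `best_resource`); it applies the formulas literally, so the communication term c_mi is always added, and it takes the AFT values of already scheduled predecessors as given input.

```python
def est(task, proc, c, preds, aft, avail):
    """EST(j) = max{avail(j), max over m in pred(i) of (AFT(m) + c_mi)}."""
    data_ready = max((aft[m] + c[m][task] for m in preds[task]), default=0.0)
    return max(avail[proc], data_ready)


def eft(task, proc, w, c, preds, aft, avail):
    """EFT(j) = w_ij + EST(j)."""
    return w[task][proc] + est(task, proc, c, preds, aft, avail)


def best_resource(task, w, c, preds, aft, avail):
    """Index of the processing resource with the smallest EFT for `task`."""
    return min(range(len(avail)),
               key=lambda j: eft(task, j, w, c, preds, aft, avail))


# Illustrative data: task 2 depends on tasks 0 and 1, which are already scheduled.
w = [[2, 3], [4, 2], [6, 3]]              # w[i][j]
c = [[0, 0, 1], [0, 0, 4], [0, 0, 0]]     # c[m][i]
preds = [[], [], [0, 1]]
aft = {0: 2.0, 1: 3.0}
avail = [2.0, 9.0]

print(est(2, 0, c, preds, aft, avail))    # 7.0  (data ready at 3 + 4)
print(eft(2, 0, w, c, preds, aft, avail)) # 13.0
print(eft(2, 1, w, c, preds, aft, avail)) # 12.0 (resource 1 is free at 9)
print(best_resource(2, w, c, preds, aft, avail))  # 1
```

For an entry task with an idle resource both terms of the outer maximum are zero, which gives the EST of 0 stated in the claim.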
4. The task scheduling method under the rapid reconfigurable signal processing heterogeneous platform according to claim 1, wherein the preparation before task scheduling specifically comprises: based on the heterogeneous hardware processing resources provided by the platform and the signal processing tasks provided by the system task library, estimating the calculation overhead value of each task on the different hardware processing resources by statistical and empirical methods, w_ij representing the computational overhead of task i on processing resource j; analyzing the dependency relationships of the DAG application according to the signal processing application deployed by a developer on the visual interface, and predicting the communication overhead value between dependent tasks by combining the parameters set for the tasks by the developer with the underlying hardware resource topology information table, c_ik representing the communication overhead between task i and task k; assuming that the signal processing application graph is composed of N tasks and the platform provides P heterogeneous processing resources, the corresponding calculation overhead matrix is W_{N×P} and the communication overhead matrix is C_{N×N}.
5. The task scheduling method under the rapid reconfigurable signal processing heterogeneous platform according to claim 4, wherein the scheduling specifically comprises:
(1) analyzing a DAG application diagram, searching an entry task EntryTask and an exit task ExitTask, and assigning the EntryTask to a ready queue ReadyQueue;
(2) collecting the communication overheads between the tasks of the signal processing application and calculating their mean μ and standard deviation σ; using the 3σ principle of the Gaussian distribution model, selecting the task pairs whose communication overhead falls in the upper region beyond 2σ, namely the task pairs (i, k) satisfying c_ik ≥ μ + 2σ, binding them as high-communication-overhead pairs, and adding the bound task pairs to the binding List;
(3) calculating RankU values of all tasks, and taking the RankU values as priority standards when the tasks are selected, wherein the larger the RankU value is, the higher the priority is;
(4) when the ready queue ReadyQueue is not empty, executing the steps (5) and (6); performing step (7) when the ready queue ReadyQueue is empty;
(5) judging whether a task pair from the binding List is present in the ready queue ReadyQueue; if so, scheduling the task pair by assigning it to the processing resource that minimizes the earliest execution completion time EFT of the successor task in the pair, deleting the task pair from the ready queue ReadyQueue and from the binding List, adding any newly ready tasks to the ready queue ReadyQueue, and returning to step (4); if no task pair from the binding List is present in the ready queue ReadyQueue, executing step (6);
(6) selecting the task with the largest RankU in the ready queue ReadyQueue for scheduling, considering two cases, without and with copying the predecessor task, assigning the task to the processing resource that minimizes its earliest execution completion time EFT across the two cases, deleting the task from the ready queue ReadyQueue, adding any newly ready tasks to the ready queue ReadyQueue, and returning to step (4);
(7) obtaining the complete task scheduling scheme and finishing scheduling.
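The statistical screening of step (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `high_comm_pairs` is hypothetical, only nonzero entries of C are treated as dependency edges, and the population standard deviation is used because the text does not say which estimator is intended.

```python
from statistics import mean, pstdev


def high_comm_pairs(c):
    """Return the task pairs (i, k) whose communication cost satisfies c_ik >= mu + 2*sigma."""
    costs = {(i, k): c[i][k]
             for i in range(len(c))
             for k in range(len(c))
             if c[i][k] > 0}                      # assumption: zero means "no edge"
    values = list(costs.values())
    mu = mean(values)
    sigma = pstdev(values)                        # assumption: population standard deviation
    threshold = mu + 2 * sigma
    return [pair for pair, cost in costs.items() if cost >= threshold]


# Illustrative 5-task graph with six edges, one of which is much more expensive.
c = [[0, 1, 1, 1, 0],
     [0, 0, 0, 20, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]

binding_list = high_comm_pairs(c)
print(binding_list)  # [(1, 3)]
```

With the population standard deviation, a graph needs at least five edges before any single cost can reach μ + 2σ, so on very small graphs the binding list stays empty and the procedure falls through to step (6) for every task.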
6. A signal processing platform terminal, characterized in that the terminal is used to implement the task scheduling method under the rapid reconfigurable signal processing heterogeneous platform according to any one of claims 1 to 5.
7. A task scheduling module under a rapid reconfigurable signal processing heterogeneous platform for implementing the method according to any one of claims 1 to 5, characterized by comprising:
a task selection module, used in the task selection stage to screen and bind tasks with high communication overhead using a statistical standard and to select tasks for scheduling using a priority standard;
and a processing resource selection module, used in the processing resource selection stage to select the processing resource that minimizes the earliest execution completion time of the task or bound task.
8. A rapid reconfigurable signal processing heterogeneous platform system, characterized in that the system is used to implement the task scheduling method under the rapid reconfigurable signal processing heterogeneous platform according to any one of claims 1 to 5. The platform consists of a software part and a heterogeneous hardware part; the software part is an integrated software development environment comprising a visual development interface, a task scheduling module, a resource monitoring module, a configuration control module, an encapsulation module, a communication control module and other functional modules; the heterogeneous hardware part consists of an FPGA, a DSP, a CPU, a GPU and a VPX chassis.
CN202110152746.4A 2021-02-04 2021-02-04 Task scheduling method and system under rapid reconfigurable signal processing heterogeneous platform Active CN112905317B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110152746.4A CN112905317B (en) 2021-02-04 2021-02-04 Task scheduling method and system under rapid reconfigurable signal processing heterogeneous platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110152746.4A CN112905317B (en) 2021-02-04 2021-02-04 Task scheduling method and system under rapid reconfigurable signal processing heterogeneous platform

Publications (2)

Publication Number Publication Date
CN112905317A true CN112905317A (en) 2021-06-04
CN112905317B CN112905317B (en) 2023-12-15

Family

ID=76122074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110152746.4A Active CN112905317B (en) 2021-02-04 2021-02-04 Task scheduling method and system under rapid reconfigurable signal processing heterogeneous platform

Country Status (1)

Country Link
CN (1) CN112905317B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100242041A1 (en) * 2009-03-17 2010-09-23 Qualcomm Incorporated Real Time Multithreaded Scheduler and Scheduling Method
CN101996105A (en) * 2010-07-09 2011-03-30 中国科学技术大学苏州研究院 Static software/hardware task dividing and dispatching method for reconfigurable computing platform
US20110239017A1 (en) * 2008-10-03 2011-09-29 The University Of Sydney Scheduling an application for performance on a heterogeneous computing system
US20170277654A1 (en) * 2015-03-27 2017-09-28 Huawei Technologies Co., Ltd. Method and apparatus for task scheduling on heterogeneous multi-core reconfigurable computing platform
CN107301500A (en) * 2017-06-02 2017-10-27 北京工业大学 A kind of workflow schedule method looked forward to the prospect based on critical path task
US20170315846A1 (en) * 2015-01-16 2017-11-02 Huawei Technologies Co., Ltd. Task scheduling method and apparatus on heterogeneous multi-core reconfigurable computing platform
CN110018887A (en) * 2018-01-10 2019-07-16 苏州智配信息科技有限公司 Task schedule and Resource Management Algorithm on a kind of Reconfigurable Platform
CN112181613A (en) * 2020-09-09 2021-01-05 国家计算机网络与信息安全管理中心 Heterogeneous resource distributed computing platform batch task scheduling method and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
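Zhang Xiaoqing (张小庆); Liu Renfeng (刘仁峰): "A cloud task scheduling algorithm satisfying energy efficiency" (一种满足能效的云任务调度算法), Journal of Wuhan Polytechnic University (武汉轻工大学学报), no. 04 *
Cai Changxu (蔡昌许): "Cloud computing task scheduling algorithm based on duplication heterogeneous earliest finish time" (基于重复异构最早完成时间的云计算任务调度算法), Journal of Southwest China Normal University, Natural Science Edition (西南师范大学学报(自然科学版)), no. 05 *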

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113612642A (en) * 2021-08-03 2021-11-05 北京八分量信息科技有限公司 Communication overhead-based heterogeneous task depicting method and device and related products
CN113612642B (en) * 2021-08-03 2024-03-08 北京八分量信息科技有限公司 Method and device for describing heterogeneous tasks based on communication overhead and related products
CN113806044A (en) * 2021-08-31 2021-12-17 天津大学 Heterogeneous platform task bottleneck elimination method for computer vision application
CN113806044B (en) * 2021-08-31 2023-11-07 天津大学 Heterogeneous platform task bottleneck eliminating method for computer vision application
CN113886111A (en) * 2021-10-15 2022-01-04 中国科学院信息工程研究所 Workflow-based data analysis model calculation engine system and operation method
CN115033373A (en) * 2022-03-08 2022-09-09 西安电子科技大学 Method for scheduling and unloading logic dependency tasks in mobile edge computing network

Also Published As

Publication number Publication date
CN112905317B (en) 2023-12-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant