WO2008148624A1 - Method and device for providing a schedule for a predictable operation of an algorithm on a multi-core processor - Google Patents

Method and device for providing a schedule for a predictable operation of an algorithm on a multi-core processor

Info

Publication number
WO2008148624A1
Authority
WO
WIPO (PCT)
Prior art keywords
optimization
algorithm
tasks
alg
time
Prior art date
Application number
PCT/EP2008/055907
Other languages
French (fr)
Inventor
Andrey Nechypurenko
Egon Wuchner
Original Assignee
Siemens Aktiengesellschaft
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Aktiengesellschaft filed Critical Siemens Aktiengesellschaft
Publication of WO2008148624A1 publication Critical patent/WO2008148624A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs

Abstract

The invention describes a method for providing a schedule for a predictable operation of an algorithm (ALG) on a multi-core processor comprising a plurality of parallel working cores (P1,...,P8), comprising the steps of - creating a model of the algorithm (ALG), thereby identifying tasks (t1,..., t22) of the algorithm (ALG) and at least one characteristic of each of the tasks (t1,..., t22); - exploiting an optimization method taking account of a first optimization criterion, thereby assigning each identified task (t1,..., t22) according to its at least one characteristic to at least some of the plurality of cores (P1,..., P8) of the multi-core processor and determining a starting point in time for each of the tasks (t1,..., t22) of the algorithm (ALG); - repeatedly exploiting the optimization method taking account of at least one second optimization criterion, thereby outputting the starting point in time for each of the tasks (t1,..., t22), the first and the at least one second optimization criterion; and - analysing each schedule of operation of the algorithm (ALG) with regard to the starting point in time for each of the tasks (t1,..., t22), the first and the at least one second optimization criterion to determine the best combination of the first and the at least one second optimization criteria.

Description

Method and Device for providing a schedule for a predictable operation of an algorithm on a multi-core processor
The invention relates to a method for providing a schedule for a predictable operation of an algorithm on a multi-core processor comprising a plurality of parallel working cores. The invention further relates to a device for creating a schedule for a predictable operation of an algorithm on a multi-core processor comprising a plurality of parallel working cores. The invention further relates to a computer program product.
Normally, an algorithm comprises a plurality of different tasks, each task having a specific duration. Furthermore, there are task dependencies and data exchange between some of the tasks. These tasks are called interdependent tasks. The operation time of an algorithm depends on many aspects, especially the task dependencies, the task durations and a so-called domain-motivated explicit mapping of tasks or task groups to certain cores of a multi-core processor. Improving the operation time of an algorithm implies a better data throughput or a better task scalability in general. In particular, real-time or live-processing algorithms which are used in medical instruments need feasible run-time predictions with regard to their operation time.
Even in case run-time predictions are not needed, the dynamic scheduling of an algorithm with interdependent tasks has to be based on the task dependencies and durations. The tasks of the algorithm have to be executed in the sequence of their specified dependencies. In addition, there has to be a certain kind of fairness with respect to all dispatched algorithms processed by the multi-core processor at a certain time. When a plurality of algorithms is executed in parallel, it has to be guaranteed that no previously dispatched algorithm starves. Rather, an algorithm should be able to complete before a new or the same algorithm starts. Besides a time-optimized scheduling of the tasks specific to an algorithm, trade-offs sometimes have to be made, because a time-robust task-to-core deployment using as few cores of the multi-core processor as possible can be just as important as a time-optimized task-to-core deployment. Consequently, a general-purpose scheduling policy does not suffice, since there are several dimensions of variability: the number of cores of the multi-core processor; the question whether the multi-core processor comprises asymmetric or symmetric cores; a robust versus a time-optimized deployment of the tasks of an algorithm; and significant time improvements due to further parallelizability of certain tasks.
Currently, when using so-called symmetric multi-core processors (as provided by Intel or AMD), task scheduling is under the control of the operating system. A symmetric multi-core processor means that all cores of the multi-core processor have the same capability. Even real-time operating systems schedule their execution units, i.e. tasks, threads or processes, in a transparent way according to priorities, execution unit states and queues. Executing an algorithm with a real-time operating system therefore makes it necessary to assign a priority to each task of the algorithm. Several approaches are known for mapping tasks to priorities in order to derive a feasible and predictable algorithm duration. Unfortunately, such a mapping has to be done manually by the programmer, leveraging his experience and knowledge. Furthermore, this procedure is hardly applicable to algorithms having interdependent tasks.
Even when using so-called asymmetric multi-core processors or a combination of different multi-core units, like the so-called IBM Cell Broadband Engine (CBE) and graphical processing units (GPUs), the scheduling of the tasks of an algorithm has to be done manually. The scheduling result has to be programmed explicitly. Each scheduling solution needs to consider which tasks are able to run on which cores and needs to specify and trigger task-to-core deployments explicitly.
It is therefore an object of the present invention to provide a method for providing a schedule for a predictable operation of an algorithm on a multi-core processor. It is a further object of the present invention to provide a device for creating a schedule for a predictable operation of an algorithm on a multi-core processor. Furthermore, a computer program product directly loadable into the internal memory of a digital computer has to be provided.
These objects are solved by a method according to claim 1, a computer program product according to claim 11 and a device according to claim 12. Preferred embodiments of the invention are set out in the dependent claims.
A method for providing a schedule for a predictable operation of an algorithm on a multi-core processor, the multi-core processor comprising a plurality of parallel working cores, comprises the steps of creating a model of the algorithm, thereby identifying tasks of the algorithm and at least one characteristic of each of the tasks; exploiting an optimization method taking account of a first optimization criterion, thereby assigning each identified task according to its at least one characteristic to at least some of the plurality of cores of the multi-core processor and determining a starting point in time for each of the tasks of the algorithm; repeatedly exploiting the optimization method taking account of at least one second optimization criterion, thereby outputting the starting point in time for each of the tasks as well as the first and the at least one second optimization criterion; and analysing each schedule of operation of the algorithm with regard to the starting point in time for each of the tasks and the first and the at least one second optimization criterion to determine the best combination of the first and the at least one second optimization criteria.
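The sequence of steps can be illustrated with the following Python sketch. It is an illustration only: the function signature, the passed-in optimizer and the "score" attribute are assumptions made for this example and are not part of the described method.

```python
# Illustrative sketch of the sequence of method steps; names are assumptions.
def provide_schedule(algorithm_model, optimize, first_criterion, second_criteria):
    """Return the schedule judged best over all combinations of criteria."""
    # Exploit the optimization method with the first criterion only.
    candidates = [optimize(algorithm_model, [first_criterion])]

    # Repeatedly exploit the optimization method, each run taking at least
    # one second optimization criterion into account.
    for extra in second_criteria:
        candidates.append(optimize(algorithm_model, [first_criterion, extra]))

    # Analyse every proposed schedule (start time per task plus the criteria
    # considered) and keep the best combination.
    return min(candidates, key=lambda schedule: schedule.score)
```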
Scheduling a predictable operation means that the overall time of the algorithm, the number of cores needed for executing the algorithm, the time-robustness and the resulting utilization of those cores are identical with each execution of the algorithm.
A scheduling policy on multi-core processors must not be dominated by one dimension, as for example its overall time-optimization. Applying the method according to the invention means making a trade-off analysis for complex algorithms so that their task-to-core deployment and scheduling addresses this problem. It takes several dimensions of variability, i.e. optimization criteria, as input parameters. The delivered result is a set of scheduling solutions for each tuple of input values and the impact of each solution, like its time-optimization, the scale of its time-robustness or the specific tasks worth being further parallelized.
Thereby, the power of the multi-core processor can be fully exploited. The schedule for the algorithm furthermore allows a predictable operation. Therefore, the method according to the invention allows providing algorithms which can be used for critical applications, e.g. medical measurements. An advantage over the prior art is that the method according to the invention can be carried out automatically by a computer system instead of a manual approach. Therefore, the method according to the invention may also be used for algorithms with changing tasks, dependencies and durations, because scheduling a new predictable operation of the algorithm can even be done dynamically.
The first optimization criterion may correspond to a time-optimization of the algorithm. The starting point for the method according to the invention is therefore to provide a schedule for a predictable, time-optimized operation of an algorithm.
The at least one second optimization criterion may correspond to a fixed number of cores and/or a discrete range of cores and/or a time-robustness of the schedule and/or parallelization options of certain tasks. According to the at least one second optimization criterion used, an already provided schedule for the operation of the algorithm can be varied to determine the best combination of the first and the at least one second criteria for the schedule of the tasks of the algorithm.
According to an improvement, only some of the second optimization criteria are considered while repeatedly exploiting the optimization method. As a result, one or more solutions for scheduling the tasks of the algorithm can be found. In contrast to using all of the second optimization criteria, the time for finding one or more solutions can be reduced.
According to a further embodiment, different second optimization criteria are considered during each run of the optimization method. This step is done before analysing each schedule of operation of the algorithm with regard to the starting point in time for each of the tasks and the first and the at least one second optimization criteria to determine the best combination of the first and the at least one second criteria.
The method steps of the invention may be carried out by a computer. Alternatively, it is also possible that some of the method steps of the invention are carried out by a programmer.
As an optimization method, an optimization algorithm or a heuristic may be used. Known optimization algorithms and heuristics can be applied, such as the optimization method "Simulated Annealing" or the heuristic "First Fit Decreasing".
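As an illustration of the "First Fit Decreasing" heuristic mentioned above, the following sketch packs tasks onto as few cores as possible given a per-core time budget. It ignores task dependencies, and the durations and the 600 ms budget are assumed example values, not values from the description.

```python
# Minimal First Fit Decreasing sketch for task-to-core packing (illustrative).
def first_fit_decreasing(durations_ms, core_budget_ms):
    """Pack tasks (ignoring dependencies) onto as few cores as possible."""
    cores = []  # each entry: [remaining budget, list of assigned task ids]
    for task, duration in sorted(durations_ms.items(), key=lambda kv: -kv[1]):
        for core in cores:
            if core[0] >= duration:          # first core with enough room
                core[0] -= duration
                core[1].append(task)
                break
        else:                                # no core fits -> open a new one
            cores.append([core_budget_ms - duration, [task]])
    return [tasks for _, tasks in cores]

# Example call with assumed durations and a 600 ms per-core budget.
print(first_fit_decreasing({"t0": 200, "t7": 50, "t9": 400, "t12": 300}, 600))
```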
According to a further embodiment of the invention, the first and/or the at least one second optimization criterion is weighted with a respective weight parameter. A weight parameter may be specified for one, for some or for all of the first and/or the at least one second optimization criteria. As a result, only those schedules (defined by the starting point in time for each of the tasks, the first and the at least one second optimization criterion) are output which meet the specified weighted criteria.
Furthermore, the first optimization criterion may be specified as a range. Accordingly, the at least one second optimization criterion may be specified as a range.
According to a further embodiment, the analysis of each schedule of operation of the algorithm ends up with one or more suggestions of the first and the at least one second optimization criterion. After that, a choice can be made, automatically or by the software engineer, as to which of the proposed schedules is the most practical solution for the algorithm running on a specific multi-core processor.
According to a second aspect of the invention a computer program product directly loadable into the internal memory of a digital computer is suggested, the computer program product comprising software code portions for performing the steps of the method according to the invention when said product is run on a computer.
According to a third aspect of the invention a device for providing a schedule for a predictable operation of an algorithm on a multi-core processor comprising a plurality of parallel working cores is suggested. The device comprises a first means for creating a model of the algorithm, thereby identifying tasks of the algorithm and at least one characteristic of each of the tasks; a second means for exploiting an optimization method taking account of a first optimization criterion, thereby assigning each identified task according to its at least one characteristic to at least some of the plurality of cores of the multi-core processor and for determining a starting point in time for each of the tasks of the algorithm, a third means for repeatedly exploiting the optimization method taking account of at least one second optimization criterion, thereby outputting the starting point in time for each of the tasks, the first and the at least one second optimization criterion; and a fourth means for analysing each schedule of operation of the algorithm with regard to the starting point in time for each of the tasks, the first and the at least one second optimization criterion to determine the best combination of the first and the at least one second optimization criteria.
A device according to the invention has the same advantages pointed out in connection with the described method.
Furthermore, the device according to the invention comprises further means for executing the method steps as set out above.
The invention will be described in more detail by reference to the accompanying figures.
Fig. 1 shows a simplified, already analysed model of an algorithm containing a plurality of tasks, and
Fig. 2 shows a schedule of the tasks of the algorithm of Fig. 1 assigned to a plurality of cores of a multi-core processor.
Fig. 1 shows a simplified model of an algorithm ALG consisting of tasks t0 to t22. Each of the tasks t0,...,t22 has a specific duration which is indicated by the reference numeral "delta". For instance, task t0 has a duration of delta = 200 ms. Task t7 has a duration of delta = 50 ms, and so on. Furthermore, Fig. 1 shows the dependencies between some of the tasks. For example, the dependency between task t9 and task t7 is indicated with d1. This means that between tasks t7 and t9 there is a data exchange and task t7 has to wait for some input information from task t9. Accordingly, tasks t6, t8 and t10 need input information from task t7. These dependencies are outlined with d4, d5 and d6. In an analogous manner, dependencies d1 to d32 between further interdependent tasks are indicated.
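Purely as an illustration, such an analysed model could be held in two dictionaries as sketched below. Only the delta values of t0 and t7 and the edges d1 and d4 to d6 are taken from the description above; all other entries are elided, and the layout itself is an assumption.

```python
# Possible in-memory form of the analysed model of Fig. 1 (illustrative only).
durations_ms = {
    "t0": 200,   # delta = 200 ms
    "t7": 50,    # delta = 50 ms
    # ... durations of the remaining tasks t1..t22
}

dependencies = {                 # dependency id -> (producing task, consuming task)
    "d1": ("t9", "t7"),          # t7 waits for input information from t9
    "d4": ("t7", "t6"),
    "d5": ("t7", "t8"),
    "d6": ("t7", "t10"),
    # ... d2..d32 between the further interdependent tasks
}
```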
Fig. 1 exemplarily shows the algorithm after performing the step of creating a model of the algorithm, thereby identifying the tasks t1,...,t22 of the algorithm ALG and identifying characteristics of each of the tasks. The characteristics used for a first step of optimization and shown in Fig. 1 are the duration of each of the tasks and the dependencies to other tasks. Thereafter, a time-optimized schedule for the algorithm ALG will be determined by an optimization method, e.g. any optimization algorithm or a heuristic. The determination of the time-optimized scheduling can be done automatically. Applying a time-optimization to the algorithm shown in Fig. 1 results in using eight cores of a multi-core processor (which might have more than eight cores in total) and an overall duration of the algorithm of 1850 ms. The resulting task-to-core scheduling is outlined in Fig. 2.
As can be seen from Fig. 2, each of the tasks t0,...,t22 is assigned to a specific core P1,...,P8: task t22 is assigned to core P1. Tasks t1, t19 and t4 are assigned to core P4. Tasks t5, t18 and t11 are assigned to core P5. Tasks t0, t6, t15 and t17 are assigned to core P6. Tasks t7, t8, t9, t14 and t2 are assigned to core P7. Tasks t13 and t21 are assigned to core P2. Tasks t16, t20 and t3 are assigned to core P3. Last, tasks t10 and t12 are assigned to core P8. The assignment to the eight cores is only exemplary. However, another task-to-core deployment could be used as a starting point for the method according to the invention. Furthermore, the dependencies d1,...,d32 which are outlined in Fig. 1 are shown in Fig. 2, too.
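The schedule of Fig. 2 pairs each task with a core and a starting point in time. The sketch below shows one simple way such starting points could be derived from a given task-to-core assignment, assuming an acyclic dependency graph and that a task starts as soon as its core is idle and all of its inputs have finished; the function and the data shapes are illustrative assumptions, not the described method.

```python
# Sketch: derive a starting point in time per task from a given assignment.
# Expects complete durations_ms and dependencies dictionaries (see the sketch
# after Fig. 1) and a mapping such as core_of = {"t22": "P1", ...} as in Fig. 2.
def start_times(durations_ms, dependencies, core_of):
    preds = {t: set() for t in durations_ms}
    for src, dst in dependencies.values():
        preds[dst].add(src)
    start, finish, core_free = {}, {}, {}
    pending = set(durations_ms)
    while pending:
        # pick any task whose predecessors have all finished (simple list schedule)
        task = next(t for t in pending if preds[t].issubset(finish))
        ready = max((finish[p] for p in preds[task]), default=0)
        start[task] = max(ready, core_free.get(core_of[task], 0))
        finish[task] = start[task] + durations_ms[task]
        core_free[core_of[task]] = finish[task]
        pending.remove(task)
    return start
```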
Further criteria which can affect a time-optimization for the schedule for a predictable operation of the algorithm ALG may be data exchanges between the tasks, task durations, core candidates per task and task groups to be deployed to one core.
Further parameters or criteria besides time-optimization for finding an optimized schedule for the algorithm ALG are a fixed number of cores of the multi-core processor, a discrete range of cores, a time-robustness of the schedule and parallelization options of certain tasks. These criteria can be called trade-off criteria. Some or all of these criteria will be used as input information for exploiting optimization methods (algorithms or heuristics) to suggest possible schedules of tasks to certain cores at specific times. The input criteria, which represent further "dimensions of variability" of the algorithm besides time-optimization, can be dealt with by an interactive simulation of deployment suggestions and an impact analysis of alternative deployment options and their trade-offs.
According to this, the optimization method is repeatedly exploited, taking account of some or all of the above-mentioned optimization criteria, thereby outputting the starting point in time for each of the tasks and the optimization criteria considered for each possible schedule. Thereafter, each proposed schedule of operation of the algorithm is analysed with regard to its starting point in time for each of the tasks and the optimization criteria considered, in order to determine the best combination of the optimization criteria.
The impact analysis may encompass the following usage scenarios: A fixed number of cores of the multi-core processor may be specified. As a result, a suggestion about a time-optimal scheduling solution feasible within the bounds of the given cores is provided.
Alternatively, a (discrete) range of cores (e.g. 8, 12 or 16 available cores) can be specified. As a result, deployment suggestions sorted along several criteria, like operation time and needed number of cores, are provided. By selecting one of these scheduling solutions, a simulation of different task duration variations can be executed in order to see the time-robustness of a scheduling solution depending on ranges of task durations. Thus, one is able to simulate several deployment suggestions by using a range of cores and task duration variations as input parameters. The delivered result is the impact of all input values within the specified range on the overall optimized algorithm time and its time-robustness.
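A minimal sketch of such a robustness simulation is given below, assuming a fixed deployment, a +/-20 % variation of every task duration and the start_times() helper from the earlier sketch; all of these choices are assumptions made for illustration.

```python
# Monte-Carlo style robustness check for one fixed task-to-core deployment.
import random

def time_robustness(durations_ms, dependencies, core_of, runs=1000, spread=0.2):
    totals = []
    for _ in range(runs):
        # vary each task duration within the assumed range
        varied = {t: d * random.uniform(1.0 - spread, 1.0 + spread)
                  for t, d in durations_ms.items()}
        starts = start_times(varied, dependencies, core_of)
        totals.append(max(starts[t] + varied[t] for t in varied))
    # a narrow interval indicates a time-robust schedule
    return min(totals), max(totals)
```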
As a further option, a selection criterion may be specified, e.g. a weighted measure of all variability dimensions like time-optimization, time-robustness, needed number of cores etc. As input, a range of available cores, task durations and time-optimization ranges with minimums and maximums can be specified. Another input value is the weight, which expresses the importance of each input dimension and is needed to determine the final scheduling policy. For instance, the needed number of cores should influence the selection of the final scheduling policy by 25%, the time-optimization by 35% and the time-robustness by 40%. As a result, one deployment suggestion according to the specified trade-off criterion is provided.
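For illustration, such a weighted selection could be computed as sketched below with the example weights of 25 %, 35 % and 40 %; the candidate metrics, their units and the min-max normalisation are assumptions, and the robustness value is taken here to be the spread of the overall time (lower is better).

```python
# Weighted trade-off scoring over candidate schedules (illustrative values).
WEIGHTS = {"cores": 0.25, "time_ms": 0.35, "robustness_ms": 0.40}

def weighted_score(candidate, bounds):
    """Lower is better; every dimension is normalised to [0, 1] first."""
    score = 0.0
    for dim, weight in WEIGHTS.items():
        lo, hi = bounds[dim]
        score += weight * ((candidate[dim] - lo) / (hi - lo) if hi > lo else 0.0)
    return score

candidates = [
    {"cores": 8, "time_ms": 1850, "robustness_ms": 120},   # e.g. the Fig. 2 deployment
    {"cores": 6, "time_ms": 2100, "robustness_ms": 80},    # slower but more robust option
]
bounds = {dim: (min(c[dim] for c in candidates), max(c[dim] for c in candidates))
          for dim in WEIGHTS}
print(min(candidates, key=lambda c: weighted_score(c, bounds)))
```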
On the other hand, if the number or the range of available cores is not restricted, the result will be a best-possible time-deployment suggestion. Another option is to focus on the scheduling suggestion estimated as the best according to the results of the analysis of the repeatedly exploited optimization methods. The aim could be to retrieve suggestions about which tasks are worth being split into sub-tasks in order to further improve the overall operation time. For instance, when the rate of incoming data blocks cannot be met by the operation time of the selected algorithm, it is appropriate to split further tasks into subtasks. With the method according to the invention it is possible to gather information about which tasks are worth the effort of splitting them into subtasks.
Trade-off criteria allow certain aspects of the proposed scheduling solutions to be stressed, like a predictable (not necessarily fast) and time-robust execution with a limited number of cores versus the fastest operation time possible with as many cores as needed. The invention provides a powerful mechanism to assist software developers in making an informed and automatic decision on the right scheduling solution.

Claims

Patent Claims
1. A method for providing a schedule for a predictable operation of an algorithm (ALG) on a multi-core processor comprising a plurality of parallel working cores (P1,...,P8), comprising the steps of creating a model of the algorithm (ALG), thereby identifying tasks (t1,...,t22) of the algorithm (ALG) and at least one characteristic of each of the tasks (t1,...,t22); exploiting an optimization method taking account of a first optimization criterion, thereby assigning each identified task (t1,...,t22) according to its at least one characteristic to at least some of the plurality of cores (P1,...,P8) of the multi-core processor and determining a starting point in time for each of the tasks (t1,...,t22) of the algorithm (ALG); repeatedly exploiting the optimization method taking account of at least one second optimization criterion, thereby outputting the starting point in time for each of the tasks (t1,...,t22), the first and the at least one second optimization criterion; and analysing each schedule of operation of the algorithm (ALG) with regard to the starting point in time for each of the tasks (t1,...,t22), the first and the at least one second optimization criterion to determine the best combination of the first and the at least one second optimization criteria.
2. Method according to claim 1, wherein the first optimization criterion corresponds to a time-optimization of the algorithm (ALG).
3. Method according to claim 1 or 2, wherein the at least one second optimization criterion corresponds to a fixed number of cores (P1,...,P8), and/or a discrete range of cores (P1,...,P8), and/or a time-robustness of the schedule, and/or parallelization options of certain tasks (t1,...,t22).
4. Method according to claim 3, wherein only some of the second optimization criteria are considered during exploiting the optimization method repeatedly.
5. Method according to one of the preceding claims, wherein different second optimization criteria are considered during exploiting each run of the optimization method.
6. Method according to one of the preceding claims, wherein an optimization algorithm or a heuristic is used as an optimization method.
7. Method according to one of the preceding claims, wherein the first and/or the at least one second optimization criterion gets weighted with a respective weight parameter.
8. Method according to one of the preceding claims, wherein the first optimization criterion is specified as a range.
9. Method according to one of the preceding claims, wherein the at least one second optimization criterion is specified as a range.
10. Method according to one of the preceding claims, wherein the analysis of each schedule of operation of the algorithm (ALG) ends up with one or more suggestions of the first and the at least one second optimization criterion.
11. Computer program product directly loadable into the internal memory of a digital computer, comprising software code portions for performing the steps of one of the preceding claims when said product is run on a computer.
12. A device for providing a schedule for a predictable operation of an algorithm (ALG) on a multi-core processor comprising a plurality of parallel working cores (P1,...,P8), comprising a first means for creating a model of the algorithm (ALG), thereby identifying tasks (t1,...,t22) of the algorithm (ALG) and at least one characteristic of each of the tasks (t1,...,t22); a second means for exploiting an optimization method taking account of a first optimization criterion, thereby assigning each identified task (t1,...,t22) according to its at least one characteristic to at least some of the plurality of cores (P1,...,P8) of the multi-core processor and for determining a starting point in time for each of the tasks (t1,...,t22) of the algorithm (ALG); a third means for repeatedly exploiting the optimization method taking account of at least one second optimization criterion, thereby outputting the starting point in time for each of the tasks (t1,...,t22), the first and the at least one second optimization criterion; and a fourth means for analysing each schedule of operation of the algorithm (ALG) with regard to the starting point in time for each of the tasks (t1,...,t22), the first and the at least one second optimization criterion to determine the best combination of the first and the at least one second optimization criteria.
13. Device according to claim 12, comprising further means for executing the method steps according to claims 2 to 10.
PCT/EP2008/055907 2007-06-05 2008-05-14 Method and device for providing a schedule for a predictable operation of an algorithm on a multi-core processor WO2008148624A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP07011069 2007-06-05
EP07011069.7 2007-06-05

Publications (1)

Publication Number Publication Date
WO2008148624A1 true WO2008148624A1 (en) 2008-12-11

Family

ID=39734909

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/055907 WO2008148624A1 (en) 2007-06-05 2008-05-14 Method and device for providing a schedule for a predictable operation of an algorithm on a multi-core processor

Country Status (1)

Country Link
WO (1) WO2008148624A1 (en)


Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
COFFMAN E G ET AL: "AN APPLICATION OF BIN-PACKING TO MULTIPROCESSOR SCHEDULING", SIAM JOURNAL ON COMPUTING, SOCIETY FOR INDUSTRIAL AND APPLIED MATHEMATICS, US, 1 January 1979 (1979-01-01), pages 1 - 17, XP008077029, ISSN: 0097-5397 *
COLES J: "A User Interface for DAG Scheduling Algorithms", MANCHESTER UNIVERSITY, vol. -, no. -, 2 May 2007 (2007-05-02), pages 1 - 31, XP002495491 *
JUAN R PIMENTEL ED - JUAN R PIMENTEL: "An Incremental Approach to Task and Message Scheduling for AUTOSAR Based Distributed Automotive Applications", SOFTWARE ENGINEERING FOR AUTOMOTIVE SYSTEMS, 2007. ICSE WORKSHOPS SEAS '07. FOURTH INTERNATIONAL WORKSHOP ON, IEEE, PI, 1 May 2007 (2007-05-01), pages 1 - 1, XP031175845, ISBN: 978-0-7695-2968-4 *
RONNGREN S ET AL: "Static multiprocessor scheduling of periodic real-time tasks with precedence constraints and communication costs", SYSTEM SCIENCES, 1995. VOL. III,. PROCEEDINGS OF THE TWENTY-EIGHTH HAW AII INTERNATIONAL CONFERENCE ON WAILEA, HI, USA 3-6 JAN. 1995, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, 3 January 1995 (1995-01-03), pages 143 - 152, XP010128247, ISBN: 978-0-8186-6935-4 *
YU-KWONG KWOK ET AL: "Static scheduling algorithms for allocating directed task graphs to multiprocessors", ACM COMPUTING SURVEYS, ACM, NEW YORK, NY, US, US, vol. 31, no. 4, 1 December 1999 (1999-12-01), pages 406 - 471, XP002461554, ISSN: 0360-0300 *
ZHENG W: "Workflow Scheduling Simulation Demo (Screenshots)", MANCHESTER UNIVERSITY, vol. -, no. -, September 2006 (2006-09-01), pages 1 - 12, XP002495492, Retrieved from the Internet <URL:http://www.cs.man.ac.uk/~zhengw/demo/demo.html> *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130312001A1 (en) * 2010-10-28 2013-11-21 Noriaki Suzuki Task allocation optimization system, task allocation optimization method, and non-transitory computer readable medium storing task allocation optimization program
US9384053B2 (en) * 2010-10-28 2016-07-05 Nec Corporation Task allocation optimization system, task allocation optimization method, and non-transitory computer readable medium storing task allocation optimization program
CN102193826A (en) * 2011-05-24 2011-09-21 哈尔滨工程大学 Method for high-efficiency task scheduling of heterogeneous multi-core processor
CN102193826B (en) * 2011-05-24 2012-12-19 哈尔滨工程大学 Method for high-efficiency task scheduling of heterogeneous multi-core processor
US20140344825A1 (en) * 2011-12-19 2014-11-20 Nec Corporation Task allocation optimizing system, task allocation optimizing method and task allocation optimizing program
US9535757B2 (en) * 2011-12-19 2017-01-03 Nec Corporation Task allocation optimizing system, task allocation optimizing method and task allocation optimizing program
WO2015004207A1 (en) * 2013-07-10 2015-01-15 Thales Method for optimising the parallel processing of data on a hardware platform
US10120717B2 (en) 2013-07-10 2018-11-06 Thales Method for optimizing the size of a data subset of a processing space for improved execution performance


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08759589

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08759589

Country of ref document: EP

Kind code of ref document: A1