CN104281495A - Method for task scheduling of shared cache of multi-core processor - Google Patents
Method for task scheduling of shared cache of multi-core processor
- Publication number: CN104281495A (application number CN201410537569.1A)
- Authority
- CN
- China
- Prior art keywords
- task
- shared cache
- core
- process core
- cache block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a method for task scheduling on the shared cache of a multi-core processor. The method includes: firstly, dividing the shared Cache of the multi-core processor into a number of blocks and initializing the processing-core-related parameters; secondly, for each task in the task queue of the multi-core processor and for each number of shared Cache blocks the task may require, computing the earliest finish time of the task on every processing core; thirdly, judging whether any task/processing-core pair satisfies the task's shared-Cache-block demand; fourthly, finding the optimal schedulable task/processing-core pair, scheduling the task onto the corresponding processing core to be executed, and updating the related parameters of the multi-core processor; fifthly, judging whether all tasks in the task queue have been scheduled, and if so outputting the sequence of task/processing-core pairs, otherwise repeating the second, third and fourth steps. Compared with existing task scheduling approaches for multi-core processors, the method offers performance advantages such as shorter scheduling length and shorter average response time.
Description
Technical field
The invention belongs to the field of computer software, chip multi-core processor resource management and task scheduling techniques, and relates to a task scheduling method that takes the shared cache (Cache) into account.
Background technology
In recent years, as the integration density and clock frequency of very large scale integrated circuits have kept improving, integrated circuit technology has run into insurmountable physical-limit challenges such as interconnect delay, short-channel effects, drift-velocity saturation and hot-carrier degradation. These challenges raise manufacturing cost, power consumption and heat dissipation problems for single-core processors, prompting chip manufacturers to turn to multi-core processors that integrate multiple processor cores on one chip. At present, commercial multi-core processors integrating up to tens of cores, such as the 12-core Intel Xeon E5 and the AMD Opteron, are widely used in large-scale computing service fields such as clusters, data centers and cloud computing.
The most critical structural feature of a chip multi-core processor is that its processor cores are not self-contained: they are interconnected through a number of shared resources, including the shared cache (Cache) and the memory access channels. Because of this architectural feature, when multiple user applications or parallel threads execute on a multi-core processor, the applications or threads still interfere with one another through the shared resources, such as the shared cache, even when there is no cross-application resource or communication demand. Because multiple applications or threads share the last-level Cache (L2 or L3) of the chip multi-core processor, they compete and conflict over Cache resources, degrading the parallel performance of the multi-core processor.
The performance degradation caused by the shared Cache of multi-core processors has long been a research hotspot at home and abroad. The present invention adopts a software task scheduling strategy to overcome this problem, attempting to mitigate the interference caused by resource competition through reasonable task assignment and regulation, thereby effectively reducing the loss in parallel performance.
The task scheduling problem is in essence a combinatorial optimization problem, and finding the optimal solution of a combinatorial optimization problem is NP-complete; in particular, the requirement of the present invention that each task's shared-Cache-block demand be satisfied makes the problem all the more so. In practice, the cost of solving an NP-complete problem exactly is too large.
Summary of the invention
Aimed at the phenomenon that parallel threads on a multi-core processor conflict with one another through competition for the shared second- or third-level cache (Cache), degrading the processor's parallel performance, the present invention proposes a task scheduling method driven by the shared Cache.
To achieve the above technical purpose, the technical solution adopted by the present invention is as follows:
A multi-core processor shared-cache task scheduling method, the method comprising the following steps:
Step 1: divide the shared cache (Cache) of the multi-core processor system into Cache blocks: first partition the shared Cache by column address space into a number of Cache pages, then group the Cache pages into Cache blocks;
Step 2: initialize the earliest start time of each task, the earliest finish time of each processing core, the number of shared Cache blocks held by each processing core, and the number of shared Cache blocks available to the system;
Step 3: for each task in the task queue of the multi-core processor system, according to the number of shared Cache blocks it requires during execution, examine every processing core in the system: if the number of shared Cache blocks available to the system plus the number of shared Cache blocks held by the core is not less than the number the task requires, compute the earliest finish time of the task on that core, otherwise skip the computation; after traversing all processing cores, move on to the next task, until all tasks have been examined;
Step 4: based on the results of Step 3, judge whether any processing core can execute a task in the queue, i.e. whether an earliest finish time was computed for some task on some core; if so, go to Step 6, otherwise go to Step 5;
Step 5: query the finish times of the tasks the processing cores are currently executing and find the core whose remaining work will finish soonest; update that core's finish time so that it is no longer the earliest among all cores, wait for the core to finish its task, and then release the shared Cache blocks it holds: the number of shared Cache blocks available to the system becomes the original number plus the number of blocks the core held, the number of blocks held by the core is set to 0, and the method goes to Step 7;
Step 6: among the earliest finish times of the tasks on their candidate processing cores obtained in Step 3, find the overall earliest finish time and the corresponding task v_i and processing core p_k; the system assigns task v_i to core p_k, updates the earliest finish time of core p_k to the earliest finish time of task v_i on core p_k, updates the number of shared Cache blocks held by core p_k to the number it originally held plus the number of shared Cache blocks task v_i requires, and updates the number of available shared Cache blocks of the multi-core processor to the original number minus the number the task requires; go to Step 7;
Step 7: query whether any task in the task queue is still waiting to be scheduled; if none remains, output the sequence of task/processing-core pairs, otherwise return to Step 3, recompute the earliest finish times of all tasks on the processing cores, and loop until all tasks have been scheduled.
In the described multi-core processor shared-cache task scheduling method, in Step 1 the size of a Cache page is 512 B.
In the described method, in Step 1 the capacity of a Cache block = shared Cache capacity / (number of processor cores × 10).
In the described method, in Step 3 each task in the task queue of the multi-core processor system submits, at submission time, the numbers of Cache blocks it requires together with the corresponding execution times.
In the described method, in Step 5, after the processing core whose current task will finish soonest has been found, its finish time is updated to the third-earliest finish time among all processing cores.
In the described method, in Step 7, querying whether tasks are still waiting to be scheduled means checking whether the task queue is empty.
The technical effect of the present invention is that a heuristic scheduling strategy provides a reasonable solution space that improves the multi-core processor's ability to execute tasks concurrently and boosts processor performance; compared with existing task scheduling approaches for multi-core processors, the method offers performance advantages such as shorter scheduling length and shorter average response time.
Brief description of the drawings
Fig. 1 is the flow chart of the multi-core processor shared-cache task scheduling method provided by the invention;
Fig. 2 is an example of the multi-core processor cache (Cache) hierarchy used by the embodiment of the invention;
Fig. 3 shows the experimental results for 60 to 200 tasks on a 4-core processor with 28 shared Cache blocks.
Embodiment
The method of the invention is described in detail below with reference to the drawings and embodiments.
The present invention proposes a task scheduling method driven by the shared Cache of a multi-core processor; its flow chart is shown in Fig. 1. The method makes full use of the multi-core processor's Cache to realize an efficient task scheduling mechanism, thereby improving the concurrent processing performance of the multi-core processor.
The present invention is achieved through the following technical solutions:
The present embodiment targets concurrent application tasks that are mutually independent: the tasks have no data or control dependences on one another, and each can run on its own data set. Because the tasks compete for the shared Cache of the multi-core processor, however, task data cannot be loaded into the Cache efficiently, and processor performance degrades. In the present invention the execution time of each task is expressed relative to the number of shared Cache blocks it holds: w_{i,j}, where i denotes task v_i and j denotes the number of shared Cache blocks task v_i holds during execution. For example, w_{1,6}=19.3 means task v_1 executes in 19.3 s when granted 6 shared Cache blocks, and w_{1,10}=17.5 means task v_1 executes in 17.5 s when granted 10 blocks. In the present embodiment, within a certain range, the more shared Cache blocks a task obtains the shorter its execution time; beyond that range, additional shared Cache blocks no longer affect the execution time.
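For illustration, such a profile can be written down directly; only the w_{1,6} and w_{1,10} entries below come from the text, the remaining values are invented to show the saturating trend:

```python
# Hypothetical execution-time profile for task v_1, keyed by the number of
# shared Cache blocks granted. Only the 6- and 10-block values are from the
# description; the others are illustrative.
w1 = {6: 19.3, 8: 18.2, 10: 17.5, 12: 17.5, 14: 17.5}

# More blocks help only up to a point, after which the time stays flat.
assert w1[6] > w1[10]
assert w1[10] == w1[14]
```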
Because a multi-core processor integrates multiple processing cores on a single chip, it relies heavily on resource-sharing techniques, and the shared cache (Cache) is its most important shared resource. Fig. 2 shows an AMD multi-core processor in which each processing core has its own local L1 and L2 Cache but all cores share the L3 Cache. In the present embodiment, p_k denotes the k-th processing core of the multi-core processor.
The present embodiment applies a software-supported Cache block division method to the shared Cache. Its basic idea is to use the OS page-allocation mechanism to control how much Cache a task uses. First, using the OS of the multi-core processor, the shared Cache is divided by column address space into minimal pages; then, according to the number of processor cores, the shared Cache is divided into blocks. For example, a 4 MB shared Cache on a four-core processor can be divided into 40 Cache blocks. The purpose is to let the scheduling algorithm account for the Cache blocks each task requires and, with OS software support, use set associativity to map data and code effectively onto the shared Cache.
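As a numeric sanity check (not part of the patent text itself), the division can be sketched; the 512 B page size and the cores × 10 block count follow the figures stated elsewhere in the description:

```python
PAGE_SIZE = 512  # bytes per Cache page, per the description

def divide_shared_cache(cache_bytes: int, num_cores: int):
    """Return (number of Cache blocks, pages per block) for the shared Cache."""
    num_blocks = num_cores * 10          # e.g. 4 cores -> 40 blocks
    pages = cache_bytes // PAGE_SIZE     # pages carved out by column address
    return num_blocks, pages // num_blocks

# The 4 MB / four-core example from the text: 40 blocks of 204 pages each.
print(divide_shared_cache(4 * 1024 * 1024, 4))  # -> (40, 204)
```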
The multi-core processor shared-cache task scheduling method first initializes its parameters, assigning initial values as follows. EST(v_i, p_k)=0: the initial start time of task v_i on processing core p_k is 0. FT(p_k)=0: the earliest finish time of the tasks currently assigned to processing core p_k is 0. pSCache[k]=0: every processing core initially holds 0 shared Cache blocks. AvailCache=MaxCache: the processor's available shared Cache blocks start at the system maximum, e.g. the 40 Cache blocks of the example above.
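A minimal sketch of this initialization, assuming the 4-core, 40-block example; the names FT, pSCache, AvailCache and MaxCache mirror the text:

```python
# Parameter initialization sketch (4 cores, 40 blocks assumed).
NUM_CORES, MaxCache = 4, 40

FT = [0.0] * NUM_CORES      # FT(p_k): earliest finish time of core k
pSCache = [0] * NUM_CORES   # pSCache[k]: shared Cache blocks held by core k
AvailCache = MaxCache       # blocks not yet handed to any core
# EST(v_i, p_k) = 0 for every task initially, so it needs no per-task storage.
```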
For each task in the task queue of the multi-core processor system, according to its demand for different numbers of shared Cache blocks, the present embodiment evaluates each processing core in turn and computes the task's earliest finish time there, using the following formula:

EFT(v_i, p_k) = EST(v_i, p_k) + ET(v_i, j, p_k) = FT(p_k) + w_{i,j}

subject to j ≤ AvailCache + pSCache[k]
When the shared Cache blocks the multi-core processor can provide, AvailCache + pSCache[k], satisfy the task's demand, the earliest finish time EFT(v_i, p_k) of the task on that processing core is computed; it is simply the sum of the earliest finish time FT(p_k) of the tasks already assigned to that core and the task's execution time w_{i,j}. The present embodiment repeats this computation until EFT(v_i, p_k) has been evaluated for every task, every admissible shared-Cache-block count of each task, and every processing core.
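This feasibility-constrained enumeration can be sketched as follows; the {j: w_ij} dictionary format for task profiles is an assumption, not something the patent specifies:

```python
def earliest_finish_times(pending, FT, pSCache, AvailCache):
    """Enumerate EFT(v_i, p_k) = FT(p_k) + w_{i,j} for every feasible triple.

    `pending` maps a task id to its profile {j: w_ij} (execution time when
    granted j shared Cache blocks) -- an assumed input format.
    """
    return [(FT[k] + w, tid, k, j)
            for tid, prof in pending.items()
            for j, w in prof.items()
            for k in range(len(FT))
            if j <= AvailCache + pSCache[k]]  # the constraint of the formula
```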
After the multi-core processor has assigned some tasks to the processing cores for execution, the number of available shared Cache blocks AvailCache decreases. In some cases the condition j ≤ AvailCache + pSCache[k] cannot be met for any candidate, and the resource management system of the multi-core processor must then release some processing cores. The scheduling method queries the earliest finish times FT(p_k) of the current tasks on all processing cores, finds the core p_m with the minimum earliest finish time, and updates the processor parameters: FT(p_m) is set to the third-earliest finish time among all cores; the available shared Cache blocks are updated to AvailCache + pSCache[m]; and the shared Cache blocks held by core p_m, pSCache[m], are set to 0. Once all system parameters have been updated, the method returns to the previous step and recomputes the earliest finish times of all tasks under the different shared-Cache-block counts and processing cores.
If some task satisfies the condition and can be dispatched to a processing core, the task scheduling method of the present embodiment examines all candidate earliest finish times and selects the minimum EFT(v_i, p_k). The system then assigns task v_i to processing core p_k and updates the processor parameters: the earliest finish time of core p_k becomes FT(p_k) = EFT(v_i, p_k); the shared Cache blocks held by core p_k become pSCache[k] = pSCache[k] + j; and the processor's available shared Cache blocks become AvailCache = AvailCache − j.
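The three parameter updates of the dispatch step can be sketched as below; the (EFT, task, core, j) tuple format for candidates is an assumed convention, not the patent's own:

```python
def dispatch(candidates, FT, pSCache, AvailCache):
    """Pick the overall minimum EFT candidate and apply the updates.

    `candidates` is a list of (eft, task_id, core, j) tuples (assumed format).
    """
    eft, tid, k, j = min(candidates)  # smallest earliest finish time wins
    FT[k] = eft                       # FT(p_k) = EFT(v_i, p_k)
    pSCache[k] += j                   # core p_k now holds j more blocks
    return tid, k, AvailCache - j     # AvailCache = AvailCache - j
```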
The shared-cache task scheduling method of the present embodiment repeats the steps above until all tasks in the task queue of the multi-core processor have been scheduled.
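Putting the pieces together, one possible reading of the whole loop is sketched below; it is an illustration under stated assumptions, not the patented implementation. Task profiles are assumed given as {j: w_ij} dictionaries, every task is assumed to have some feasible block count not exceeding the total, and the Step 5 release is simplified to reclaiming the soonest-finishing core's blocks:

```python
def scas_schedule(tasks, num_cores, max_cache):
    """Sketch of the scheduling loop. tasks: {task_id: {j: w_ij}}.
    Returns the task/processing-core pairs in dispatch order.
    Assumes every task has a demand j <= max_cache, else the loop stalls."""
    FT = [0.0] * num_cores   # earliest finish time per core
    held = [0] * num_cores   # shared Cache blocks held per core
    avail = max_cache        # blocks not held by any core
    pending = dict(tasks)
    schedule = []
    while pending:
        # Enumerate EFT for every feasible (task, block count, core) triple.
        cands = [(FT[k] + w, tid, k, j)
                 for tid, prof in pending.items()
                 for j, w in prof.items()
                 for k in range(num_cores)
                 if j <= avail + held[k]]
        if not cands:
            # No dispatchable task: release the core finishing soonest.
            m = min(range(num_cores), key=FT.__getitem__)
            avail += held[m]
            held[m] = 0
            continue
        # Dispatch the candidate with the overall earliest finish time.
        eft, tid, k, j = min(cands)
        FT[k] = eft
        held[k] += j
        avail -= j
        del pending[tid]
        schedule.append((tid, k))
    return schedule
```

For example, with two cores, four blocks, and tasks {1: {2: 5.0}, 2: {1: 3.0}}, the sketch dispatches task 2 to core 0 first (finish time 3.0) and then task 1 to core 1 (finish time 5.0).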
Performance evaluation and result verification:
The multi-core processor shared-cache task scheduling method of the present embodiment is a low-time-complexity, high-efficiency scheduling technique, with time complexity O(N²ML), where N is the number of tasks, M is the number of processing cores of the multi-core processor, and L is the maximum number of distinct shared-Cache-block demands over all tasks.
In the simulation experiments, the proposed multi-core processor shared-cache task scheduling method is named SCAS and is compared mainly with the classical task scheduling algorithm MIN-MIN. To better understand the performance of SCAS, the experiments also evaluate a modified variant, named MSCAS, which selects the task/processing-core pair with the maximum earliest finish time instead of the minimum. The performance metrics are scheduling length (makespan) and average response time.
All experimental data are averages over many runs. Fig. 3 shows the results for 60 to 200 tasks on a 4-core processor with 28 shared Cache blocks. As can be seen from Fig. 3, SCAS outperforms both MIN-MIN and MSCAS in scheduling length as well as average response time. Specifically, the average scheduling length of SCAS is 16.1% shorter than MSCAS and 8.37% shorter than MIN-MIN; in average response time, SCAS is 149% better than MSCAS and 10.5% better than MIN-MIN. These results show that the shared-cache task scheduling method effectively improves multi-core processor performance.
In summary, the multi-core processor shared-cache task scheduling method proposed by the present invention overcomes the parallel performance degradation caused by the shared Cache in existing multi-core processors and effectively improves processor performance.
Although the content of the present invention has been described in detail through the above embodiments, the above description should not be considered a limitation of the present invention. After reading the foregoing, various modifications and substitutions of the present invention will be apparent to those skilled in the art. Therefore, the protection scope of the present invention should be limited by the appended claims.
Claims (6)
1. A multi-core processor shared-cache task scheduling method, characterized in that the method comprises the following steps:
Step 1: divide the shared cache (Cache) of the multi-core processor system into Cache blocks: first partition the shared Cache by column address space into a number of Cache pages, then group the Cache pages into Cache blocks;
Step 2: initialize the earliest start time of each task, the earliest finish time of each processing core, the number of shared Cache blocks held by each processing core, and the number of shared Cache blocks available to the system;
Step 3: for each task in the task queue of the multi-core processor system, according to the number of shared Cache blocks it requires during execution, examine every processing core in the system: if the number of shared Cache blocks available to the system plus the number of shared Cache blocks held by the core is not less than the number the task requires, compute the earliest finish time of the task on that core, otherwise skip the computation; after traversing all processing cores, move on to the next task, until all tasks have been examined;
Step 4: based on the results of Step 3, judge whether any processing core can execute a task in the queue, i.e. whether an earliest finish time was computed for some task on some core; if so, go to Step 6, otherwise go to Step 5;
Step 5: query the finish times of the tasks the processing cores are currently executing and find the core whose remaining work will finish soonest; update that core's finish time so that it is no longer the earliest among all cores, wait for the core to finish its task, and then release the shared Cache blocks it holds: the number of shared Cache blocks available to the system becomes the original number plus the number of blocks the core held, the number of blocks held by the core is set to 0, and the method goes to Step 7;
Step 6: among the earliest finish times of the tasks on their candidate processing cores obtained in Step 3, find the overall earliest finish time and the corresponding task v_i and processing core p_k; the system assigns task v_i to core p_k, updates the earliest finish time of core p_k to the earliest finish time of task v_i on core p_k, updates the number of shared Cache blocks held by core p_k to the number it originally held plus the number of shared Cache blocks task v_i requires, and updates the number of available shared Cache blocks of the multi-core processor to the original number minus the number the task requires; go to Step 7;
Step 7: query whether any task in the task queue is still waiting to be scheduled; if none remains, output the sequence of task/processing-core pairs, otherwise return to Step 3, recompute the earliest finish times of all tasks on the processing cores, and loop until all tasks have been scheduled.
2. The multi-core processor shared-cache task scheduling method according to claim 1, characterized in that, in Step 1, the size of a Cache page is 512 B.
3. The multi-core processor shared-cache task scheduling method according to claim 1, characterized in that, in Step 1, the capacity of a Cache block = shared Cache capacity / (number of processor cores × 10).
4. The multi-core processor shared-cache task scheduling method according to claim 1, characterized in that, in Step 3, each task in the task queue of the multi-core processor system submits, at submission time, the numbers of Cache blocks it requires together with the corresponding execution times.
5. The multi-core processor shared-cache task scheduling method according to claim 1, characterized in that, in Step 5, after the processing core whose current task will finish soonest has been found, its finish time is updated to the third-earliest finish time among all processing cores.
6. The multi-core processor shared-cache task scheduling method according to claim 1, characterized in that, in Step 7, querying whether tasks in the task queue are still waiting to be scheduled means checking whether the task queue is empty.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410537569.1A CN104281495B (en) | 2014-10-13 | 2014-10-13 | Method for task scheduling of shared cache of multi-core processor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104281495A true CN104281495A (en) | 2015-01-14 |
CN104281495B CN104281495B (en) | 2017-04-26 |
Family
ID=52256396
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410537569.1A Expired - Fee Related CN104281495B (en) | 2014-10-13 | 2014-10-13 | Method for task scheduling of shared cache of multi-core processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104281495B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010093003A1 (en) * | 2009-02-13 | 2010-08-19 | 日本電気株式会社 | Calculation resource allocation device, calculation resource allocation method, and calculation resource allocation program |
JP2011059777A (en) * | 2009-09-07 | 2011-03-24 | Toshiba Corp | Task scheduling method and multi-core system |
CN102591843A (en) * | 2011-12-30 | 2012-07-18 | 中国科学技术大学苏州研究院 | Inter-core communication method for multi-core processor |
CN103440173A (en) * | 2013-08-23 | 2013-12-11 | 华为技术有限公司 | Scheduling method and related devices of multi-core processors |
CN103440223A (en) * | 2013-08-29 | 2013-12-11 | 西安电子科技大学 | Layering system for achieving caching consistency protocol and method thereof |
- 2014-10-13: application CN201410537569.1A filed; granted as CN104281495B; status not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
Chen Hongming, Lin Changzhi, Chen Qi'an: "Homogeneous single-chip multiprocessor based on NUCA architecture", China Integrated Circuit *
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107223239A (en) * | 2015-02-17 | 2017-09-29 | 高通股份有限公司 | Sacrificed for improving(Victim)The processing scheduling of cache mode |
CN105468567B (en) * | 2015-11-24 | 2018-02-06 | 无锡江南计算技术研究所 | A kind of discrete memory access optimization method of isomery many-core |
CN105468567A (en) * | 2015-11-24 | 2016-04-06 | 无锡江南计算技术研究所 | Isomerism many-core discrete memory access optimization method |
CN107329822A (en) * | 2017-01-15 | 2017-11-07 | 齐德昱 | Towards the multi-core dispatching method based on super Task Network of multi-source multiple nucleus system |
CN107329822B (en) * | 2017-01-15 | 2022-01-28 | 齐德昱 | Multi-core scheduling method based on hyper task network and oriented to multi-source multi-core system |
CN106789447B (en) * | 2017-02-20 | 2019-11-26 | 成都欧飞凌通讯技术有限公司 | The not method of packet loss is realized when super finite automata figure changes in a kind of multicore |
CN106789447A (en) * | 2017-02-20 | 2017-05-31 | 成都欧飞凌通讯技术有限公司 | A kind of not method of packet loss when realizing the change of super finite automata figure in multinuclear |
CN109766168A (en) * | 2017-11-09 | 2019-05-17 | 阿里巴巴集团控股有限公司 | Method for scheduling task and device, storage medium and calculating equipment |
CN109766168B (en) * | 2017-11-09 | 2023-01-17 | 阿里巴巴集团控股有限公司 | Task scheduling method and device, storage medium and computing equipment |
WO2020001295A1 (en) * | 2018-06-27 | 2020-01-02 | The Hong Kong Polytechnic University | Client-server architecture for multicore computer system to realize single-core-equivalent view |
CN112352225A (en) * | 2018-06-27 | 2021-02-09 | 香港理工大学 | Client-server architecture for multi-core computer system implementing single-core equivalent look and feel |
CN109144720A (en) * | 2018-07-13 | 2019-01-04 | 哈尔滨工程大学 | A kind of multi-core processor task schedule selection method based on shared resource sensitivity |
CN111582629A (en) * | 2020-03-24 | 2020-08-25 | 青岛奥利普自动化控制系统有限公司 | Resource scheduling method, device, equipment and storage medium |
CN111582629B (en) * | 2020-03-24 | 2023-11-17 | 青岛奥利普奇智智能工业技术有限公司 | Resource scheduling method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN104281495B (en) | 2017-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104281495A (en) | Method for task scheduling of shared cache of multi-core processor | |
CN103226467B (en) | Data parallel processing method, system and load balance scheduler | |
US6757892B1 (en) | Method for determining an optimal partitioning of data among several memories | |
CN102866912A (en) | Single-instruction-set heterogeneous multi-core system static task scheduling method | |
US20210191765A1 (en) | Method for static scheduling of artificial neural networks for a processor | |
US8997071B2 (en) | Optimized division of work among processors in a heterogeneous processing system | |
US8527988B1 (en) | Proximity mapping of virtual-machine threads to processors | |
CN109634742B (en) | Time constraint scientific workflow optimization method based on ant colony algorithm | |
CN105468439B (en) | The self-adaptive parallel method of neighbours in radii fixus is traversed under CPU-GPU isomery frame | |
US10114866B2 (en) | Memory-constrained aggregation using intra-operator pipelining | |
CN103473120A (en) | Acceleration-factor-based multi-core real-time system task partitioning method | |
CN114730275A (en) | Method and apparatus for vectorized resource scheduling in a distributed computing system using tensor | |
Gandomi et al. | HybSMRP: a hybrid scheduling algorithm in Hadoop MapReduce framework | |
Li et al. | An effective scheduling strategy based on hypergraph partition in geographically distributed datacenters | |
CN108132834B (en) | Task allocation method and system under multi-level shared cache architecture | |
CN108733491B (en) | Thermal sensing and low-energy-consumption task scheduling method for heterogeneous MPSoC system | |
Tang et al. | Optimizing and auto-tuning iterative stencil loops for GPUs with the in-plane method | |
CN108108242B (en) | Storage layer intelligent distribution control method based on big data | |
Kim et al. | Las: locality-aware scheduling for GEMM-accelerated convolutions in GPUs | |
JP5983623B2 (en) | Task placement apparatus and task placement method | |
CN115480902A (en) | Multi-core processor energy-saving scheduling method, device and medium associated with periodic tasks | |
CN100414505C (en) | Offset distribution optimizing method based on combination parallel algorithm | |
Eleliemy et al. | Dynamic loop scheduling using MPI passive-target remote memory access | |
CN114168311A (en) | Computing device and processor-implemented method | |
Deniziak et al. | Synthesis of power aware adaptive schedulers for embedded systems using developmental genetic programming |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170426; Termination date: 20171013 |