CN104281495B - Method for task scheduling of shared cache of multi-core processor

Publication number: CN104281495B (publication of application CN104281495A)
Application number: CN201410537569.1A
Authority: CN (China)
Original language: Chinese (zh)
Inventor: 唐小勇
Original and current assignee: Hunan Agricultural University
Legal status: Expired - Fee Related
Prior art keywords: task, core, shared cache, processing core
Classifications: Memory System Of A Hierarchy Structure; Multi Processors

Abstract

The invention discloses a task scheduling method for the shared cache of a multi-core processor. In the first step, the shared Cache of the multi-core processor is divided into several blocks and the parameters of the processing cores are initialized. In the second step, for each task in the task queue of the multi-core processor and for each number of shared Cache blocks the task may require, the earliest execution completion time of the task on every processing core is computed. In the third step, the method checks whether there is a task/processing-core pair that satisfies the task's shared-Cache-block requirement. In the fourth step, the optimal schedulable task/processing-core pair is found, the task is scheduled onto the corresponding processing core, and the related parameters of the multi-core processor are updated. In the fifth step, the method checks whether all tasks in the task queue have been scheduled; if so, the sequence of task/processing-core pairs is output, otherwise steps two, three, and four are repeated. Compared with existing task scheduling approaches for multi-core processors, the method offers performance advantages such as shorter schedule length and shorter average response time.

Description

Method for task scheduling of shared cache of multi-core processor

Technical Field

The invention belongs to the technical field of computer software and of resource management and task scheduling for on-chip multi-core processors, and relates to a task scheduling method that takes the shared cache (Cache) into account.

Background Art

In recent years, with the continuous increase in the integration density and clock frequency of very-large-scale integrated circuits, integrated circuit technology has run into insurmountable physical limits such as interconnect delay, short-channel effects, drift-velocity saturation, and hot-carrier degradation. These challenges bring manufacturing cost, power consumption, and heat dissipation problems to single-core processor technology, prompting chip manufacturers to turn to multi-core processors that integrate multiple processor cores on a single chip. At present, commercial multi-core processors integrating dozens of cores, such as the twelve-core Intel Xeon E5 and the AMD Opteron, are widely used in large-scale computing services such as clusters, data centers, and cloud computing.

The most critical structural characteristic of an on-chip multi-core processor is that its processor cores do not exist independently but are interconnected through many shared resources, including the cache and the memory access channels. As a result, when multiple user applications or parallel threads execute on a multi-core processor, they interfere with one another through shared resources such as the shared cache, even when there is no cross-application resource or communication requirement. Because multiple applications or threads share the last-level cache (L2 or L3) of the on-chip multi-core processor, they compete for cache resources and conflict with one another, which degrades the processor's concurrent performance.

The performance degradation of multi-core processors caused by shared Cache resources has long been a research focus both in China and abroad. The present invention adopts a software task scheduling strategy to address this problem, aiming to mitigate the interference caused by resource conflicts through reasonable task allocation and regulation and thereby effectively reduce the loss of concurrent performance.

The task scheduling problem is in essence a combinatorial optimization problem, and finding the optimal solution of such a problem is NP-complete; since the present invention must additionally satisfy the tasks' shared-Cache-block requirements, the problem remains NP-complete. In practice, the cost of computing an exact solution to an NP-complete problem is prohibitive.

Summary of the Invention

Aiming at the phenomenon that parallel threads on a multi-core processor conflict over the shared level-2 or level-3 cache, so that resource contention between threads degrades the processor's concurrent performance, the present invention proposes a shared-Cache-driven task scheduling method.

To achieve the above technical objective, the present invention adopts the following technical solution:

A method for task scheduling of the shared cache of a multi-core processor, comprising the following steps:

Step 1: partition the shared cache of the multi-core processor system into Cache blocks. First divide the shared Cache into several Cache pages according to the column address space, and then divide the shared Cache into Cache blocks composed of Cache pages.

Step 2: initialize the earliest start time of each task, the earliest execution completion time of each processing core, the number of shared Cache blocks owned by each processing core, and the number of shared Cache blocks available to the system.

Step 3: for each task in the task queue of the multi-core processor system, examine every processing core in the system according to the number of shared Cache blocks the task requires for execution. If the number of shared Cache blocks available to the system plus the number of shared Cache blocks owned by the processing core is not less than the number required by the task, compute the earliest execution completion time of the task on that processing core; otherwise skip the computation. After all processing cores have been traversed, move on to the next task, until every task has been examined.

Step 4: based on the result of Step 3, determine whether any processing core can execute a task from the task queue, i.e. whether an earliest execution completion time has been computed for any task on any processing core. If so, go to Step 6; otherwise go to Step 5.

Step 5: query the execution completion times of all processing cores for their current tasks and find the processing core whose remaining execution time is shortest. Update this core's execution completion time so that it is no longer the earliest among all processing cores, wait for the core to finish its task, and then release the shared Cache blocks it owns: the number of shared Cache blocks available to the multi-core processor system becomes the original number plus the number owned by this core, and the number owned by this core is set to 0. Go to Step 7.

Step 6: from the earliest execution completion times obtained in Step 3, find the earliest one together with the corresponding task v_i and processing core p_k. The system assigns task v_i to processing core p_k, updates the earliest execution completion time of p_k to the earliest execution completion time of v_i on p_k, updates the number of shared Cache blocks owned by p_k to the number it originally owned plus the number of shared Cache blocks required by v_i, and updates the number of available shared Cache blocks of the multi-core processor to the original number minus the number required by the task. Go to Step 7.

Step 7: check whether any tasks are still waiting to be scheduled in the task queue. If none remain, output the sequence of task/processing-core scheduling pairs; otherwise return to Step 3, recompute the earliest execution completion times of all tasks on the processing cores, and repeat until all tasks have been scheduled.

In the above multi-core processor shared cache task scheduling method, in Step 1, the size of a Cache page is 512 B.

In the above multi-core processor shared cache task scheduling method, in Step 1, the capacity of a Cache block = shared Cache capacity / (number of processor cores × 10).

In the above multi-core processor shared cache task scheduling method, in Step 3, when each task in the task queue of the multi-core processor system is submitted, it also submits the numbers of Cache blocks it can use and the corresponding execution times.

In the above multi-core processor shared cache task scheduling method, in Step 5, after the processing core with the shortest remaining execution completion time has been found, its execution completion time is updated to the third earliest execution completion time among all processing cores.

In the above multi-core processor shared cache task scheduling method, in Step 7, checking whether any task is still waiting to be scheduled in the task queue means checking whether the task queue is empty.
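
Pulling Steps 1 to 7 and the refinements above together, the following is a minimal Python sketch of the scheduling loop, not the patent's own code: it assumes each task is submitted with a table of candidate shared-Cache-block counts and runtimes (as stated for Step 3), and names such as schedule, core_finish, core_blocks, and avail_blocks are illustrative.

    # Minimal sketch of Steps 1-7 (assumes every task has at least one
    # candidate block count j <= max_cache_blocks, otherwise the loop
    # would never find a schedulable pair).
    def schedule(tasks, num_cores, max_cache_blocks):
        """tasks: {task_id: {blocks_needed: runtime}}; returns (task, core) pairs."""
        core_finish = [0.0] * num_cores      # FT(p_k), initialized in Step 2
        core_blocks = [0] * num_cores        # pSCache[k]
        avail_blocks = max_cache_blocks      # AvailCache
        pending = dict(tasks)
        pairs = []

        while pending:                                            # Step 7 loop
            candidates = []                                       # Step 3
            for tid, options in pending.items():
                for j, runtime in options.items():
                    for k in range(num_cores):
                        if j <= avail_blocks + core_blocks[k]:    # Cache constraint
                            eft = core_finish[k] + runtime        # EFT = FT + w_{i,j}
                            candidates.append((eft, tid, k, j))

            if not candidates:                                    # Step 4 -> Step 5
                m = min(range(num_cores), key=lambda k: core_finish[k])
                # third-earliest finish time (guarded for fewer than 3 cores)
                core_finish[m] = sorted(core_finish)[min(2, num_cores - 1)]
                avail_blocks += core_blocks[m]                    # release its blocks
                core_blocks[m] = 0
                continue

            eft, tid, k, j = min(candidates)                      # Step 6: best pair
            core_finish[k] = eft
            core_blocks[k] += j
            avail_blocks -= j
            del pending[tid]
            pairs.append((tid, k))

        return pairs

For example, schedule({1: {6: 19.3, 10: 17.5}, 2: {8: 25.0}}, num_cores=4, max_cache_blocks=28) returns [(1, 0), (2, 1)]: task 1 takes its 10-block option on core 0, after which task 2 still fits on an idle core.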

The technical effect of the present invention is that a heuristic scheduling strategy is used to provide a reasonable solution space, improving the multi-core processor's ability to execute tasks concurrently and thus its overall performance. Compared with existing task scheduling approaches for multi-core processors, the method offers performance advantages such as shorter schedule length and shorter average response time.

Brief Description of the Drawings

Fig. 1 is a flow chart of the multi-core processor shared cache task scheduling method provided by the present invention;

Fig. 2 is an example of the cache hierarchy of a multi-core processor according to an embodiment of the present invention;

Fig. 3 shows the experimental results for 60 to 200 tasks on a 4-core processor with 28 shared Cache blocks.

Detailed Description

The method of the present invention is described in detail below with reference to the accompanying drawings and embodiments.

The present invention proposes a shared-Cache-driven task scheduling method for multi-core processors, whose flow chart is shown in Fig. 1. The method makes full use of the multi-core processor's Cache to implement an efficient task scheduling mechanism, thereby improving the concurrent processing performance of the multi-core processor.

The present invention is implemented by the following technical solution:

This embodiment targets independent parallel application tasks, i.e. tasks that have no data or control dependencies on one another and can run independently on their own data sets. However, contention for the shared Cache of the multi-core processor prevents task data from being loaded into the Cache effectively and degrades processor performance. In the present invention, the execution time of each task with respect to the shared Cache blocks is denoted w_{i,j}, where i identifies task v_i and j is the number of shared Cache blocks the task uses during execution. For example, w_{1,6} = 19.3 means that the execution time of task v_1 with 6 shared Cache blocks is 19.3 s, and w_{1,10} = 17.5 means that its execution time with 10 shared Cache blocks is 17.5 s. In this embodiment, within a certain range, the more shared Cache blocks a task obtains the shorter its execution time becomes; beyond that range, the number of shared Cache blocks no longer affects the execution time.
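
A minimal sketch of this execution-time model follows; the two values for task v_1 (19.3 s with 6 blocks, 17.5 s with 10 blocks) are quoted from the text, while the intermediate 18.2 s entry and the helper name exec_time are illustrative assumptions.

    # Per-task profile w_{i,j}: runtime as a function of shared Cache blocks,
    # flat (saturated) beyond the largest profiled block count.
    w = {
        1: {6: 19.3, 8: 18.2, 10: 17.5},   # task v_1; 18.2 is an illustrative midpoint
    }

    def exec_time(task_id, blocks):
        """Runtime of a task when it is granted `blocks` shared Cache blocks."""
        profile = w[task_id]
        usable = [j for j in profile if j <= blocks]
        if not usable:
            raise ValueError("fewer blocks than the smallest profiled demand")
        return profile[max(usable)]        # more blocks than profiled: time saturates

    print(exec_time(1, 6), exec_time(1, 12))   # 19.3 17.5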

Because a multi-core processor integrates multiple processing cores on a single chip, it relies heavily on resource sharing, of which the cache is the most important shared resource. Fig. 2 shows an AMD multi-core processor: each processing core has its own local L1 and L2 caches, while all cores share the L3 cache. In this embodiment, p_k denotes the k-th processing core of the multi-core processor.

This embodiment proposes a software-supported Cache block partitioning method for the shared Cache. The basic idea is to use the OS page allocation mechanism to control how much Cache a task can use. First, the OS functions of the multi-core processor are used to divide the shared Cache into minimal pages according to the column address space; the shared Cache is then divided into a number of blocks according to the number of processor cores. For example, a quad-core processor with a 4 MB shared Cache can be divided into 40 Cache blocks. The purpose is that, once the scheduling algorithm has computed the Cache blocks a task requires, set-associative mapping, supported by the OS software of the multi-core processor, is used to map the task's data and code effectively into the shared Cache.
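
As a concrete illustration of the sizing rules (the 512 B page size of Step 1 and the capacity/(cores × 10) block rule), the following sketch reproduces the 4 MB quad-core example; the function name is ours, not the patent's.

    PAGE_SIZE = 512  # bytes per Cache page

    def partition_shared_cache(shared_cache_bytes, num_cores):
        block_bytes = shared_cache_bytes // (num_cores * 10)   # block capacity rule
        pages_per_block = block_bytes // PAGE_SIZE
        num_blocks = num_cores * 10                            # e.g. 4 cores -> 40 blocks
        return num_blocks, pages_per_block

    # A 4 MB shared Cache on a quad-core processor gives 40 blocks of ~204 pages each.
    print(partition_shared_cache(4 * 1024 * 1024, 4))          # (40, 204)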

The multi-core processor shared cache task scheduling method first initializes the earliest start time of each task on each processing core, the earliest execution completion time of each processing core, the number of shared Cache blocks owned by each processing core, and the number of shared Cache blocks available to the processor, giving the following parameters their initial values. EST(v_i, p_k) = 0 means that the initial start time of task v_i on processing core p_k is 0; FT(p_k) = 0 means that the earliest execution completion time of the tasks currently assigned to processing core p_k is 0; pSCache[k] = 0 means that every processing core initially owns 0 shared Cache blocks; AvailCache = MaxCache means that the number of shared Cache blocks currently available to the processor equals the maximum number available in the system, e.g. the 40 Cache blocks in the example above.
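
A minimal sketch of this initialization, keeping the symbols used in the text (EST, FT, pSCache, AvailCache); the function name is illustrative.

    def init_scheduler_state(num_tasks, num_cores, max_cache_blocks):
        EST = [[0.0] * num_cores for _ in range(num_tasks)]   # EST(v_i, p_k) = 0
        FT = [0.0] * num_cores                                # FT(p_k) = 0
        pSCache = [0] * num_cores                             # no core owns blocks yet
        AvailCache = max_cache_blocks                         # e.g. 40 in the example
        return EST, FT, pSCache, AvailCache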

For each task in the task queue of the multi-core processor system, according to its differing demands for shared Cache blocks, this embodiment computes the task's earliest execution completion time on each processing core in turn. The computation is based on the following formula:

EFT(v_i, p_k) = EST(v_i, p_k) + ET(v_i, j, p_k)
              = FT(p_k) + w_{i,j}

subject to j ≤ AvailCache + pSCache[k]

When the shared Cache blocks that the multi-core processor can provide, AvailCache + pSCache[k], satisfy the task's shared-Cache-block requirement, the earliest execution completion time EFT(v_i, p_k) of the task on the processing core is computed. The earliest execution completion time of a task is the sum of the earliest execution completion time FT(p_k) of the tasks already assigned to that processing core and the task's execution time w_{i,j}. This embodiment repeats the computation until the earliest execution completion time EFT(v_i, p_k) has been computed for every task, every candidate number of shared Cache blocks of each task, and every processing core.
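
The feasibility test and EFT formula can be sketched as follows; representing the candidate block counts of one task as a dict {j: w_{i,j}} is our assumption, consistent with the earlier sketches.

    def earliest_finish_times(w_i, FT, pSCache, AvailCache):
        """w_i: {j: w_{i,j}} for one task; returns {(core, j): EFT}."""
        eft = {}
        for j, runtime in w_i.items():
            for k, ft_k in enumerate(FT):
                if j <= AvailCache + pSCache[k]:      # shared-Cache constraint
                    eft[(k, j)] = ft_k + runtime      # EFT(v_i,p_k) = FT(p_k) + w_{i,j}
        return eft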

After the multi-core processor has assigned some tasks to the processing cores, the number of available shared Cache blocks AvailCache decreases. In some cases the condition j ≤ AvailCache + pSCache[k] cannot be satisfied for any candidate, and the resource management system of the multi-core processor must then release some processing cores. The scheduling method queries the earliest execution completion time FT(p_k) of every processing core for its current tasks, finds the core p_m with the smallest FT(p_m), and updates the multi-core processor parameters: FT(p_m) is set to the third earliest execution completion time FT(p_k); the available shared Cache blocks AvailCache are updated to AvailCache + pSCache[m]; and the shared Cache blocks pSCache[m] owned by processing core p_m are set to 0. After all system parameters have been updated, the method returns to the previous step and recomputes the earliest execution completion times for all tasks, candidate shared-Cache-block counts, and processing cores.
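
A sketch of this core-release step, applying the "third earliest" rule of claim 5; the guard for processors with fewer than three cores is our addition.

    def release_core(FT, pSCache, AvailCache):
        """Pick the core with the smallest FT, bump its FT, return its blocks."""
        m = min(range(len(FT)), key=lambda k: FT[k])
        FT[m] = sorted(FT)[min(2, len(FT) - 1)]   # third-earliest finish time
        AvailCache += pSCache[m]                  # AvailCache = AvailCache + pSCache[m]
        pSCache[m] = 0
        return m, AvailCache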

If some task satisfies the condition and can be scheduled onto a processing core, the task scheduling method of this embodiment queries all candidate earliest execution completion times and finds the smallest one, EFT(v_i, p_k). The system assigns task v_i to processing core p_k and updates the relevant parameters of the multi-core processor: the earliest execution completion time of p_k becomes FT(p_k) = EFT(v_i, p_k); the shared Cache blocks owned by p_k become pSCache[k] = pSCache[k] + j; and the available shared Cache blocks of the multi-core processor become AvailCache = AvailCache − j.
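
A sketch of this assignment step and its three parameter updates, assuming candidates are collected as (EFT, task, core, j) tuples as in the earlier sketches.

    def assign_best(candidates, FT, pSCache, AvailCache):
        """candidates: list of (EFT, task_id, core, j); applies the updates above."""
        eft, task_id, k, j = min(candidates)
        FT[k] = eft                  # FT(p_k) = EFT(v_i, p_k)
        pSCache[k] += j              # pSCache[k] = pSCache[k] + j
        AvailCache -= j              # AvailCache = AvailCache - j
        return task_id, k, AvailCache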

The shared cache task scheduling method adopted in this embodiment repeats the above steps until all tasks in the task queue of the multi-core processor have been scheduled.

Performance analysis and result verification:

The multi-core processor shared cache task scheduling method of this embodiment is a low-time-complexity, high-efficiency scheduling technique with time complexity O(N²ML), where N is the number of tasks, M is the number of processing cores of the multi-core processor, and L is the maximum number of candidate shared-Cache-block counts over all tasks.

In the simulation experiments, the multi-core processor shared cache task scheduling method proposed by the present invention is named SCAS and is compared mainly with the classic task scheduling algorithm MIN-MIN. To better understand the performance of SCAS, the experiments also evaluate a slightly modified variant that, when selecting the optimal candidate, chooses the task/processing-core pair with the maximum rather than the minimum earliest execution completion time; this variant is named MSCAS. The main performance metrics are schedule length (makespan) and average response time.

The experimental results are shown in Fig. 3; all data points are averages over multiple runs. Fig. 3 shows the results for 60 to 200 tasks on a 4-core processor with 28 shared Cache blocks. As the figure shows, SCAS outperforms MIN-MIN and MSCAS in both schedule length and average response time. Specifically, the average schedule length of SCAS is 16.1% shorter than that of MSCAS and 8.37% shorter than that of MIN-MIN, and the average response time of SCAS is 149% better than that of MSCAS and 10.5% better than that of MIN-MIN. These results show that the shared cache task scheduling method can effectively improve the performance of multi-core processors.

In summary, the multi-core processor shared cache task scheduling method proposed by the present invention overcomes the loss of concurrent performance that existing multi-core processors suffer because of the shared Cache, and effectively improves multi-core processor performance.

Although the content of the present invention has been described in detail through the above embodiments, the above description should not be regarded as limiting the present invention. Various modifications and substitutions of the present invention will be apparent to those skilled in the art after reading the above content. Therefore, the protection scope of the present invention shall be defined by the appended claims.

Claims (6)

1. A multi-core processor shared cache task scheduling method, characterized in that the method comprises the following steps:
Step 1: partitioning the shared cache of the multi-core processor system into Cache blocks: first dividing the shared Cache into several Cache pages according to the column address space, and then dividing the shared Cache into Cache blocks composed of Cache pages;
Step 2: initializing the earliest start time of each task, the earliest execution completion time of each processing core, the number of shared Cache blocks owned by each processing core, and the number of shared Cache blocks available to the system;
Step 3: for each task in the task queue of the multi-core processor system, examining every processing core in the system according to the number of shared Cache blocks required for the task's execution: if the number of shared Cache blocks available to the system plus the number of shared Cache blocks owned by the processing core is not less than the number of shared Cache blocks required by the task, computing the earliest execution completion time of the task on that processing core, and otherwise not computing it; moving on to the next task after all processing cores have been traversed, until all tasks have been examined;
Step 4: based on the result of Step 3, determining whether there is a processing core able to execute a task in the task queue, i.e. whether an earliest execution completion time has been computed for any task on any processing core; if so, performing Step 6, and otherwise performing Step 5;
Step 5: querying the execution completion times of all processing cores for their current tasks, finding the processing core whose remaining execution time is shortest, updating this core's execution completion time so that it is no longer the earliest among all processing cores, waiting for this core to finish its task, and then releasing the shared Cache blocks owned by this core, wherein the number of shared Cache blocks available to the multi-core processor system is updated to the original number plus the number of shared Cache blocks owned by this core and the number of shared Cache blocks owned by this core is set to 0; going to Step 7;
Step 6: from the earliest execution completion time of each task on the corresponding processing core obtained in Step 3, finding the earliest execution completion time together with the corresponding task v_i and processing core p_k; the system assigning task v_i to processing core p_k, updating the earliest execution completion time of p_k to the earliest execution completion time of task v_i on p_k, updating the number of shared Cache blocks owned by p_k to the number of shared Cache blocks originally owned by p_k plus the number of shared Cache blocks required by task v_i, and updating the number of available shared Cache blocks owned by the multi-core processor to the original number minus the number of shared Cache blocks required by the task; going to Step 7;
Step 7: checking whether any task is still waiting to be scheduled in the task queue; if no task remains, outputting the sequence of task/processing-core scheduling pairs, and otherwise returning to Step 3 to recompute the earliest execution completion times of all tasks on the processing cores and repeating until all tasks have been scheduled.

2. The multi-core processor shared cache task scheduling method according to claim 1, characterized in that in Step 1 the size of a Cache page is 512 B.

3. The multi-core processor shared cache task scheduling method according to claim 1, characterized in that in Step 1 the capacity of a Cache block = shared Cache capacity / (number of processor cores × 10).

4. The multi-core processor shared cache task scheduling method according to claim 1, characterized in that in Step 3, when each task in the task queue of the multi-core processor system is submitted, it also submits the numbers of Cache blocks it can use and the corresponding execution times.

5. The multi-core processor shared cache task scheduling method according to claim 1, characterized in that in Step 5, after the processing core with the shortest remaining execution completion time of the current task has been found, the execution completion time of this core is updated to the third earliest execution completion time among all processing cores.

6. The multi-core processor shared cache task scheduling method according to claim 1, characterized in that in Step 7, checking whether any task is still waiting to be scheduled in the task queue means checking whether the task queue is empty.
CN201410537569.1A 2014-10-13 2014-10-13 Method for task scheduling of shared cache of multi-core processor Expired - Fee Related CN104281495B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410537569.1A CN104281495B (en) 2014-10-13 2014-10-13 Method for task scheduling of shared cache of multi-core processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410537569.1A CN104281495B (en) 2014-10-13 2014-10-13 Method for task scheduling of shared cache of multi-core processor

Publications (2)

Publication Number Publication Date
CN104281495A CN104281495A (en) 2015-01-14
CN104281495B true CN104281495B (en) 2017-04-26

Family

ID=52256396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410537569.1A Expired - Fee Related CN104281495B (en) 2014-10-13 2014-10-13 Method for task scheduling of shared cache of multi-core processor

Country Status (1)

Country Link
CN (1) CN104281495B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9582329B2 (en) * 2015-02-17 2017-02-28 Qualcomm Incorporated Process scheduling to improve victim cache mode
CN105468567B (en) * 2015-11-24 2018-02-06 无锡江南计算技术研究所 A kind of discrete memory access optimization method of isomery many-core
CN107329822B (en) * 2017-01-15 2022-01-28 齐德昱 Multi-core scheduling method based on hyper task network and oriented to multi-source multi-core system
CN106789447B (en) * 2017-02-20 2019-11-26 成都欧飞凌通讯技术有限公司 The not method of packet loss is realized when super finite automata figure changes in a kind of multicore
CN109766168B (en) * 2017-11-09 2023-01-17 阿里巴巴集团控股有限公司 Task scheduling method and device, storage medium and computing equipment
US10642657B2 (en) * 2018-06-27 2020-05-05 The Hong Kong Polytechnic University Client-server architecture for multicore computer system to realize single-core-equivalent view
CN109144720A (en) * 2018-07-13 2019-01-04 哈尔滨工程大学 A kind of multi-core processor task schedule selection method based on shared resource sensitivity
CN111582629B (en) * 2020-03-24 2023-11-17 青岛奥利普奇智智能工业技术有限公司 Resource scheduling method, device, equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010093003A1 (en) * 2009-02-13 2010-08-19 日本電気株式会社 Calculation resource allocation device, calculation resource allocation method, and calculation resource allocation program
JP2011059777A (en) * 2009-09-07 2011-03-24 Toshiba Corp Task scheduling method and multi-core system
CN102591843A (en) * 2011-12-30 2012-07-18 中国科学技术大学苏州研究院 Inter-core communication method for multi-core processor
CN103440173A (en) * 2013-08-23 2013-12-11 华为技术有限公司 Scheduling method and related devices of multi-core processors
CN103440223A (en) * 2013-08-29 2013-12-11 西安电子科技大学 Layering system for achieving caching consistency protocol and method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于NUCA结构的同构单芯片多处理器 (Homogeneous single-chip multiprocessor based on the NUCA structure); 陈宏铭, 林昶志, 陈麒安; 《中国集成电路》 (China Integrated Circuit); 2011-11-05; pp. 32-54 *

Also Published As

Publication number Publication date
CN104281495A (en) 2015-01-14

Similar Documents

Publication Publication Date Title
CN104281495B (en) Method for task scheduling of shared cache of multi-core processor
CN103034614B (en) Single task multi-core dispatching method based on critical path and Task Duplication
CN103235742B (en) Dependency-based parallel task grouping scheduling method on multi-core cluster server
US8893145B2 (en) Method to reduce queue synchronization of multiple work items in a system with high memory latency between processing nodes
CN110704360A (en) Graph calculation optimization method based on heterogeneous FPGA data flow
US20140304491A1 (en) Processor system and accelerator
CN111476344A (en) Multipath neural network, method for resource allocation and multipath neural network analyzer
CN111190735B (en) On-chip CPU/GPU pipelining calculation method based on Linux and computer system
CN102866912A (en) Single-instruction-set heterogeneous multi-core system static task scheduling method
US20170228319A1 (en) Memory-Constrained Aggregation Using Intra-Operator Pipelining
CN113553103B (en) Multi-core parallel scheduling method based on CPU+GPU heterogeneous processing platform
CN109445565B (en) A GPU Quality of Service Guarantee Method Based on Streaming Multiprocessor Core Exclusive and Reservation
WO2023051505A1 (en) Job solving method and apparatus
CN111104211A (en) Method, system, device and medium for computing offloading based on task dependency
CN102306205A (en) Method and device for allocating transactions
CN103257900B (en) Real-time task collection method for obligating resource on the multiprocessor that minimizing CPU takies
CN101593132A (en) Multi-core parallel simulated annealing method based on thread constructing module
CN113407352A (en) Method, processor, device and readable storage medium for processing task
CN112925616A (en) Task allocation method and device, storage medium and electronic equipment
CN108132834B (en) Method and system for task allocation under a multi-level shared cache architecture
CN107870871A (en) Method and device for allocating cache
CN104572501A (en) Access trace locality analysis-based shared buffer optimization method in multi-core environment
CN106325995A (en) GPU resource distribution method and system
CN105430074B (en) Optimization method and system based on the distribution storage of the cloud data of data dependency and visit capacity
CN100589080C (en) CMP Task Allocation Method Based on Hypercube Structure

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170426

Termination date: 20171013

CF01 Termination of patent right due to non-payment of annual fee