CN106933663A - Multithread scheduling method and system for many-core systems - Google Patents

Info

Publication number
CN106933663A
Authority
CN
China
Prior art keywords
communication cost
core
thread
traffic
processor core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710132627.6A
Other languages
Chinese (zh)
Other versions
CN106933663B (en)
Inventor
沈欢
胡威
唐玉馨
刘小明
戴文丽
马梦东
张凯
刘俊
吕晴阳
刘丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Science and Engineering WUSE
Wuhan University of Science and Technology WHUST
Original Assignee
Wuhan University of Science and Engineering WUSE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Science and Engineering WUSE
Priority to CN201710132627.6A
Publication of CN106933663A
Application granted
Publication of CN106933663B
Legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Multi Processors (AREA)

Abstract

The invention discloses a multithread scheduling method for many-core systems. The method comprises: obtaining the communication cost between a first processor core and a second processor core in a preset processor core set; obtaining the first traffic between every two threads in a preset first multithread set; obtaining, from the first traffic, the second traffic of each single thread, where the second traffic is the sum of the traffic from that thread to every thread in the first multithread set and the traffic from every thread in the first multithread set to that thread; and sorting all processor cores by communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost. The method and system provided by the invention address the technical problem that prior-art many-core systems built on network-on-chip interconnects have low task execution efficiency and long execution times.

Description

Multithread scheduling method and system for many-core systems
Technical field
The present invention relates to the field of computer technology, and in particular to a multithread scheduling method and system for many-core systems.
Background
With the development of computer technology, multi-core processors have also advanced considerably. Early symmetric multiprocessors (SMP) mostly gathered a group of CPUs in one computer, with the CPUs sharing the memory subsystem and the bus structure. Later, with the introduction of nanometer fabrication technology, SMP evolved into the chip multiprocessor (CMP), in which multiple processing cores are integrated on a single chip, forming what we now call the multi-core processor. The cores directly share caches and the bus structure, which reduces wire delay and improves communication efficiency. As the number of processor cores in a multi-core system keeps growing, many-core systems appear, which contain even more processor cores.
Current efficient on-chip communication mechanisms generally include cache structures based on a shared bus and interconnect structures based on a network-on-chip. In a shared-bus cache structure, every processing core has a shared level-2 or level-3 cache that holds frequently used data, and the cores communicate over the bus. The advantages of such a system are a simple structure and fast communication; the disadvantage is poor scalability. A shared bus clearly cannot meet the needs of large-scale systems. Applying interconnection networks to system-on-chip design, to solve communication between on-chip components, yields the network-on-chip. Network-on-chip (NoC) technology supports simultaneous access and offers high reliability and high reusability, and is therefore regarded as a better interconnect technology for large-scale CMPs. The network-on-chip overcomes the poor scalability of bus structures and provides a feasible on-chip communication mechanism for the billion-transistor era.
In implementing the technical solution, the inventors of the present application found at least the following problems in the prior art:
In current many-core systems, the large number of processor cores greatly increases parallelism and sharply increases inter-core traffic, so that processors shift from "compute-intensive" to "communication-intensive". Communication methods for existing many-core systems generally consider only specific architectural characteristics. Although NoC-based interconnect structures overcome the poor scalability of bus structures to some extent, they do not analyze the multithreaded tasks running on the many-core system. The many-core system is therefore poorly utilized, which leads to low task execution efficiency and long execution times.
It can be seen that prior-art many-core systems built on network-on-chip interconnects have the technical problem of low task execution efficiency and long execution times.
Summary of the invention
The embodiments of the invention provide a multithread scheduling method and system for many-core systems, to address the technical problem that prior-art many-core systems built on network-on-chip interconnects have low task execution efficiency and long execution times.
In a first aspect, the invention discloses a multithread scheduling method for many-core systems, the method comprising:
obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;
obtaining the first traffic between every two threads in a preset first multithread set;
obtaining, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to every thread in the first multithread set and the traffic from every thread in the first multithread set to the thread, the thread being any thread in the first multithread set;
sorting all processor cores by communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.
Optionally, obtaining the communication cost between the first processor core and the second processor core in the preset processor core set comprises:
obtaining a first communication cost from the first processor core to the second processor core;
obtaining a second communication cost from the second processor core to the first processor core;
taking the sum of the first communication cost and the second communication cost as the communication cost.
Optionally, obtaining the first communication cost from the first processor core to the second processor core comprises:
obtaining the number of physical paths from the first processor core to the second processor core;
obtaining a third communication cost for each physical path from the first processor core to the second processor core;
summing the third communication costs to obtain a third communication cost sum;
taking the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
Optionally, sorting all processor cores by communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost comprises:
sorting all processor cores by communication cost to obtain a communication cost set;
sorting all threads by second traffic to obtain a second multithread set;
scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
Optionally, before scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
if the second multithread set is not empty and the communication cost set is not empty,
determining whether the processor core with the smallest communication cost in the communication cost set has already been allocated;
if the processor core has not been allocated, scheduling the thread onto the processor core.
Optionally, after scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
deleting the processor core with the smallest communication cost from the communication cost set;
deleting the thread with the largest second traffic from the second multithread set.
Based on the same inventive concept, the invention also provides a multithread scheduling system for many-core systems, the system comprising:
a first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;
a second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithread set;
a third acquisition module, configured to obtain, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to every thread in the first multithread set and the traffic from every thread in the first multithread set to the thread, the thread being any thread in the first multithread set;
a scheduling module, configured to sort all processor cores by communication cost and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
Optionally, the first acquisition module is further configured to:
obtain a first communication cost from the first processor core to the second processor core;
obtain a second communication cost from the second processor core to the first processor core;
take the sum of the first communication cost and the second communication cost as the communication cost.
Optionally, obtaining the first communication cost from the first processor core to the second processor core comprises:
obtaining the number of physical paths from the first processor core to the second processor core;
obtaining a third communication cost for each physical path from the first processor core to the second processor core;
summing the third communication costs to obtain a third communication cost sum;
taking the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
Optionally, the scheduling module is further configured to:
sort all processor cores by communication cost to obtain a communication cost set;
sort all threads by second traffic to obtain a second multithread set;
schedule the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
The one or more technical solutions provided in the embodiments of the invention have at least the following technical effects or advantages:
The multithread scheduling method and system for many-core systems provided by the embodiments of the application first obtain the communication cost between the first processor core and the second processor core in the preset processor core set, and obtain the first traffic between every two threads in the preset first multithread set; then obtain the second traffic of each single thread from the first traffic; and finally sort all processor cores by communication cost and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost. The application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads. This can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It addresses the technical problem that prior-art many-core systems built on network-on-chip interconnects have low task execution efficiency and long execution times.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features and advantages of the invention are more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
To explain the embodiments of the invention or the prior-art solutions more clearly, the drawings needed to describe them are briefly introduced below. The drawings described below are some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the multithread scheduling method for many-core systems in an embodiment of the invention;
Fig. 2 is a logical structure diagram of the multithread scheduling system for many-core systems in an embodiment of the invention.
Detailed description of the embodiments
The embodiments of the invention provide a multithread scheduling method and system for many-core systems, to address the technical problem that prior-art many-core systems built on network-on-chip interconnects have low task execution efficiency and long execution times. They achieve the technical effect of shortening the execution time of multithreaded tasks and increasing execution speed.
The general idea of the technical solution in the embodiments of the application is as follows:
A multithread scheduling method for many-core systems, the method comprising:
obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;
obtaining the first traffic between every two threads in a preset first multithread set;
obtaining, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to every thread in the first multithread set and the traffic from every thread in the first multithread set to the thread, the thread being any thread in the first multithread set;
sorting all processor cores by communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.
The above method first obtains the communication cost between the first processor core and the second processor core in the preset processor core set, and obtains the first traffic between every two threads in the preset first multithread set; then obtains the second traffic of each single thread from the first traffic; and finally sorts all processor cores by communication cost and schedules the thread with the largest second traffic onto the processor core with the smallest communication cost. The application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads. This can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed, and addresses the technical problem that prior-art many-core systems built on network-on-chip interconnects have low task execution efficiency and long execution times.
To make the purpose, technical solution and advantages of the embodiments of the invention clearer, the technical solution in the embodiments is described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Embodiment 1
This embodiment provides a multithread scheduling method for many-core systems, the method comprising:
Step S101: obtain the communication cost between a first processor core and a second processor core in a preset processor core set;
Step S102: obtain the first traffic between every two threads in a preset first multithread set;
Step S103: obtain, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to every thread in the first multithread set and the traffic from every thread in the first multithread set to the thread;
Step S104: sort all processor cores by communication cost, and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
In the above method, all processor cores are sorted by communication cost and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. Because the application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads, it can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. This addresses the technical problem that prior-art many-core systems built on network-on-chip interconnects have low task execution efficiency and long execution times.
It should be noted that, in this application, steps S101 and S102 may be performed in either order: step S101 may be performed first, or step S102 may be performed first.
The multithread scheduling method provided by the application is described in detail below with reference to Fig. 1.
First, step S101 is performed: obtain the communication cost between the first processor core and the second processor core in the preset processor core set.
In the embodiments of the application, the processor core set contains multiple processor cores; the specific number is not limited. The first processor core and the second processor core are any two processor cores in the set, so the communication cost between the first and second processor cores is itself a set of values.
Next, step S102 is performed: obtain the first traffic between every two threads in the preset first multithread set.
In the embodiments of the application, the first multithread set contains multiple threads; the specific number is not limited. What is obtained is the traffic between any two threads.
Then, step S103 is performed: obtain, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to every thread in the first multithread set and the traffic from every thread in the first multithread set to the thread.
In the embodiments of the application, since the first traffic between every two threads has been obtained, the total traffic of each thread can be calculated from the first traffic.
Finally, step S104 is performed: sort all processor cores by communication cost, and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
Specifically, in the multithread scheduling method provided by the embodiments of the invention, obtaining the communication cost between the first processor core and the second processor core in the preset processor core set comprises:
obtaining a first communication cost from the first processor core to the second processor core;
obtaining a second communication cost from the second processor core to the first processor core;
taking the sum of the first communication cost and the second communication cost as the communication cost.
In the multithread scheduling method provided by the embodiments of the invention, obtaining the first communication cost from the first processor core to the second processor core comprises:
obtaining the number of physical paths from the first processor core to the second processor core;
obtaining a third communication cost for each physical path from the first processor core to the second processor core;
summing the third communication costs to obtain a third communication cost sum;
taking the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
In the multithread scheduling method provided by the embodiments of the invention, sorting all processor cores by communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost comprises:
sorting all processor cores by communication cost to obtain a communication cost set;
sorting all threads by second traffic to obtain a second multithread set;
scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
In the multithread scheduling method provided by the embodiments of the invention, before scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
if the second multithread set is not empty and the communication cost set is not empty,
determining whether the processor core with the smallest communication cost in the communication cost set has already been allocated;
if the processor core has not been allocated, scheduling the thread onto the processor core.
In the multithread scheduling method provided by the embodiments of the invention, after scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
deleting the processor core with the smallest communication cost from the communication cost set;
deleting the thread with the largest second traffic from the second multithread set.
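The sort, check, schedule and delete sequence described above can be sketched as a short loop. This is only an illustrative sketch, not the patented implementation: it assumes each core already carries a single scalar cost (the text does not say at this point how a per-core value is derived from the pairwise costs), and the names `greedy_schedule`, `core_cost`, `thread_traffic` and `already_allocated` are hypothetical.

```python
def greedy_schedule(core_cost, thread_traffic, already_allocated=frozenset()):
    """Pair the busiest threads with the cheapest cores.

    core_cost: {core_id: scalar cost} (assumed to be precomputed).
    thread_traffic: {thread_id: second traffic}.
    already_allocated: cores that must not receive a thread.
    """
    # Communication cost set: cores in ascending order of cost.
    cores = sorted(core_cost, key=lambda c: (core_cost[c], c))
    # Second multithread set: threads in descending order of traffic.
    threads = sorted(thread_traffic, key=lambda t: (-thread_traffic[t], t))
    busy = set(already_allocated)
    assignment = {}
    while threads and cores:          # both sets must be non-empty
        core = cores.pop(0)           # delete the cheapest core from the set
        if core in busy:              # already allocated: try the next core
            continue
        thread = threads.pop(0)       # delete the busiest thread from the set
        assignment[thread] = core
        busy.add(core)
    return assignment


core_cost = {"C0": 3, "C1": 1, "C2": 2}
thread_traffic = {"T0": 10, "T1": 30, "T2": 20}
result = greedy_schedule(core_cost, thread_traffic)
result_skip = greedy_schedule(core_cost, thread_traffic, already_allocated={"C2"})
```

With the made-up inputs above, the busiest thread T1 lands on the cheapest core C1; when C2 is marked as already allocated, it is skipped and the next thread goes to C0.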
To illustrate the implementation of the multithread scheduling method provided by the invention more clearly, a complete worked example is explained below.
In a concrete implementation, a many-core system M with m processor cores C0, C1, …, Cm-1 can be expressed as M = {C0, C1, …, Cm-1}. For any two processor cores Ca and Cb, one or more physical paths connecting Ca and Cb can be found. For processor cores Ca and Cb connected by d physical paths, the average communication cost from Ca to Cb is denoted Rab:
$$R_{ab} = \frac{1}{d}\sum_{h=1}^{d} R_{ab}(h) \qquad (1)$$
In formula (1), Rab(h) denotes the communication cost of the h-th physical path between processor cores Ca and Cb. The communication costs between all processor cores in M, arranged in descending order, form the communication cost set CC of M, which can be represented by the two-dimensional table shown in Table 1 below:
Table 1
        C0        C1        C2        …    Cm-1
C0      0         R01       R02       …    R0,m-1
C1      R10       0         R12       …    R1,m-1
C2      R20       R21       0         …    R2,m-1
…       …         …         …         0    …
Cm-1    Rm-1,0    Rm-1,1    Rm-1,2    …    0
In Table 1, "0" indicates that a processor core has no communication cost over a physical path to itself, i.e., its communication cost to itself is 0.
The communication cost between processor cores Ca and Cb is then:
$$R(C_a C_b) = R_{ab} + R_{ba} \qquad (2)$$
In formula (2), Rab is the average communication cost from processor core Ca to processor core Cb, and Rba is the average communication cost from processor core Cb to processor core Ca; Ca and Cb correspond to the first processor core and the second processor core described above.
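As a small illustration of formulas (1) and (2), the sketch below averages the per-path costs in each direction and adds the two averages. The function names and the per-path cost values are invented for the example; how per-path costs are measured is not specified here.

```python
def average_path_cost(path_costs):
    """Formula (1): mean cost over the d physical paths in one direction."""
    if not path_costs:
        raise ValueError("at least one physical path is required")
    return sum(path_costs) / len(path_costs)


def total_cost(paths_ab, paths_ba):
    """Formula (2): R(CaCb) = Rab + Rba."""
    return average_path_cost(paths_ab) + average_path_cost(paths_ba)


# Hypothetical per-path costs: three paths from Ca to Cb, two from Cb to Ca.
paths_ab = [2.0, 4.0, 6.0]
paths_ba = [3.0, 5.0]

r_ab = average_path_cost(paths_ab)        # (2 + 4 + 6) / 3
r_ba = average_path_cost(paths_ba)        # (3 + 5) / 2
r_total = total_cost(paths_ab, paths_ba)  # r_ab + r_ba
```

Keeping the two directions separate until the final sum mirrors the first and second communication costs in the method, and allows the cost from Ca to Cb to differ from the cost from Cb to Ca.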
To make the communication costs of the processor cores easier to handle, the total communication costs between processor cores can be arranged in ascending order to construct the communication cost set CCT = {R0, R1, …, Rp-1}, where p is the number of elements in the total communication cost set, i.e., the number of unordered pairs of processor cores:
$$p = \binom{m}{2} = \frac{m(m-1)}{2} \qquad (3)$$
In formula (3), m is the number of processor cores; for m = 8 this gives p = 28. Each element of the communication cost set corresponds to two processor cores, and these two cores could be assigned in any order during scheduling. To further optimize the scheduling method, the application handles ties as follows: if two total communication costs in the set are equal, the one whose processor cores have the smaller indices is sorted first.
For example, for a many-core system M with 8 processor cores, M = {C0, C1, C2, C3, C4, C5, C6, C7}, the communication cost set CC between the processing cores is:
      C0  C1  C2  C3  C4  C5  C6  C7
C0    0   1   2   3   4   5   6   7
C1    1   0   1   2   3   4   5   6
C2    2   1   0   1   2   3   4   5
C3    3   2   1   0   1   2   3   4
C4    4   3   2   1   0   1   2   3
C5    5   4   3   2   1   0   1   2
C6    6   5   4   3   2   1   0   1
C7    7   6   5   4   3   2   1   0
Communication cost set CCT = {R0, R1, …, R27}:
R0 = 2  (C0, C1)    R7 = 4   (C0, C2)    R14 = 6  (C1, C4)    R21 = 8   (C3, C7)
R1 = 2  (C1, C2)    R8 = 4   (C1, C3)    R15 = 6  (C2, C5)    R22 = 10  (C0, C5)
R2 = 2  (C2, C3)    R9 = 4   (C2, C4)    R16 = 6  (C3, C6)    R23 = 10  (C1, C6)
R3 = 2  (C3, C4)    R10 = 4  (C3, C5)    R17 = 6  (C4, C7)    R24 = 10  (C2, C7)
R4 = 2  (C4, C5)    R11 = 4  (C4, C6)    R18 = 8  (C0, C4)    R25 = 12  (C0, C6)
R5 = 2  (C5, C6)    R12 = 4  (C5, C7)    R19 = 8  (C1, C5)    R26 = 12  (C1, C7)
R6 = 2  (C6, C7)    R13 = 6  (C0, C3)    R20 = 8  (C2, C6)    R27 = 14  (C0, C7)
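The listing above can be reproduced with a few lines of code. The sketch below assumes the one-way cost matrix of the 8-core example (cost equal to the index distance between cores) and interprets the tie-break rule as ordering equal-cost pairs by their core indices; `build_cct` is a hypothetical helper name.

```python
from itertools import combinations


def build_cct(cost):
    """Return [(total_cost, a, b), ...] for every core pair a < b, ascending.

    cost[a][b] is the average one-way cost Rab from core a to core b.
    Tuples sort by total cost first, then by the core indices, so pairs
    with equal cost are ordered by the smaller core indices.
    """
    m = len(cost)
    pairs = [(cost[a][b] + cost[b][a], a, b)
             for a, b in combinations(range(m), 2)]
    return sorted(pairs)


# One-way cost matrix of the 8-core example: Rab = |a - b|.
cost = [[abs(a - b) for b in range(8)] for a in range(8)]
cct = build_cct(cost)

for index, (total, a, b) in enumerate(cct[:3]):
    print(f"R{index} = {total}  (C{a}, C{b})")
```

The resulting list has m(m-1)/2 entries for m cores, one per unordered pair of cores.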
Next, the traffic between threads is calculated. Specifically, a first multithread set Δ with n threads T0, T1, T2, …, Tn-1 is expressed as Δ = {T0, T1, T2, …, Tn-1}. For any two threads Tl and Tk, TFlk denotes the first traffic from thread Tl to thread Tk. The first traffic between all threads in the first multithread set Δ can then be represented by the two-dimensional table shown in Table 2 below:
Table 2
        T0         T1         T2         …    Tn-1
T0      0          TF01       TF02       …    TF0,n-1
T1      TF10       0          TF12       …    TF1,n-1
T2      TF20       TF21       0          …    TF2,n-1
…       …          …          …          0    …
Tn-1    TFn-1,0    TFn-1,1    TFn-1,2    …    0
In Table 2, "0" indicates that there is no traffic between the two threads. The total traffic of any thread Ti, i.e., its second traffic TF(Ti), is then:
$$TF(T_i) = \sum_{j=0}^{n-1} TF_{ij} + \sum_{j=0}^{n-1} TF_{ji}$$
where $\sum_{j=0}^{n-1} TF_{ij}$ is the traffic from thread Ti to every thread in the first multithread set (from T0 to Tn-1), and $\sum_{j=0}^{n-1} TF_{ji}$ is the sum of the traffic from every thread in the first multithread set (from T0 to Tn-1) to thread Ti.
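In matrix terms, the second traffic of thread Ti is the sum of row i plus the sum of column i of Table 2. A minimal sketch follows, using an invented 3-thread traffic matrix; the function name `total_traffic` is hypothetical.

```python
def total_traffic(tf, i):
    """Second traffic of thread i: traffic it sends plus traffic it receives.

    tf[l][k] is the first traffic from thread l to thread k.
    """
    n = len(tf)
    sent = sum(tf[i][j] for j in range(n))      # row i
    received = sum(tf[j][i] for j in range(n))  # column i
    return sent + received


# Hypothetical first-traffic matrix for three threads (diagonal is 0).
tf = [
    [0, 5, 1],
    [2, 0, 4],
    [3, 0, 0],
]
totals = [total_traffic(tf, i) for i in range(len(tf))]
```

Every entry of the matrix is counted twice across all threads, once for its sender and once for its receiver, so the totals add up to twice the overall traffic.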
Once the second traffic of each single thread has been calculated, the threads in the multithread set Δ are arranged in descending order of total traffic, giving the multithread set Δ' = {T0', T1', T2', …, Tn-1'}. Preferably, if two threads have the same total traffic, the thread with the smaller index is sorted first.
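The ordering rule can be expressed as a sort with a two-part key: total traffic in descending order, then thread index in ascending order for ties. The sketch below uses made-up totals, and `order_threads` is a hypothetical name.

```python
def order_threads(totals):
    """Return thread indices sorted by total traffic, largest first.

    Threads with equal total traffic keep the smaller index first.
    """
    return sorted(range(len(totals)), key=lambda i: (-totals[i], i))


# Hypothetical second-traffic values for threads T0..T3.
totals = [11, 11, 8, 20]
order = order_threads(totals)  # position k holds the index of Tk'
```

Here T3 has the largest total and comes first, while T0 and T1 tie and are ordered by index.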
For the multithread set Δ = {T0, T1, T2, T3, T4, T5, T6, T7}, the traffic between threads is as shown in the following table:
From the table above, the multithread set Δ' = {T0', T1', T2', T3', T4', T5', T6', T7'} is obtained, where:
T0' is T0 in the multithread set Δ;
T1' is T7 in the multithread set Δ;
T2' is T1 in the multithread set Δ;
T3' is T6 in the multithread set Δ;
T4' is T2 in the multithread set Δ;
T5' is T5 in the multithread set Δ;
T6' is T3 in the multithread set Δ;
T7' is T4 in the multithread set Δ.
Next, a preferred thread scheduling method provided by the embodiments of the invention is described. The specific steps are as follows:
First, determine whether the communication cost set CCT and the second multithread set are empty. If either the communication cost set CCT or the second multithread set is empty, thread scheduling ends. If neither is empty, select the first-ranked element of the communication cost set CCT and denote it Rw; Rw is the total communication cost between two processor cores, which are denoted Cu and Cv. Remove Rw from the communication cost set CCT, and remove processor cores Cu and Cv from the many-core system M.
To further optimize the scheduling method, before scheduling it is also determined whether the processor cores Cu and Cv corresponding to the first-ranked element of CCT have already been allocated, and how many threads remain in the multithread set Δ'.
In specific implementation process, specific step is as described below:
Step 1: If the total communication cost set CCT is empty, there is no allocatable processor core and thread scheduling ends; if the total communication cost set CCT is not empty, go to step 2.
Step 2: Select the member ranked first from the total communication cost set CCT and denote it Rw; Rw is the total communication cost between two processor cores, which are denoted Cu and Cv. Remove Rw from the total communication cost set CCT. Remove processor cores Cu and Cv from the many-core system M.
Step 3: If neither Cu nor Cv has been allocated and the number of threads remaining in the multithread set Δ' is greater than or equal to 2, go to step 4; if only one of Cu and Cv has been allocated and the multithread set Δ' is not empty, go to step 5; if neither Cu nor Cv has been allocated and the number of threads remaining in the multithread set Δ' is equal to 1, go to step 5; if both Cu and Cv have been allocated, return to step 1.
Step 4: Select the two threads ranked first and second, Tx and Ty, from the multithread set Δ' and assign them to processor cores Cu and Cv; remove threads Tx and Ty from the multithread set Δ'. If the multithread set Δ' is not empty, return to step 1; if the multithread set Δ' is empty, wait for new threads to arrive, and thread scheduling ends.
Step 5: Select the thread ranked first, Tx, from the multithread set Δ' and assign it to the unallocated processor core: if processor core Cu has been allocated, thread Tx is assigned to Cv; if processor core Cv has been allocated, thread Tx is assigned to Cu. Remove thread Tx from the multithread set Δ'. If the multithread set Δ' is not empty, return to step 1; if the multithread set Δ' is empty, wait for new threads to arrive, and thread scheduling ends.
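Steps 1 to 5 can be sketched, purely for illustration, as the following Python loop. The data layout (a list of `(cost, core_u, core_v)` tuples and a dictionary of assignments) and all names are assumptions of this sketch. Two points are not fixed by the description and are chosen here arbitrarily: allocation is tracked in the `assignment` dictionary rather than by removing cores from M, and when both cores are free but only one thread remains, the thread is placed on Cu.

```python
def schedule(cct, threads):
    """Assign threads to cores following steps 1 to 5.

    cct     -- list of (total_cost, core_u, core_v), ascending by cost
    threads -- thread ids, descending by total traffic (the set Δ')
    Returns a dict mapping each allocated core to its thread.
    """
    cct = list(cct)          # work on copies; callers keep their inputs
    threads = list(threads)
    assignment = {}
    # Step 1: stop when no core pair (or no thread) is left.
    while cct and threads:
        # Step 2: take the first-ranked member Rw and its cores Cu, Cv.
        _cost, cu, cv = cct.pop(0)
        # Step 3: find which of the two cores are still unallocated.
        free = [core for core in (cu, cv) if core not in assignment]
        # Step 4 (two free cores, at least two threads) or
        # step 5 (one free core, or only one thread left).
        # If neither core is free, nothing happens: back to step 1.
        for core in free[:len(threads)]:
            assignment[core] = threads.pop(0)
    return assignment


# Hypothetical four-core example with three threads.
cct = [(2, "C0", "C1"), (2, "C2", "C3"), (4, "C0", "C2"),
       (4, "C1", "C3"), (6, "C0", "C3"), (6, "C1", "C2")]
print(schedule(cct, ["T0", "T7", "T1"]))
# {'C0': 'T0', 'C1': 'T7', 'C2': 'T1'}
```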
The process of scheduling the threads in the multithread set Δ' = {T0', T1', T2', T3', T4', T5', T6', T7'} onto the many-core system M = {C0, C1, C2, C3, C4, C5, C6, C7} is as follows, where the communication cost set CCT = {R0, R1, …, R27}:
1) The communication cost set CCT is not empty; go to step 2).
2) The member ranked first in the communication cost set CCT is R0, and the two corresponding processor cores are C0 and C1. Remove R0 from the communication cost set CCT. Remove processor cores C0 and C1 from the many-core system M.
3) Neither C0 nor C1 has been allocated, and the number of threads remaining in the multithread set Δ' is 8; go to step 4).
4) Select the two threads ranked first and second, T0' and T1', from the multithread set Δ' and assign them to processor cores C0 and C1; remove threads T0' and T1' from the multithread set Δ'. Return to step 1).
5) Repeat the above steps until all threads have been assigned. The allocation result is shown in the following table:
In the table above, thread T0' in the multithread set Δ' is scheduled onto processor core C0 of the many-core system M, that is, thread T0 in the multithread set Δ is scheduled onto processor core C0; thread T1' in the multithread set Δ' is scheduled onto processor core C1 of the many-core system M, that is, thread T7 in the multithread set Δ is scheduled onto processor core C1; and so on, until all threads in the multithread set Δ have been scheduled onto processor cores.
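The example above uses 8 processor cores and a communication cost set with 28 members (R0 to R27), which matches the number of unordered core pairs, 8 × 7 / 2 = 28. A minimal sketch of building such a sorted set is given below. The cost function is a hypothetical Manhattan hop count on a 2 × 4 mesh; the description does not state the topology or the cost values, so these are assumptions made only to obtain a runnable example.

```python
from itertools import combinations


def hop_cost(u, v, width=4):
    """Hypothetical directed cost: Manhattan hops on a mesh `width` wide."""
    (row_u, col_u), (row_v, col_v) = divmod(u, width), divmod(v, width)
    return abs(row_u - row_v) + abs(col_u - col_v)


def build_cct(cores, cost):
    """Return (total_cost, u, v) for every core pair, ascending by cost.

    The total cost of a pair is cost(u, v) + cost(v, u). The sort is
    stable, so pairs with equal cost keep their enumeration order.
    """
    entries = [(cost(u, v) + cost(v, u), u, v)
               for u, v in combinations(cores, 2)]
    return sorted(entries, key=lambda entry: entry[0])


cct = build_cct(range(8), hop_cost)
print(len(cct))  # 28
print(cct[0])    # (2, 0, 1)
```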
Based on the same inventive concept as Embodiment 1, Embodiment 2 of the present invention further provides a multithread scheduling system for many-core systems, the system comprising:
A first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set;
A second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithread set;
A third acquisition module, configured to obtain the second traffic of a single thread according to the first traffic, wherein the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread;
A scheduling module, configured to sort all processor cores according to the communication cost and to schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
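The second traffic computed by the third acquisition module can be illustrated with a short Python sketch. It assumes the first traffic is held in a square matrix where `traffic[i][j]` is the traffic from thread i to thread j, and it skips the diagonal (a thread's traffic to itself); both the representation and the sample values are assumptions of this sketch.

```python
def second_traffic(traffic):
    """Return, per thread, outgoing plus incoming traffic to all others.

    traffic[i][j] is the first traffic from thread i to thread j.
    """
    n = len(traffic)
    totals = []
    for i in range(n):
        outgoing = sum(traffic[i][j] for j in range(n) if j != i)
        incoming = sum(traffic[j][i] for j in range(n) if j != i)
        totals.append(outgoing + incoming)
    return totals


# Hypothetical first traffic between three threads.
traffic = [[0, 4, 1],
           [2, 0, 3],
           [5, 0, 0]]
print(second_traffic(traffic))  # [12, 9, 9]
```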
Optionally, the first acquisition module is further configured to:
Obtain the first communication cost from the first processor core to the second processor core;
Obtain the second communication cost from the second processor core to the first processor core;
Take the sum of the first communication cost and the second communication cost as the communication cost.
Optionally, obtaining the first communication cost from the first processor core to the second processor core comprises:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining the third communication cost of each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost sum;
Taking the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
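In other words, the directed cost between two cores is the mean of the per-path costs, and the communication cost of a core pair is the sum of the two directed costs. A minimal sketch follows; the function names and the sample path costs are hypothetical, and the sketch assumes the second communication cost is obtained in the same way as the first.

```python
def directed_cost(path_costs):
    """Mean cost over all physical paths from one core to another."""
    if not path_costs:
        raise ValueError("at least one physical path is required")
    return sum(path_costs) / len(path_costs)


def pair_cost(forward_paths, reverse_paths):
    """Communication cost of a core pair: forward plus reverse cost."""
    return directed_cost(forward_paths) + directed_cost(reverse_paths)


# Hypothetical example: two forward paths costing 3 and 5,
# and one reverse path costing 4.
print(pair_cost([3, 5], [4]))  # 8.0
```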
Optionally, the scheduling module is further configured to:
Sort all processor cores according to the communication cost to obtain a communication cost set;
Sort all threads according to the second traffic to obtain a second multithread set;
Schedule the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
The system described in Embodiment 2 of the present invention is the system used to implement the thread scheduling method of Embodiment 1. Therefore, based on the method described in Embodiment 1, those skilled in the art can understand the specific structure and variations of the system, so the details are not repeated here. Any system used to implement the method of Embodiment 1 falls within the scope of protection of the present invention.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
In the multithread scheduling method and system for many-core systems provided by the embodiments of the present application, the communication cost between a first processor core and a second processor core in a preset processor core set is first obtained, and the first traffic between every two threads in a preset first multithread set is obtained. The second traffic of each single thread is then obtained from the first traffic. Finally, all processor cores are sorted according to the communication cost, and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. The application analyzes and optimizes from the perspective of the execution flow of multithreaded tasks, taking into account both the communication cost between processor cores and the traffic between threads. This can improve the utilization efficiency of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It addresses the technical problem in the prior art that many-core systems built on a network-on-chip interconnect suffer from low task execution efficiency and long execution times.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they grasp the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the embodiments of the present invention without departing from the spirit and scope of those embodiments. Thus, if these modifications and variations of the embodiments fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (10)

1. A multithread scheduling method for many-core systems, characterized in that the method comprises:
Obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, wherein the first processor core and the second processor core are any two processor cores in the processor core set;
Obtaining the first traffic between every two threads in a preset first multithread set;
Obtaining the second traffic of a single thread according to the first traffic, wherein the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread, the thread being any thread in the first multithread set;
Sorting all processor cores according to the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.
2. The method of claim 1, characterized in that obtaining the communication cost between the first processor core and the second processor core in the preset processor core set comprises:
Obtaining the first communication cost from the first processor core to the second processor core;
Obtaining the second communication cost from the second processor core to the first processor core;
Taking the sum of the first communication cost and the second communication cost as the communication cost.
3. The method of claim 2, characterized in that obtaining the first communication cost from the first processor core to the second processor core comprises:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining the third communication cost of each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost sum;
Taking the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
4. The method of claim 1, characterized in that sorting all processor cores according to the communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost comprises:
Sorting all processor cores according to the communication cost to obtain a communication cost set;
Sorting all threads according to the second traffic to obtain a second multithread set;
Scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
5. The method of claim 4, characterized in that, before scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
If the second multithread set is not empty and the communication cost set is not empty,
Determining whether the processor core with the smallest communication cost in the communication cost set has been allocated;
If the processor core has not been allocated, scheduling the thread onto the processor core.
6. The method of claim 4, characterized in that, after scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
Deleting the processor core with the smallest communication cost from the communication cost set;
Deleting the thread with the largest second traffic from the second multithread set.
7. A multithread scheduling system for many-core systems, characterized in that the system comprises:
A first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set, wherein the first processor core and the second processor core are any two processor cores in the processor core set;
A second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithread set;
A third acquisition module, configured to obtain the second traffic of a single thread according to the first traffic, wherein the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread, the thread being any thread in the first multithread set;
A scheduling module, configured to sort all processor cores according to the communication cost and to schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
8. The system of claim 7, characterized in that the first acquisition module is further configured to:
Obtain the first communication cost from the first processor core to the second processor core;
Obtain the second communication cost from the second processor core to the first processor core;
Take the sum of the first communication cost and the second communication cost as the communication cost.
9. The system of claim 8, characterized in that obtaining the first communication cost from the first processor core to the second processor core comprises:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining the third communication cost of each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost sum;
Taking the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
10. The system of claim 7, characterized in that the scheduling module is further configured to:
Sort all processor cores according to the communication cost to obtain a communication cost set;
Sort all threads according to the second traffic to obtain a second multithread set;
Schedule the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
CN201710132627.6A 2017-03-07 2017-03-07 A kind of multithread scheduling method and system towards many-core system Expired - Fee Related CN106933663B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710132627.6A CN106933663B (en) 2017-03-07 2017-03-07 A kind of multithread scheduling method and system towards many-core system

Publications (2)

Publication Number Publication Date
CN106933663A true CN106933663A (en) 2017-07-07
CN106933663B CN106933663B (en) 2019-07-23

Family

ID=59424539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710132627.6A Expired - Fee Related CN106933663B (en) 2017-03-07 2017-03-07 A kind of multithread scheduling method and system towards many-core system

Country Status (1)

Country Link
CN (1) CN106933663B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102193779A (en) * 2011-05-16 2011-09-21 武汉科技大学 MPSoC (multi-processor system-on-chip)-oriented multithread scheduling method
CN103838631A (en) * 2014-03-11 2014-06-04 武汉科技大学 Multi-thread scheduling realization method oriented to network on chip
US20160062798A1 (en) * 2014-09-01 2016-03-03 Samsung Electronics Co., Ltd. System-on-chip including multi-core processor and thread scheduling method thereof

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109709806A (en) * 2018-12-27 2019-05-03 杭州铭展网络科技有限公司 A kind of self-adapting data acquisition system
CN109709806B (en) * 2018-12-27 2022-07-19 杭州铭展网络科技有限公司 Self-adaptive data acquisition system

Also Published As

Publication number Publication date
CN106933663B (en) 2019-07-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190723

Termination date: 20200307
