CN106933663A - Multithread scheduling method and system for many-core systems - Google Patents
- Publication number
- CN106933663A CN201710132627.6A
- Authority
- CN
- China
- Prior art keywords
- communication cost
- core
- thread
- traffic
- processor core
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Multi Processors (AREA)
Abstract
The invention discloses a multithread scheduling method for many-core systems. The method includes: obtaining the communication cost between a first processor core and a second processor core in a preset processor core set; obtaining the first traffic between every two threads in a preset first multithread set; obtaining, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread; and sorting all processor cores by communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost. The multithread scheduling method and system provided by the invention solve the technical problem that prior-art many-core systems built on a network-on-chip interconnect structure suffer from low task execution efficiency and long execution times.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a multithread scheduling method and system for many-core systems.
Background
With the development of computer technology, multi-core processors have also advanced considerably. Early symmetric multiprocessors (SMP) mostly gathered a group of CPUs in one computer, with the CPUs sharing a memory subsystem and a bus structure. Later, with the introduction of nanometer-scale fabrication, SMP began to evolve into the chip multiprocessor (Chip Multiprocessor, CMP), in which multiple processing cores are integrated on a single chip; this is what is now called a multi-core processor. The cores directly share caches and a bus structure, which reduces wire delay and improves communication efficiency. As the number of processor cores in a multi-core system keeps increasing, many-core systems emerge. A many-core system contains more processor cores.
Current efficient on-chip communication mechanisms generally include cache structures based on a shared bus and interconnect structures based on a network-on-chip. In a cache structure based on a shared bus, the processing cores share a level-2 or level-3 cache that holds frequently used data, and they communicate over the bus. The advantages of such a system are a simple structure and fast communication; the disadvantage is poor scalability. A shared bus clearly cannot meet the needs of large-scale systems. Applying an interconnection network to system-on-chip design to solve communication between on-chip components gives the network-on-chip. Network-on-chip (Network On Chip, NoC) technology supports simultaneous access and offers high reliability and high reusability, and is therefore considered a better interconnect technology for large-scale CMPs. The network-on-chip overcomes the poor scalability of bus structures and provides a feasible on-chip communication mechanism for the billion-transistor era.
In implementing the technical solution, the inventors of the present application found that the prior art has at least the following problem:
In current many-core systems, the large number of processor cores greatly increases parallelism and sharply increases inter-core traffic, so that the processor shifts from "compute-intensive" to "communication-intensive". Existing communication methods for many-core systems generally consider only specific architectural characteristics. Although an interconnect structure based on a network-on-chip overcomes the poor scalability of bus structures to a certain degree, the multithreaded tasks running on the many-core system are not analyzed. The many-core system is therefore poorly utilized, which leads to low task execution efficiency and long execution times.
It can be seen that prior-art many-core systems built on a network-on-chip interconnect structure have the technical problem of low task execution efficiency and long execution times.
Summary of the invention
Embodiments of the present invention provide a multithread scheduling method and system for many-core systems, to solve the technical problem that prior-art many-core systems built on a network-on-chip interconnect structure suffer from low task execution efficiency and long execution times.
In a first aspect, the invention discloses a multithread scheduling method for many-core systems, the method including:
Obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;
Obtaining the first traffic between every two threads in a preset first multithread set;
Obtaining, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread, and the thread is any thread in the first multithread set;
Sorting all processor cores by the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.
Optionally, obtaining the communication cost between the first processor core and the second processor core in the preset processor core set includes:
Obtaining a first communication cost from the first processor core to the second processor core;
Obtaining a second communication cost from the second processor core to the first processor core;
Taking the sum of the first communication cost and the second communication cost as the communication cost.
Optionally, obtaining the first communication cost from the first processor core to the second processor core includes:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining a third communication cost for each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost total;
Taking the ratio of the third communication cost total to the number of physical paths as the first communication cost.
Optionally, sorting all processor cores by the communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost includes:
Sorting all processor cores by the communication cost to obtain a communication cost set;
Sorting all threads by the second traffic to obtain a second multithread set;
Scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
Optionally, before scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:
If the second multithread set is not empty and the communication cost set is not empty:
Determining whether the processor core with the smallest communication cost in the communication cost set has been allocated;
If the processor core has not been allocated, scheduling the thread onto the processor core.
Optionally, after scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:
Deleting the processor core with the smallest communication cost from the communication cost set;
Deleting the thread with the largest second traffic from the second multithread set.
Based on the same inventive concept, the present invention also provides a multithread scheduling system for many-core systems, the system including:
A first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;
A second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithread set;
A third acquisition module, configured to obtain, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread, and the thread is any thread in the first multithread set;
A scheduling module, configured to sort all processor cores by the communication cost and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
Optionally, the first acquisition module is further configured to:
Obtain a first communication cost from the first processor core to the second processor core;
Obtain a second communication cost from the second processor core to the first processor core;
Take the sum of the first communication cost and the second communication cost as the communication cost.
Optionally, obtaining the first communication cost from the first processor core to the second processor core includes:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining a third communication cost for each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost total;
Taking the ratio of the third communication cost total to the number of physical paths as the first communication cost.
Optionally, the scheduling module is further configured to:
Sort all processor cores by the communication cost to obtain a communication cost set;
Sort all threads by the second traffic to obtain a second multithread set;
Schedule the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
The multithread scheduling method and system for many-core systems provided by the embodiments of the present application first obtain the communication cost between a first processor core and a second processor core in a preset processor core set, and obtain the first traffic between every two threads in a preset first multithread set. The second traffic of a single thread is then obtained from the first traffic. Finally, all processor cores are sorted by the communication cost, and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. The application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads. This can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It solves the technical problem that prior-art many-core systems built on a network-on-chip interconnect structure suffer from low task execution efficiency and long execution times.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and practiced according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the multithread scheduling method for many-core systems in an embodiment of the present invention;
Fig. 2 is a logical structure diagram of the multithread scheduling system for many-core systems in an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention provide a multithread scheduling method and system for many-core systems, to solve the technical problem that prior-art many-core systems built on a network-on-chip interconnect structure suffer from low task execution efficiency and long execution times. The technical effect achieved is a shorter execution time for multithreaded tasks and a higher execution speed.
The general idea of the technical solution in the embodiments of the present application is as follows:
A multithread scheduling method for many-core systems, the method including:
Obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;
Obtaining the first traffic between every two threads in a preset first multithread set;
Obtaining, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread, and the thread is any thread in the first multithread set;
Sorting all processor cores by the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.
In the above method, the communication cost between a first processor core and a second processor core in a preset processor core set is obtained first, together with the first traffic between every two threads in a preset first multithread set. The second traffic of a single thread is then obtained from the first traffic. Finally, all processor cores are sorted by the communication cost, and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. The application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads. This can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It solves the technical problem that prior-art many-core systems built on a network-on-chip interconnect structure suffer from low task execution efficiency and long execution times.
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are evidently some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Embodiment 1
This embodiment provides a multithread scheduling method for many-core systems, the method including:
Step S101: Obtain the communication cost between a first processor core and a second processor core in a preset processor core set;
Step S102: Obtain the first traffic between every two threads in a preset first multithread set;
Step S103: Obtain, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread;
Step S104: Sort all processor cores by the communication cost, and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
In the above method, all processor cores are sorted by the communication cost and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. Because the application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads, it can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It solves the technical problem that prior-art many-core systems built on a network-on-chip interconnect structure suffer from low task execution efficiency and long execution times.
It should be noted that, in the present application, steps S101 and S102 may be performed in either order: step S101 may be performed first, or step S102 may be performed first.
The multithread scheduling method provided by the present application is described in detail below with reference to Fig. 1:
First, step S101 is performed: obtain the communication cost between a first processor core and a second processor core in a preset processor core set.
In the embodiments of the present application, the processor core set includes multiple processor cores, and the specific number is not limited. The first processor core and the second processor core are any two processor cores in the processor core set, so the communication costs between first and second processor cores also form a set.
Next, step S102 is performed: obtain the first traffic between every two threads in a preset first multithread set.
In the embodiments of the present application, the first multithread set includes multiple threads, and the specific number is not limited. What is obtained is the traffic between any two threads.
Then, step S103 is performed: obtain, from the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread.
In the embodiments of the present application, because the first traffic between every two threads has been obtained, the total traffic of each thread can be calculated from the first traffic.
Finally, step S104 is performed: sort all processor cores by the communication cost, and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
Specifically, in the multithread scheduling method provided by the embodiments of the present invention, obtaining the communication cost between the first processor core and the second processor core in the preset processor core set includes:
Obtaining a first communication cost from the first processor core to the second processor core;
Obtaining a second communication cost from the second processor core to the first processor core;
Taking the sum of the first communication cost and the second communication cost as the communication cost.
In the multithread scheduling method provided by the embodiments of the present invention, obtaining the first communication cost from the first processor core to the second processor core includes:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining a third communication cost for each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost total;
Taking the ratio of the third communication cost total to the number of physical paths as the first communication cost.
In the multithread scheduling method provided by the embodiments of the present invention, sorting all processor cores by the communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost includes:
Sorting all processor cores by the communication cost to obtain a communication cost set;
Sorting all threads by the second traffic to obtain a second multithread set;
Scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.
In the multithread scheduling method provided by the embodiments of the present invention, before scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:
If the second multithread set is not empty and the communication cost set is not empty:
Determining whether the processor core with the smallest communication cost in the communication cost set has been allocated;
If the processor core has not been allocated, scheduling the thread onto the processor core.
In the multithread scheduling method provided by the embodiments of the present invention, after scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:
Deleting the processor core with the smallest communication cost from the communication cost set;
Deleting the thread with the largest second traffic from the second multithread set.
To illustrate the implementation of the multithread scheduling method provided by the invention more clearly, a complete example is given below.
In a specific implementation, a many-core system M with m processor cores C0, C1, …, Cm-1 can be expressed as M = {C0, C1, …, Cm-1}. For any two processor cores Ca and Cb, one or more physical paths connecting Ca and Cb can be found. For processor cores Ca and Cb connected by d physical paths, the average communication cost from Ca to Cb is denoted Rab:
$$R_{ab} = \frac{1}{d}\sum_{h=1}^{d} R_{ab}(h) \tag{1}$$
In formula (1), Rab(h) is the communication cost of the h-th physical path from processor core Ca to processor core Cb. The communication costs between all processor cores in the many-core system M, arranged in descending order, form the communication cost set CC of the many-core system M. The set CC can be represented by the two-dimensional table shown in Table 1 below:
Table 1
Core | C0 | C1 | C2 | … | Cm-1 |
C0 | 0 | R0,1 | R0,2 | … | R0,m-1 |
C1 | R1,0 | 0 | R1,2 | … | R1,m-1 |
C2 | R2,0 | R2,1 | 0 | … | R2,m-1 |
… | … | … | … | 0 | … |
Cm-1 | Rm-1,0 | Rm-1,1 | Rm-1,2 | … | 0 |
In Table 1, "0" indicates that no communication cost is incurred over a physical path, i.e., the communication cost from a processor core to itself is 0.
The communication cost between processor cores Ca and Cb is then:
R(CaCb)=Rab+Rba (2)
In formula (2), Rab is the average communication cost from processor core Ca to processor core Cb, and Rba is the average communication cost from processor core Cb to processor core Ca. Ca and Cb are the first processor core and the second processor core described above.
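Formulas (1) and (2) can be illustrated with a minimal Python sketch. The function names `average_path_cost` and `total_pair_cost` and the list-of-path-costs representation are assumptions made for illustration; they do not appear in the patent text.

```python
def average_path_cost(path_costs):
    """Formula (1): mean cost over the d physical paths from one core to another.

    path_costs is a hypothetical list holding R_ab(h) for h = 1..d.
    """
    if not path_costs:
        raise ValueError("the two cores must share at least one physical path")
    return sum(path_costs) / len(path_costs)


def total_pair_cost(costs_a_to_b, costs_b_to_a):
    """Formula (2): R(CaCb) = R_ab + R_ba."""
    return average_path_cost(costs_a_to_b) + average_path_cost(costs_b_to_a)


# Illustrative values only: two paths a->b with costs 2 and 4, one path b->a with cost 1.
print(total_pair_cost([2, 4], [1]))
```

Keeping the two directions separate mirrors the text, which allows Rab and Rba to differ.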
To simplify processing of the communication costs, the total communication costs between processor cores can be arranged in ascending order to construct the communication cost set CCT = {R0, R1, …, Rp-1}, where p is the number of elements in the total communication cost set, one element per pair of distinct processor cores:
$$p = \frac{m(m-1)}{2} \tag{3}$$
In formula (3), m is the number of processor cores. Each element of the communication cost set corresponds to two processor cores, and during scheduling these two processor cores could be assigned arbitrarily. To further optimize scheduling, the present application handles ties as follows: if two communication costs in the communication cost set are equal, the total communication cost whose processor cores have the smaller numbers is sorted first.
For example, for a many-core system M with 8 processor cores, M = {C0, C1, C2, C3, C4, C5, C6, C7}, the communication cost set CC between the processor cores is:
Core | C0 | C1 | C2 | C3 | C4 | C5 | C6 | C7 |
C0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
C1 | 1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
C2 | 2 | 1 | 0 | 1 | 2 | 3 | 4 | 5 |
C3 | 3 | 2 | 1 | 0 | 1 | 2 | 3 | 4 |
C4 | 4 | 3 | 2 | 1 | 0 | 1 | 2 | 3 |
C5 | 5 | 4 | 3 | 2 | 1 | 0 | 1 | 2 |
C6 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | 1 |
C7 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
Communication cost set CCT={ R0,R1,…,R27}
R0=2 | C0,C1 | R7=4 | C0,C2 | R14=6 | C1,C4 | R21=8 | C3,C7 |
R1=2 | C1,C2 | R8=4 | C1,C3 | R15=6 | C2,C5 | R22=10 | C0,C5 |
R2=2 | C2,C3 | R9=4 | C2,C4 | R16=6 | C3,C6 | R23=10 | C1,C6 |
R3=2 | C3,C4 | R10=4 | C3,C5 | R17=6 | C4,C7 | R24=10 | C2,C7 |
R4=2 | C4,C5 | R11=4 | C4,C6 | R18=8 | C0,C4 | R25=12 | C0,C6 |
R5=2 | C5,C6 | R12=4 | C5,C7 | R19=8 | C1,C5 | R26=12 | C1,C7 |
R6=2 | C6,C7 | R13=6 | C0,C3 | R20=8 | C2,C6 | R27=14 | C0,C7 |
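The construction of CCT for this 8-core example can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: the function name `build_cct` and the tuple layout are assumptions, and the tie-break "smaller processor core number first" is interpreted here as lexicographic order on the core pair.

```python
from itertools import combinations


def build_cct(cost):
    """Build the ascending total-cost set CCT from a cost matrix.

    cost[a][b] is the average cost R_ab from core a to core b.
    Returns a list of (R(CaCb), (a, b)) tuples with a < b. Tuples sort by
    total cost first, then by the core pair, which implements the assumed
    tie-break on smaller core numbers.
    """
    m = len(cost)
    return sorted(
        (cost[a][b] + cost[b][a], (a, b))
        for a, b in combinations(range(m), 2)
    )


# Cost matrix of the 8-core example: entry (a, b) equals |a - b|.
cc = [[abs(a - b) for b in range(8)] for a in range(8)]
cct = build_cct(cc)
print(len(cct))   # number of core pairs
print(cct[0])     # first-ranked element R0
print(cct[-1])    # last-ranked element R27
```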
Next, the traffic between the threads is calculated. Specifically, for a first multithread set Δ with n threads T0, T1, T2, …, Tn-1, the set is expressed as Δ = {T0, T1, T2, …, Tn-1}. For any two threads Tl and Tk, TFlk denotes the first traffic from thread Tl to thread Tk. The first traffic between all threads in the first multithread set Δ can then be represented by the two-dimensional table shown in Table 2 below:
Table 2
Thread | T0 | T1 | T2 | … | Tn-1 |
T0 | 0 | TF0,1 | TF0,2 | … | TF0,n-1 |
T1 | TF1,0 | 0 | TF1,2 | … | TF1,n-1 |
T2 | TF2,0 | TF2,1 | 0 | … | TF2,n-1 |
… | … | … | … | 0 | … |
Tn-1 | TFn-1,0 | TFn-1,1 | TFn-1,2 | … | 0 |
In Table 2, "0" indicates that there is no traffic between the two threads. The total traffic of any thread Ti, i.e., the second traffic TF(Ti), is then:
$$TF(T_i) = \sum_{j=0}^{n-1} TF_{ij} + \sum_{j=0}^{n-1} TF_{ji}$$
where the first sum is the traffic from thread Ti to each thread in the first multithread set (from T0 to Tn-1), and the second sum is the traffic from each thread in the first multithread set (from T0 to Tn-1) to thread Ti.
Once the second traffic of each single thread has been calculated, the threads in the multithread set Δ are arranged in descending order of total traffic, giving the multithread set Δ' = {T0', T1', T2', …, Tn-1'}. Preferably, if two threads have the same total traffic, the thread with the smaller number is sorted first.
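The second-traffic calculation and the descending sort can be sketched in Python as follows. The names `second_traffic` and `order_threads` and the 3-thread traffic matrix are hypothetical and chosen only for illustration; the matrix is not data from the patent.

```python
def second_traffic(tf):
    """Second traffic per thread: row sum (outgoing) plus column sum (incoming).

    tf[l][k] is the first traffic from thread l to thread k.
    """
    n = len(tf)
    return [sum(tf[i]) + sum(tf[j][i] for j in range(n)) for i in range(n)]


def order_threads(tf):
    """Thread indices in descending second traffic; ties go to the smaller index."""
    totals = second_traffic(tf)
    return sorted(range(len(tf)), key=lambda i: (-totals[i], i))


# Hypothetical first-traffic matrix for three threads.
tf = [
    [0, 1, 0],
    [2, 0, 6],
    [0, 4, 0],
]
print(second_traffic(tf))  # [3, 13, 10]
print(order_threads(tf))   # [1, 2, 0]
```

Sorting on the key `(-total, index)` yields the descending order and the tie-break in a single pass.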
For the multithread set Δ = {T0, T1, T2, T3, T4, T5, T6, T7}, the traffic between the threads is shown in the following table:
From the table above, the multithread set Δ' = {T0', T1', T2', T3', T4', T5', T6', T7'} is obtained, where:
T0' is T0 in the multithread set Δ;
T1' is T7 in the multithread set Δ;
T2' is T1 in the multithread set Δ;
T3' is T6 in the multithread set Δ;
T4' is T2 in the multithread set Δ;
T5' is T5 in the multithread set Δ;
T6' is T3 in the multithread set Δ;
T7' is T4 in the multithread set Δ.
Next, a preferred thread scheduling method provided by the embodiments of the present invention is described. The specific steps are as follows:
First, determine whether the communication cost set CCT and the second multithread set are empty. If either the communication cost set CCT or the second multithread set is empty, thread scheduling ends. If neither is empty, select the first-ranked element of the communication cost set CCT, denoted Rw. Rw is the total communication cost between two processor cores, denoted Cu and Cv. Remove Rw from the communication cost set CCT, and remove processor cores Cu and Cv from the many-core system M.
To further optimize scheduling, before a thread is scheduled it is also determined whether the processor cores Cu and Cv corresponding to the first-ranked element of the communication cost set CCT have been allocated, and how many threads remain in the multithread set Δ'.
In a specific implementation, the steps are as follows:
Step 1: If the total communication cost set CCT is empty, there is no allocatable processor core and thread scheduling ends. If the total communication cost set CCT is not empty, go to step 2.
Step 2: Select the member ranked 1 from the total communication cost set CCT and denote it Rw. Rw is the total communication cost between two processor cores, which are denoted Cu and Cv. Remove Rw from the total communication cost set CCT. Remove processor cores Cu and Cv from the many-core system M.
Step 3: If neither processor core Cu nor Cv has been allocated and the number of threads remaining in the multithreading set Δ' is greater than or equal to 2, go to step 4. If exactly one of the processor cores Cu and Cv has been allocated and the multithreading set Δ' is not empty, go to step 5. If neither processor core Cu nor Cv has been allocated and the number of threads remaining in the multithreading set Δ' is equal to 1, go to step 5. If both processor cores Cu and Cv have been allocated, return to step 1.
Step 4: Select the two threads ranked 1 and 2, Tx and Ty, from the multithreading set Δ' and assign them to processor cores Cu and Cv. Remove threads Tx and Ty from the multithreading set Δ'. If the multithreading set Δ' is not empty, return to step 1. If the multithreading set Δ' is empty, wait for new threads to arrive; thread scheduling ends.
Step 5: Select the thread ranked 1, Tx, from the multithreading set Δ' and assign it to the unallocated processor core: if processor core Cu has been allocated, thread Tx is assigned to Cv; if processor core Cv has been allocated, thread Tx is assigned to Cu. Remove thread Tx from the multithreading set Δ'. If the multithreading set Δ' is not empty, return to step 1. If the multithreading set Δ' is empty, wait for new threads to arrive; thread scheduling ends.
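Steps 1 to 5 can be sketched in Python roughly as follows. This is a minimal illustration, not the patented implementation. The function name `schedule`, the tuple layout `(cost, core_u, core_v)`, and the use of a dictionary to record which cores are allocated are all assumptions made for the example. The text does not say which core receives the last thread when both Cu and Cv are free and only one thread remains; the sketch assumes Cu.

```python
def schedule(cct, threads):
    """Pairwise greedy scheduling following steps 1-5.

    cct     -- list of (cost, core_u, core_v), assumed already sorted so that
               the member "ranked 1" comes first
    threads -- list of thread ids, assumed already sorted so that the thread
               "ranked 1" comes first
    Returns a dict mapping core -> thread.
    """
    cct = list(cct)              # work on copies; inputs stay untouched
    remaining = list(threads)
    assignment = {}              # a core is "allocated" once it is a key here

    # Step 1: stop when no core pair is left (or no thread is left to place).
    while cct and remaining:
        # Step 2: take the member ranked 1 and remove it from CCT.
        _, cu, cv = cct.pop(0)

        # Step 3: check which of the two cores are still unallocated.
        free = [c for c in (cu, cv) if c not in assignment]

        if len(free) == 2 and len(remaining) >= 2:
            # Step 4: place the two top-ranked threads on Cu and Cv.
            assignment[cu] = remaining.pop(0)
            assignment[cv] = remaining.pop(0)
        elif free:
            # Step 5: place one thread on an unallocated core.
            # Assumption: if both cores are free, Cu is chosen.
            assignment[free[0]] = remaining.pop(0)
        # Otherwise both cores are already allocated: back to step 1.

    return assignment


# Hypothetical input: four cores and four threads.
demo_cct = [(1, "C0", "C1"), (2, "C1", "C2"), (3, "C0", "C2"), (4, "C2", "C3")]
demo_result = schedule(demo_cct, ["T0", "T1", "T2", "T3"])
```

In the hypothetical input, the pair (C0, C1) receives the two top-ranked threads, and each later pair contributes at most one still-free core.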
The process of scheduling the threads in the multithreading set Δ' = {T0', T1', T2', T3', T4', T5', T6', T7'} onto the many-core system M = {C0, C1, C2, C3, C4, C5, C6, C7}, where the communication cost set is CCT = {R0, R1, …, R27}, is as follows:
1) The communication cost set CCT is not empty; go to step 2).
2) The member ranked 1 in the communication cost set CCT is R0, and the two corresponding processor cores are C0 and C1. Remove R0 from the communication cost set CCT. Remove processor cores C0 and C1 from the many-core system M.
3) Neither processor core C0 nor C1 has been allocated, and 8 threads remain in the multithreading set Δ'; go to step 4).
4) Select the two threads ranked 1 and 2, T0' and T1', from the multithreading set Δ' and assign them to processor cores C0 and C1. Remove threads T0' and T1' from the multithreading set Δ'. Return to step 1).
5) Repeat the above steps until all threads have been assigned. The allocation result is shown in the following table:
In the table above, thread T0' in the multithreading set Δ' is scheduled onto processor core C0 in the many-core system M, that is, thread T0 in the multithreading set Δ is scheduled onto processor core C0. Thread T1' in the multithreading set Δ' is scheduled onto processor core C1 in the many-core system M, that is, thread T7 in the multithreading set Δ is scheduled onto processor core C1. By analogy, all threads in the multithreading set Δ are scheduled onto processor cores.
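The size of CCT in this example (members R0 through R27, that is, 28 members) is consistent with one total communication cost per unordered pair of the 8 processor cores. The short check below illustrates that count; the core names are taken from the example, while the assumption that CCT holds exactly one member per unordered pair is an inference from the text.

```python
from itertools import combinations

# The eight processor cores of the example many-core system M.
cores = [f"C{i}" for i in range(8)]

# One entry per unordered pair (Cu, Cv), assuming CCT holds one member per pair.
pairs = list(combinations(cores, 2))

# 8 * 7 / 2 = 28 pairs, matching the members R0 .. R27.
pair_count = len(pairs)
```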
Based on the same inventive concept as embodiment one, embodiment two of the present invention further provides a multithread scheduling system for a many-core system. The system includes:
A first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set;
A second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithreading set;
A third acquisition module, configured to obtain the second traffic of a single thread according to the first traffic, where the second traffic is the sum of the traffic from the thread to each thread in the first multithreading set and the traffic from each thread in the first multithreading set to the thread;
A scheduling module, configured to sort all processor cores according to the communication cost and to schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
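The second traffic computed by the third acquisition module can be sketched as follows. This is an illustration under assumptions: the first traffic is represented as a square matrix in which `first_traffic[i][j]` is the traffic from thread i to thread j, the diagonal is assumed to be zero, and the function name `second_traffic` and the sample values are hypothetical.

```python
def second_traffic(first_traffic):
    """Return, for each thread i, outgoing traffic plus incoming traffic.

    first_traffic[i][j] is assumed to be the traffic from thread i to thread j.
    """
    n = len(first_traffic)
    totals = []
    for i in range(n):
        outgoing = sum(first_traffic[i])                       # thread i -> every thread
        incoming = sum(first_traffic[j][i] for j in range(n))  # every thread -> thread i
        totals.append(outgoing + incoming)
    return totals


# Hypothetical first-traffic matrix for three threads.
demo_matrix = [
    [0, 3, 1],
    [2, 0, 0],
    [4, 0, 0],
]
demo_totals = second_traffic(demo_matrix)
```

With this matrix, thread 0 both sends and receives the most, so it has the largest second traffic and would be ranked first for scheduling.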
Optionally, the first acquisition module is further configured to:
Obtain a first communication cost from the first processor core to the second processor core;
Obtain a second communication cost from the second processor core to the first processor core;
Use the sum of the first communication cost and the second communication cost as the communication cost.
Optionally, obtaining the first communication cost from the first processor core to the second processor core includes:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining a third communication cost for each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost sum;
Using the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
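The two cost definitions above amount to an average over physical paths in each direction, followed by a sum of the two directions. A minimal sketch, assuming the per-path costs are already available as lists of numbers; the function names and the sample values are hypothetical.

```python
def directed_cost(path_costs):
    """Average per-path cost in one direction (the 'first' or 'second' communication cost).

    path_costs holds one 'third communication cost' per physical path.
    """
    if not path_costs:
        raise ValueError("at least one physical path is required")
    return sum(path_costs) / len(path_costs)


def total_cost(paths_u_to_v, paths_v_to_u):
    """Total communication cost between two cores: the sum of both directions."""
    return directed_cost(paths_u_to_v) + directed_cost(paths_v_to_u)


# Hypothetical path costs: three paths from Cu to Cv, two paths from Cv to Cu.
demo_forward = directed_cost([2, 4, 6])
demo_total = total_cost([2, 4, 6], [3, 5])
```

The empty-list check is a design choice for the sketch; the source text does not describe the case of two cores with no physical path between them.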
Optionally, the scheduling module is further configured to:
Sort all processor cores according to the communication cost to obtain a communication cost set;
Sort all threads according to the second traffic to obtain a second multithreading set;
Schedule the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set.
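The two sorting operations of the scheduling module can be sketched as follows. The sketch assumes that the communication cost set is ordered by ascending cost and the second multithreading set by descending second traffic, which matches the goal of pairing the largest traffic with the smallest cost; the function name, the dictionary inputs, and the sample values are hypothetical.

```python
def build_sorted_sets(pair_costs, thread_traffic):
    """Build the communication cost set and the second multithreading set.

    pair_costs     -- dict mapping (core_u, core_v) -> total communication cost
    thread_traffic -- dict mapping thread id -> second traffic
    """
    # Ascending cost: the member "ranked 1" is the cheapest core pair.
    cct = sorted((cost, u, v) for (u, v), cost in pair_costs.items())
    # Descending traffic: the thread "ranked 1" communicates the most.
    ordered_threads = sorted(thread_traffic, key=thread_traffic.get, reverse=True)
    return cct, ordered_threads


# Hypothetical costs for three cores and traffic for three threads.
demo_cct, demo_threads = build_sorted_sets(
    {("C0", "C1"): 4.0, ("C0", "C2"): 2.0, ("C1", "C2"): 6.0},
    {"T0": 5, "T1": 10, "T2": 7},
)
```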
The system described in embodiment two of the present invention is the system used to implement the thread scheduling method of embodiment one. Based on the method described in embodiment one, a person skilled in the art can understand the specific structure and variations of the system, so the details are not repeated here. Any system used to implement the method of embodiment one falls within the scope of protection of the present invention.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
The multithread scheduling method and system for a many-core system provided by the embodiments of the present application first obtain the communication cost between a first processor core and a second processor core in a preset processor core set, and obtain the first traffic between every two threads in a preset first multithreading set. They then obtain the second traffic of each single thread according to the first traffic. Finally, they sort all processor cores according to the communication cost and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost. The application analyzes and optimizes scheduling from the perspective of the multithreaded task execution flow, taking into account both the communication cost between processor cores and the traffic between threads. This can improve the utilization efficiency of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It addresses the technical problem in the prior art that many-core systems implemented with a network-on-chip interconnect structure have low task execution efficiency and long execution times.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (10)
1. A multithread scheduling method for a many-core system, characterized in that the method comprises:
Obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, wherein the first processor core and the second processor core are any two processor cores in the processor core set;
Obtaining the first traffic between every two threads in a preset first multithreading set;
Obtaining the second traffic of a single thread according to the first traffic, wherein the second traffic is the sum of the traffic from the thread to each thread in the first multithreading set and the traffic from each thread in the first multithreading set to the thread, and the thread is any thread in the first multithreading set;
Sorting all processor cores according to the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.
2. The method according to claim 1, characterized in that obtaining the communication cost between the first processor core and the second processor core in the preset processor core set comprises:
Obtaining a first communication cost from the first processor core to the second processor core;
Obtaining a second communication cost from the second processor core to the first processor core;
Using the sum of the first communication cost and the second communication cost as the communication cost.
3. The method according to claim 2, characterized in that obtaining the first communication cost from the first processor core to the second processor core comprises:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining a third communication cost for each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost sum;
Using the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
4. The method according to claim 1, characterized in that sorting all processor cores according to the communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost comprises:
Sorting all processor cores according to the communication cost to obtain a communication cost set;
Sorting all threads according to the second traffic to obtain a second multithreading set;
Scheduling the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set.
5. The method according to claim 4, characterized in that, before scheduling the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
If the second multithreading set is not empty and the communication cost set is not empty,
Determining whether the processor core with the smallest communication cost in the communication cost set has been allocated;
If the processor core has not been allocated, scheduling the thread onto the processor core.
6. The method according to claim 4, characterized in that, after scheduling the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:
Deleting the processor core with the smallest communication cost from the communication cost set;
Deleting the thread with the largest second traffic from the second multithreading set.
7. A multithread scheduling system for a many-core system, characterized in that the system comprises:
A first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set, wherein the first processor core and the second processor core are any two processor cores in the processor core set;
A second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithreading set;
A third acquisition module, configured to obtain the second traffic of a single thread according to the first traffic, wherein the second traffic is the sum of the traffic from the thread to each thread in the first multithreading set and the traffic from each thread in the first multithreading set to the thread, and the thread is any thread in the first multithreading set;
A scheduling module, configured to sort all processor cores according to the communication cost and to schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.
8. The system according to claim 7, characterized in that the first acquisition module is further configured to:
Obtain a first communication cost from the first processor core to the second processor core;
Obtain a second communication cost from the second processor core to the first processor core;
Use the sum of the first communication cost and the second communication cost as the communication cost.
9. The system according to claim 8, characterized in that obtaining the first communication cost from the first processor core to the second processor core comprises:
Obtaining the number of physical paths from the first processor core to the second processor core;
Obtaining a third communication cost for each physical path from the first processor core to the second processor core;
Summing the third communication costs to obtain a third communication cost sum;
Using the ratio of the third communication cost sum to the number of physical paths as the first communication cost.
10. The system according to claim 7, characterized in that the scheduling module is further configured to:
Sort all processor cores according to the communication cost to obtain a communication cost set;
Sort all threads according to the second traffic to obtain a second multithreading set;
Schedule the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710132627.6A CN106933663B (en) | 2017-03-07 | 2017-03-07 | A kind of multithread scheduling method and system towards many-core system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710132627.6A CN106933663B (en) | 2017-03-07 | 2017-03-07 | A kind of multithread scheduling method and system towards many-core system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106933663A true CN106933663A (en) | 2017-07-07 |
CN106933663B CN106933663B (en) | 2019-07-23 |
Family
ID=59424539
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710132627.6A Expired - Fee Related CN106933663B (en) | 2017-03-07 | 2017-03-07 | A kind of multithread scheduling method and system towards many-core system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106933663B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109709806A (en) * | 2018-12-27 | 2019-05-03 | 杭州铭展网络科技有限公司 | A kind of self-adapting data acquisition system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102193779A (en) * | 2011-05-16 | 2011-09-21 | 武汉科技大学 | MPSoC (multi-processor system-on-chip)-oriented multithread scheduling method |
CN103838631A (en) * | 2014-03-11 | 2014-06-04 | 武汉科技大学 | Multi-thread scheduling realization method oriented to network on chip |
US20160062798A1 (en) * | 2014-09-01 | 2016-03-03 | Samsung Electronics Co., Ltd. | System-on-chip including multi-core processor and thread scheduling method thereof |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109709806A (en) * | 2018-12-27 | 2019-05-03 | 杭州铭展网络科技有限公司 | A kind of self-adapting data acquisition system |
CN109709806B (en) * | 2018-12-27 | 2022-07-19 | 杭州铭展网络科技有限公司 | Self-adaptive data acquisition system |
Also Published As
Publication number | Publication date |
---|---|
CN106933663B (en) | 2019-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yuan et al. | Complexity effective memory access scheduling for many-core accelerator architectures | |
CN101923491A (en) | Thread group address space scheduling and thread switching method under multi-core environment | |
CN103927231B (en) | The energy optimization data set distribution method that a kind of data-oriented processes | |
Lee et al. | Design space exploration of on-chip ring interconnection for a CPU–GPU heterogeneous architecture | |
CN104331331A (en) | Resource distribution method for reconfigurable chip multiprocessor with task number and performance sensing functions | |
Maqsood et al. | Congestion-aware core mapping for network-on-chip based systems using betweenness centrality | |
CN113886034A (en) | Task scheduling method, system, electronic device and storage medium | |
Goodarzi et al. | Task migration in mesh NoCs over virtual point-to-point connections | |
CN106933663A (en) | A kind of multithread scheduling method and system towards many-core system | |
Slijepcevic et al. | pTNoC: Probabilistically time-analyzable tree-based noc for mixed-criticality systems | |
Xu et al. | Hybrid scheduling deadline-constrained multi-DAGs based on reverse HEFT | |
Daoud et al. | Processor allocation algorithm based on frame combing with memorization for 2d mesh cmps | |
Kohútka et al. | A novel hardware-accelerated real-time task scheduler based on robust earliest deadline algorithm | |
CN103631659B (en) | Schedule optimization method for communication energy consumption in on-chip network | |
CN106445661A (en) | Dynamic optimization method and system | |
Sudheer et al. | Dynamic load balancing for petascale quantum Monte Carlo applications: The Alias method | |
Yazdanpanah et al. | A comprehensive view of MapReduce aware scheduling algorithms in cloud environments | |
Gautam et al. | Improving system performance in homogeneous multicore systems | |
CN117472448B (en) | Parallel acceleration method, device and medium for secondary core cluster of Shenwei many-core processor | |
Frid et al. | Memory-aware multiobjective design space exploration of heteregeneous MPSoC | |
CN112905351B (en) | GPU and CPU load scheduling method, device, equipment and medium | |
Sano et al. | Pattern-based systematic task mapping for many-core processors | |
Guo et al. | Machine Learning Assisted Optical Network Resource Scheduling in Data Center Networks | |
Senthilkumar et al. | Energy Efficient Dynamic Slot Allocation of Map Reduce Tasks for Big Data Applications | |
Wang | DupM: a Data Replica Allocation Strategy for Distributed Mining |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190723 Termination date: 20200307 |