CN106933663A

CN106933663A - Multithread scheduling method and system for many-core systems

Info

Publication number: CN106933663A
Application number: CN201710132627.6A
Authority: CN
Inventors: 沈欢; 胡威; 唐玉馨; 刘小明; 戴文丽; 马梦东; 张凯; 刘俊; 吕晴阳; 刘丹
Original assignee: Wuhan University of Science and Engineering WUSE
Current assignee: Wuhan University of Science and Engineering WUSE; Wuhan University of Science and Technology WHUST
Priority date: 2017-03-07
Filing date: 2017-03-07
Publication date: 2017-07-07
Anticipated expiration: 2037-03-07
Also published as: CN106933663B

Abstract

The invention discloses a multithread scheduling method for many-core systems. The method includes: obtaining the communication cost between a first processor core and a second processor core in a preset processor core set; obtaining the first traffic between every two threads in a preset first multithread set; obtaining, according to the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread; and sorting all processor cores according to the communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost. The multithread scheduling method and system provided by the invention solve the technical problem that prior-art many-core systems built on a network-on-chip interconnect have low task execution efficiency and long execution times.

Description

Multithread scheduling method and system for many-core systems

Technical field

The present invention relates to the field of computer technology, and in particular to a multithread scheduling method and system for many-core systems.

Background

With the development of computer technology, multi-core processors have also advanced considerably. Early symmetric multiprocessors (SMP) mostly integrated a group of CPUs in one computer, with the CPUs sharing the memory subsystem and the bus structure. Later, with the introduction of nanometer-scale fabrication, SMP evolved into the chip multiprocessor (CMP), in which multiple processing cores are integrated on one chip, forming what is now called the multi-core processor. The cores directly share caches and the bus structure, which reduces wire delay and improves communication efficiency. As the number of processor cores in a multi-core system keeps growing, many-core systems emerge; a many-core system contains a larger number of processor cores.

Current efficient on-chip communication mechanisms generally include cache structures based on a shared bus and interconnect structures based on a network-on-chip. In a shared-bus cache structure, each processing core has access to a shared level-2 or level-3 cache that holds frequently used data, and the cores communicate over the bus. The advantages of this design are a simple structure and fast communication; the disadvantage is poor scalability. A shared bus clearly cannot meet the needs of large-scale systems. Applying interconnection networks to system-on-chip design, to solve communication between on-chip components, yields the network-on-chip. Because it supports simultaneous access and offers high reliability and high reusability, network-on-chip (Network On Chip, NoC) technology is regarded as a better interconnect for large-scale CMPs. The network-on-chip overcomes the poor scalability of bus structures and provides a feasible on-chip communication mechanism for the billion-transistor era.

In implementing the technical solution, the inventors of the present application found that the prior art has at least the following problems:

In current many-core systems, the large number of processor cores greatly increases parallelism and sharply increases inter-core traffic, so that the processor shifts from being compute-intensive to being communication-intensive. Communication methods in existing many-core systems generally consider only specific architectural characteristics. Although NoC-based interconnects overcome the poor scalability of bus structures to some extent, they do not analyze the multithreaded tasks running on the many-core system. As a result, the many-core system is used inefficiently, which leads to low task execution efficiency and long execution times.

It can be seen that prior-art many-core systems built on a network-on-chip interconnect suffer from the technical problem of low task execution efficiency and long execution times.

Summary of the invention

Embodiments of the present invention provide a multithread scheduling method and system for many-core systems, to solve the technical problem that prior-art many-core systems built on a network-on-chip interconnect have low task execution efficiency and long execution times.

In a first aspect, the invention discloses a multithread scheduling method for many-core systems, the method including:

obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;

obtaining the first traffic between every two threads in a preset first multithread set;

obtaining, according to the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread, and the thread is any thread in the first multithread set;

sorting all processor cores according to the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.

Optionally, obtaining the communication cost between the first processor core and the second processor core in the preset processor core set includes:

obtaining a first communication cost from the first processor core to the second processor core;

obtaining a second communication cost from the second processor core to the first processor core;

taking the sum of the first communication cost and the second communication cost as the communication cost.

Optionally, obtaining the first communication cost from the first processor core to the second processor core includes:

obtaining the number of physical paths from the first processor core to the second processor core;

obtaining a third communication cost for each physical path from the first processor core to the second processor core;

summing the third communication costs to obtain a third communication cost sum;

taking the ratio of the third communication cost sum to the number of physical paths as the first communication cost.

Optionally, sorting all processor cores according to the communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost includes:

sorting all processor cores according to the communication cost to obtain a communication cost set;

sorting all threads according to the second traffic to obtain a second multithread set;

scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.

Optionally, before scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:

if the second multithread set is not empty and the communication cost set is not empty,

determining whether the processor core with the smallest communication cost in the communication cost set has been allocated;

if the processor core has not been allocated, scheduling the thread onto the processor core.

Optionally, after scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:

deleting the processor core with the smallest communication cost from the communication cost set;

deleting the thread with the largest second traffic from the second multithread set.

Based on the same inventive concept, the present invention also provides a multithread scheduling system for many-core systems, the system including:

a first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set, where the first processor core and the second processor core are any two processor cores in the processor core set;

a second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithread set;

a third acquisition module, configured to obtain, according to the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread, and the thread is any thread in the first multithread set;

a scheduling module, configured to sort all processor cores according to the communication cost and schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.

Optionally, the first acquisition module is further configured to:

Optionally, the scheduling module is further configured to:

schedule the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set.

One or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:

The multithread scheduling method and system for many-core systems provided by the embodiments of the present application first obtain the communication cost between a first processor core and a second processor core in a preset processor core set, and obtain the first traffic between every two threads in a preset first multithread set. The second traffic of each single thread is then obtained from the first traffic. Finally, all processor cores are sorted according to the communication cost, and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. The application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads. This can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It solves the technical problem that prior-art many-core systems built on a network-on-chip interconnect have low task execution efficiency and long execution times.

The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of the multithread scheduling method for many-core systems in an embodiment of the present invention;

Fig. 2 is a logical structure diagram of the multithread scheduling system for many-core systems in an embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the present invention provide a multithread scheduling method and system for many-core systems, to solve the technical problem that prior-art many-core systems built on a network-on-chip interconnect have low task execution efficiency and long execution times. They achieve the technical effect of shortening the execution time of multithreaded tasks and increasing execution speed.

The general idea of the technical solution in the embodiments of the present application is as follows:

A multithread scheduling method for many-core systems, the method including:

In the above method, the communication cost between a first processor core and a second processor core in a preset processor core set is obtained first, along with the first traffic between every two threads in a preset first multithread set. The second traffic of each single thread is then obtained from the first traffic. Finally, all processor cores are sorted according to the communication cost, and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. The application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads. This can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It solves the technical problem that prior-art many-core systems built on a network-on-chip interconnect have low task execution efficiency and long execution times.

To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.

Embodiment one

This embodiment provides a multithread scheduling method for many-core systems, the method including:

Step S101: obtaining the communication cost between a first processor core and a second processor core in a preset processor core set;

Step S102: obtaining the first traffic between every two threads in a preset first multithread set;

Step S103: obtaining, according to the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread;

Step S104: sorting all processor cores according to the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.

In the above method, all processor cores are sorted according to the communication cost, and the thread with the largest second traffic is scheduled onto the processor core with the smallest communication cost. Because the application analyzes and optimizes from the perspective of the multithreaded task execution flow, considering both the communication cost between processor cores and the traffic between threads, it can improve the utilization of the many-core system, thereby shortening the execution time of multithreaded tasks and increasing execution speed. It solves the technical problem that prior-art many-core systems built on a network-on-chip interconnect have low task execution efficiency and long execution times.

It should be noted that, in this application, step S101 and step S102 may be performed in either order: step S101 may be performed first, or step S102 may be performed first.

The multithread scheduling method provided by the application is described in detail below with reference to Fig. 1:

First, step S101 is performed: obtaining the communication cost between a first processor core and a second processor core in a preset processor core set.

In the embodiments of the present application, the processor core set includes multiple processor cores, the specific number of which is not limited. The first processor core and the second processor core are any two processor cores in the set, so the communication cost between the first processor core and the second processor core is itself a set of values.

Next, step S102 is performed: obtaining the first traffic between every two threads in a preset first multithread set.

In the embodiments of the present application, the first multithread set includes multiple threads, the specific number of which is not limited. What is obtained is the traffic between any two threads.

Then, step S103 is performed: obtaining, according to the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread.

In the embodiments of the present application, once the first traffic between every two threads has been obtained, the total traffic of each thread can be calculated from the first traffic.

Finally, step S104 is performed: sorting all processor cores according to the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.

Specifically, in the multithread scheduling method provided by the embodiments of the present invention, obtaining the communication cost between the first processor core and the second processor core in the preset processor core set includes:

In the multithread scheduling method provided by the embodiments of the present invention, obtaining the first communication cost from the first processor core to the second processor core includes:

In the multithread scheduling method provided by the embodiments of the present invention, sorting all processor cores according to the communication cost and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost includes:

In the multithread scheduling method provided by the embodiments of the present invention, before scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:

In the multithread scheduling method provided by the embodiments of the present invention, after scheduling the thread with the largest second traffic in the second multithread set onto the processor core with the smallest communication cost in the communication cost set, the method further includes:

deleting the thread with the largest second traffic from the second multithread set.

To explain the implementation of the multithread scheduling method provided by the invention more clearly, a complete logical example is described below.

In a concrete implementation, a many-core system M with m processor cores C₀,C₁,…,C_{m-1} can be expressed as M={C₀,C₁,…,C_{m-1}}. For any two processor cores C_a and C_b, one or more physical paths connecting C_a and C_b can be found. For processor cores C_a and C_b connected by d physical paths, the average communication cost between C_a and C_b is denoted R_ab:

$$R_{ab} = \frac{1}{d}\sum_{h=1}^{d} R_{ab}(h) \qquad (1)$$

In formula (1), R_ab(h) denotes the communication cost of the h-th physical path between processor cores C_a and C_b. The communication costs between all processor cores in the many-core system M, arranged in descending order, form the communication cost set CC of the many-core system M, which can be represented by a two-dimensional table as shown in Table 1 below:

Table 1

	C₀	C₁	C₂	…	C_{m-1}
C₀	0	R₀₁	R₀₂	…	R_{0 m-1}
C₁	R₁₀	0	R₁₂	…	R_{1 m-1}
C₂	R₂₀	R₂₁	0	…	R_{2 m-1}
…	…	…	…	0	…
C_{m-1}	R_{m-1 0}	R_{m-1 1}	R_{m-1 2}	…	0

In Table 1, "0" indicates that a processor core incurs no communication cost over a physical path to itself, i.e. its communication cost to itself is 0.

The communication cost between processor cores C_a and C_b is then:

$$R(C_a C_b) = R_{ab} + R_{ba} \qquad (2)$$

In formula (2), R_ab is the average communication cost from processor core C_a to processor core C_b, and R_ba is the average communication cost from processor core C_b to processor core C_a. C_a and C_b correspond to the first processor core and the second processor core described above.
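As a minimal sketch of formulas (1) and (2), the directed cost can be computed as the mean of the per-path costs, and the pair cost as the sum of the two directions. The function names and the example path costs are hypothetical and are not taken from the patent.

```python
from statistics import fmean


def average_path_cost(path_costs):
    """Formula (1): mean cost over the d physical paths in one direction."""
    if not path_costs:
        raise ValueError("at least one physical path is required")
    return fmean(path_costs)


def pair_cost(paths_ab, paths_ba):
    """Formula (2): R(C_a C_b) = R_ab + R_ba."""
    return average_path_cost(paths_ab) + average_path_cost(paths_ba)


# Hypothetical example: two paths from C_a to C_b (costs 2 and 4), one path back (cost 3).
print(pair_cost([2.0, 4.0], [3.0]))  # 6.0
```

Because the two directions are averaged separately, the sketch allows the number of paths and their costs to differ between C_a to C_b and C_b to C_a.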

To make the communication costs between processor cores easier to process, the total communication costs between processor cores can be arranged in ascending order to construct the communication cost set CCT={R₀,R₁,…,R_p}, where p is the number of elements in the total communication cost set and is calculated as:
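Formula (3) itself is not reproduced in this text. One reconstruction, assuming that CCT holds exactly one element per unordered pair of distinct processor cores (which agrees with the 28 elements R₀ to R₂₇ listed for the 8-core example), would be:

```latex
p = \binom{m}{2} = \frac{m(m-1)}{2} \qquad (3)
```

Under that assumption, m = 8 gives p = 28, and the elements are indexed from R₀ to R₂₇.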

In formula (3), m is the number of processor cores. Because each element of the communication cost set corresponds to two processor cores, the two cores could be allocated in arbitrary order during scheduling. To further optimize the scheduling method, the application handles ties as follows: if two communication costs in the communication cost set are equal, the total communication cost whose processor cores have the smaller sequence numbers is sorted first.
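The construction of CCT can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `build_cct` is a hypothetical helper, the directed cost table is assumed to be a square list of lists, and ties are broken by comparing the core indices of each pair in order.

```python
from itertools import combinations


def build_cct(cc):
    """Return (total_cost, a, b) for every pair a < b, in ascending order.

    cc[a][b] is the average communication cost from core a to core b.
    Tuples compare by cost first, then by core indices, so equal costs
    are ordered by the smaller core sequence numbers.
    """
    m = len(cc)
    entries = [(cc[a][b] + cc[b][a], a, b) for a, b in combinations(range(m), 2)]
    entries.sort()
    return entries


# Assumed 8-core cost table in which the directed cost equals |a - b|.
cc = [[abs(a - b) for b in range(8)] for a in range(8)]
cct = build_cct(cc)
print(len(cct))   # 28
print(cct[0])     # (2, 0, 1)
print(cct[-1])    # (14, 0, 7)
```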

For example, for a many-core system M with 8 processor cores, M={C₀,C₁,C₂,C₃,C₄,C₅,C₆,C₇}, the communication cost set CC between the processor cores is:

	C₀	C₁	C₂	C₃	C₄	C₅	C₆	C₇
C₀	0	1	2	3	4	5	6	7
C₁	1	0	1	2	3	4	5	6
C₂	2	1	0	1	2	3	4	5
C₃	3	2	1	0	1	2	3	4
C₄	4	3	2	1	0	1	2	3
C₅	5	4	3	2	1	0	1	2
C₆	6	5	4	3	2	1	0	1
C₇	7	6	5	4	3	2	1	0

Communication cost set CCT={R₀,R₁,…,R₂₇}:

R₀=2	C₀,C₁	R₇=4	C₀,C₂	R₁₄=6	C₁,C₄	R₂₁=8	C₃,C₇
R₁=2	C₁,C₂	R₈=4	C₁,C₃	R₁₅=6	C₂,C₅	R₂₂=10	C₀,C₅
R₂=2	C₂,C₃	R₉=4	C₂,C₄	R₁₆=6	C₃,C₆	R₂₃=10	C₁,C₆
R₃=2	C₃,C₄	R₁₀=4	C₃,C₅	R₁₇=6	C₄,C₇	R₂₄=10	C₂,C₇
R₄=2	C₄,C₅	R₁₁=4	C₄,C₆	R₁₈=8	C₀,C₄	R₂₅=12	C₀,C₆
R₅=2	C₅,C₆	R₁₂=4	C₅,C₇	R₁₉=8	C₁,C₅	R₂₆=12	C₁,C₇
R₆=2	C₆,C₇	R₁₃=6	C₀,C₃	R₂₀=8	C₂,C₆	R₂₇=14	C₀,C₇

Next, the traffic between threads is calculated. Specifically, a first multithread set Δ with n threads T₀,T₁,T₂,…,T_{n-1} is expressed as Δ={T₀,T₁,T₂,…,T_{n-1}}. For any two threads T_l and T_k, TF_lk denotes the first traffic from thread T_l to thread T_k. The first traffic between all threads in the first multithread set Δ can then be represented by the two-dimensional table shown in Table 2 below:

Table 2

	T₀	T₁	T₂	…	T_{n-1}
T₀	0	TF₀₁	TF₀₂	…	TF_{0 n-1}
T₁	TF₁₀	0	TF₁₂	…	TF_{1 n-1}
T₂	TF₂₀	TF₂₁	0	…	TF_{2 n-1}
…	…	…	…	0	…
T_{n-1}	TF_{n-1 0}	TF_{n-1 1}	TF_{n-1 2}	…	0

In Table 2, "0" indicates that there is no traffic between the two threads. The total traffic of any thread T_i, i.e. the second traffic TF(T_i), is then:

$$TF(T_i) = \sum_{j=0}^{n-1} TF_{ij} + \sum_{j=0}^{n-1} TF_{ji}$$

where $\sum_{j=0}^{n-1} TF_{ij}$ is the traffic from thread T_i to each thread in the first multithread set (from T₀ to T_{n-1}), and $\sum_{j=0}^{n-1} TF_{ji}$ is the sum of the traffic from each thread in the first multithread set (from T₀ to T_{n-1}) to thread T_i.
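The second traffic amounts to a row sum plus a column sum over the first-traffic table. A minimal sketch, with a hypothetical function name and an invented 3-thread traffic table used only for illustration:

```python
def total_traffic(tf):
    """Second traffic per thread: outgoing (row) plus incoming (column) traffic.

    tf[i][j] is the first traffic from thread i to thread j.
    """
    n = len(tf)
    return [sum(tf[i]) + sum(tf[j][i] for j in range(n)) for i in range(n)]


# Invented first-traffic table for three threads (diagonal is 0).
tf = [
    [0, 5, 1],
    [2, 0, 0],
    [4, 3, 0],
]
print(total_traffic(tf))  # [12, 10, 8]
```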

Once the second traffic of each single thread has been calculated, the threads in the multithread set Δ are arranged in descending order of total traffic, giving the multithread set Δ'={T₀',T₁',T₂',…,T_{n-1}'}. Preferably, if two threads have the same total traffic, the thread with the smaller sequence number is sorted first.
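The ordering rule can be sketched with a composite sort key: descending total traffic, then ascending thread index for ties. The function name and the totals below are hypothetical.

```python
def order_threads(totals):
    """Return thread indices sorted by descending total traffic.

    Threads with equal totals keep the smaller index first.
    """
    return sorted(range(len(totals)), key=lambda i: (-totals[i], i))


# Invented totals: threads 0 and 2 tie, so thread 0 comes first.
print(order_threads([12, 10, 12, 3]))  # [0, 2, 1, 3]
```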

For the multithread set Δ={T₀,T₁,T₂,T₃,T₄,T₅,T₆,T₇}, the traffic between threads is as shown in the table below:

From the table above, the multithread set Δ'={T₀',T₁',T₂',T₃',T₄',T₅',T₆',T₇'} is obtained, where:

T₀' is T₀ in the multithread set Δ;

T₁' is T₇ in the multithread set Δ;

T₂' is T₁ in the multithread set Δ;

T₃' is T₆ in the multithread set Δ;

T₄' is T₂ in the multithread set Δ;

T₅' is T₅ in the multithread set Δ;

T₆' is T₃ in the multithread set Δ;

T₇' is T₄ in the multithread set Δ.

Next, a preferred thread scheduling method provided by the embodiments of the present invention is described. The specific steps are as follows:

First, determine whether the communication cost set CCT and the second multithread set are empty. If the communication cost set CCT or the second multithread set is empty, thread scheduling ends. If neither is empty, select the first-ranked element from the communication cost set CCT and denote it R_w; R_w is the total communication cost between two processor cores, denoted C_u and C_v. Remove R_w from the communication cost set CCT. Remove processor cores C_u and C_v from the many-core system M.

To further optimize the scheduling method, before scheduling it is also determined whether the processor cores C_u and C_v corresponding to the first-ranked element of CCT have been allocated, and how many threads remain in the multithread set Δ'.

In a specific implementation, the steps are as follows:

Step 1: If the total communication cost set CCT is empty, there are no processor cores left to allocate and thread scheduling ends. If the total communication cost set CCT is not empty, go to step 2.

Step 2: Select the first-ranked element from the total communication cost set CCT and denote it R_w; R_w is the total communication cost between two processor cores, denoted C_u and C_v. Remove R_w from the total communication cost set CCT. Remove processor cores C_u and C_v from the many-core system M.

Step 3: If neither C_u nor C_v has been allocated and the number of remaining threads in the multithread set Δ' is greater than or equal to 2, go to step 4. If only one of C_u and C_v has been allocated and Δ' is not empty, go to step 5. If neither C_u nor C_v has been allocated and the number of remaining threads in Δ' equals 1, go to step 5. If both C_u and C_v have been allocated, return to step 1.

Step 4: Select the two threads ranked 1 and 2 in the multithread set Δ', T_x and T_y, and assign them to processor cores C_u and C_v. Remove threads T_x and T_y from Δ'. If Δ' is not empty, return to step 1. If Δ' is empty, wait for new threads to arrive; thread scheduling ends.

Step 5: Select the thread ranked 1 in the multithread set Δ', T_x, and assign it to the unallocated processor core: if processor core C_u has been allocated, assign thread T_x to C_v; if processor core C_v has been allocated, assign thread T_x to C_u. Remove thread T_x from Δ'. If Δ' is not empty, return to step 1. If Δ' is empty, wait for new threads to arrive; thread scheduling ends.
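Steps 1 to 5 can be sketched as a single loop over the sorted cost set. This is an illustrative reading of the steps rather than the patented implementation: `schedule` is a hypothetical function, CCT is assumed to be a list of (total_cost, u, v) tuples in ascending order, and when both cores are free but only one thread remains (a case step 5 leaves open) the sketch assumes the thread goes to C_u.

```python
from collections import deque


def schedule(cct, ordered_threads):
    """Map core index -> thread id.

    cct: (total_cost, u, v) tuples sorted in ascending order of cost.
    ordered_threads: thread ids sorted by descending total traffic.
    """
    pending = deque(ordered_threads)
    assignment = {}
    for _cost, u, v in cct:              # steps 1-2: take the cheapest remaining pair
        if not pending:                  # no threads left: scheduling ends
            break
        # Step 3: find which cores of the pair are still unallocated.
        free = [core for core in (u, v) if core not in assignment]
        for core in free:                # step 4 (two free cores) or step 5 (one)
            if not pending:
                break                    # assumption: a lone thread takes C_u
            assignment[core] = pending.popleft()
    return assignment


# 8-core example: pair cost is 2 * |a - b|; thread order follows the set Δ' in the text.
cct = sorted((2 * abs(a - b), a, b) for a in range(8) for b in range(a + 1, 8))
result = schedule(cct, [0, 7, 1, 6, 2, 5, 3, 4])
print(result)  # {0: 0, 1: 7, 2: 1, 3: 6, 4: 2, 5: 5, 6: 3, 7: 4}
```

In this run T₀ lands on C₀ and T₇ on C₁, which matches the first two assignments stated in the worked example; the remaining assignments follow from the same rule under the assumptions above.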

The process of scheduling the threads in the multithread set Δ'={T₀',T₁',T₂',T₃',T₄',T₅',T₆',T₇'} onto the many-core system M={C₀,C₁,C₂,C₃,C₄,C₅,C₆,C₇}, where the communication cost set is CCT={R₀,R₁,…,R₂₇}, is as follows:

1) The communication cost set CCT is not empty; go to step 2.

2) The first-ranked element selected from the communication cost set CCT is R₀, and the two corresponding processor cores are C₀ and C₁. Remove R₀ from the communication cost set CCT. Remove processor cores C₀ and C₁ from the many-core system M.

3) Neither C₀ nor C₁ has been allocated, and the number of remaining threads in the multithread set Δ' is 8; go to step 4.

4) Select the two threads ranked 1 and 2 in the multithread set Δ', T₀' and T₁', and assign them to processor cores C₀ and C₁. Remove threads T₀' and T₁' from Δ'. Return to step 1.

5) Repeat the above steps until all threads have been assigned. The allocation result is shown in the table below:

In the table above, thread T₀' in the multithread set Δ' is scheduled onto processor core C₀ in the many-core system M, i.e. thread T₀ in the multithread set Δ is scheduled onto processor core C₀. Thread T₁' in Δ' is scheduled onto processor core C₁, i.e. thread T₇ in Δ is scheduled onto processor core C₁. By analogy, all threads in the multithread set Δ are scheduled onto processor cores.

Based on the same inventive concept as embodiment one, embodiment two of the present invention also provides a multithread scheduling system for many-core systems, the system including:

a first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set;

a third acquisition module, configured to obtain, according to the first traffic, the second traffic of a single thread, where the second traffic is the sum of the traffic from the thread to each thread in the first multithread set and the traffic from each thread in the first multithread set to the thread;

Optionally, the first acquisition module is further configured to:

Optionally, obtaining the first communication cost from the first processor core to the second processor core includes:

Optionally, the scheduling module is further configured to:

The system described in embodiment two is the system used to implement the thread scheduling method of embodiment one. Based on the method described in embodiment one, a person skilled in the art can understand the specific structure and variations of the system, so details are not repeated here. Any system used to implement the method of embodiment one falls within the intended scope of protection of the invention.

One or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:

Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Obviously, those skilled in the art can make various changes and variations to the embodiments of the present invention without departing from the spirit and scope of those embodiments. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the invention is intended to cover them as well.

Claims

1. A multithread scheduling method for a many-core system, characterized in that the method comprises:

Obtaining the communication cost between a first processor core and a second processor core in a preset processor core set, wherein the first processor core and the second processor core are any two processor cores in the processor core set;

Obtaining the first traffic between every two threads in a preset first multithreading set;

Obtaining the second traffic of a single thread according to the first traffic, wherein the second traffic is the sum of the traffic from the thread to each thread in the first multithreading set and the traffic from each thread in the first multithreading set to the thread, and the thread is any thread in the first multithreading set;

Sorting all processor cores according to the communication cost, and scheduling the thread with the largest second traffic onto the processor core with the smallest communication cost.
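The final step of claim 1 can be sketched as follows. This is a non-authoritative illustration: the claim does not state here how a single per-core value is derived from the pairwise communication costs, so the sketch assumes, purely for illustration, that a core's cost is the sum of its pairwise costs to and from every other core. The names `core_costs`, `schedule_heaviest` and `hops` are hypothetical.

```python
def core_costs(pair_cost):
    """Per-core cost, ASSUMED to be the sum of pairwise costs to and
    from every other core (the source does not give this formula here)."""
    n = len(pair_cost)
    return [
        sum(pair_cost[c][k] + pair_cost[k][c] for k in range(n) if k != c)
        for c in range(n)
    ]


def schedule_heaviest(second_traffic, pair_cost):
    """Return (thread, core): the thread with the largest second traffic
    paired with the core that ranks first by ascending per-core cost."""
    per_core = core_costs(pair_cost)
    # sorted() is stable, so equal-cost cores keep their index order.
    ranked_cores = sorted(range(len(per_core)), key=lambda c: per_core[c])
    heaviest = max(range(len(second_traffic)), key=lambda t: second_traffic[t])
    return heaviest, ranked_cores[0]


# Three cores in a row; the pairwise cost is an illustrative hop count.
hops = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]
print(schedule_heaviest([12, 10, 8], hops))  # (0, 1)
```

Here the middle core has the smallest summed cost, so the thread with the largest second traffic (thread 0) is placed on core 1.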

2. the method for claim 1, it is characterised in that first processor in the default processor core set of acquisition Communication cost between core and second processing device core, including：

3. The method of claim 2, characterized in that obtaining the first communication cost from the first processor core to the second processor core comprises:

4. the method for claim 1, it is characterised in that described all processor cores are carried out according to the communication cost Sequence, by the maximum thread scheduling of second traffic to the minimum processor core of communication cost, including：

Scheduling the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set.

5. The method of claim 4, characterized in that, before scheduling the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:

6. The method of claim 4, characterized in that, after scheduling the thread with the largest second traffic in the second multithreading set onto the processor core with the smallest communication cost in the communication cost set, the method further comprises:

Deleting the thread with the largest second traffic from the second multithreading set.
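Claims 4 and 6 together suggest a repeated select-and-delete loop, which might look like the sketch below. It is an illustration under stated assumptions, not the claimed implementation: the per-core costs are taken as given, each core is assumed to host one thread (so the chosen core is also removed from the communication cost set, which the claims do not state), and ties are broken by the lowest index. The name `schedule_all` is hypothetical.

```python
def schedule_all(second_traffic, per_core_cost):
    """Greedily map threads to cores; returns {thread: core}."""
    remaining_threads = set(range(len(second_traffic)))  # second multithreading set
    free_cores = set(range(len(per_core_cost)))          # communication cost set
    placement = {}
    while remaining_threads and free_cores:
        # Thread with the largest second traffic (lowest index on ties).
        thread = max(remaining_threads, key=lambda t: (second_traffic[t], -t))
        # Core with the smallest communication cost (lowest index on ties).
        core = min(free_cores, key=lambda c: (per_core_cost[c], c))
        placement[thread] = core
        remaining_threads.remove(thread)  # deletion step of claim 6
        free_cores.remove(core)           # ASSUMPTION: one thread per core
    return placement


print(schedule_all([8, 12, 10], [6, 4, 6]))  # {1: 1, 2: 0, 0: 2}
```

The loop stops when either set is empty; how leftover threads would be handled when threads outnumber cores is not covered by this sketch.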

7. A multithread scheduling system for a many-core system, characterized in that the system comprises:

A first acquisition module, configured to obtain the communication cost between a first processor core and a second processor core in a preset processor core set, wherein the first processor core and the second processor core are any two processor cores in the processor core set;

A second acquisition module, configured to obtain the first traffic between every two threads in a preset first multithreading set;

A third acquisition module, configured to obtain the second traffic of a single thread according to the first traffic, wherein the second traffic is the sum of the traffic from the thread to each thread in the first multithreading set and the traffic from each thread in the first multithreading set to the thread, and the thread is any thread in the first multithreading set;

A scheduling module, configured to sort all processor cores according to the communication cost and to schedule the thread with the largest second traffic onto the processor core with the smallest communication cost.

8. The system of claim 7, characterized in that the first acquisition module is further configured to:

9. The system of claim 8, characterized in that obtaining the first communication cost from the first processor core to the second processor core comprises:

10. The system of claim 7, characterized in that the scheduling module is further configured to: