CN103197977B - Thread scheduling method, thread scheduling apparatus, and multi-core processor system - Google Patents

Thread scheduling method, thread scheduling apparatus, and multi-core processor system

Info

Publication number: CN103197977B
Authority: CN (China)
Prior art keywords: processor core, thread, processor, cache access, cache
Prior art date
Legal status: Active
Application number: CN201310134356.XA
Other languages: Chinese (zh)
Other versions: CN103197977A (en)
Inventors: 刘仪阳, 陈渝, 谭玺, 崔岩
Current Assignee: Tsinghua University; Huawei Technologies Co Ltd
Original Assignee: Tsinghua University; Huawei Technologies Co Ltd
Priority date / Filing date / Publication date
Application filed by Tsinghua University and Huawei Technologies Co Ltd
Priority to CN201310134356.XA (CN103197977B)
Priority claimed from CN201110362773.0A (CN102495762B)
Publication of CN103197977A
Application granted; publication of CN103197977B
Legal status: Active

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

An embodiment of the invention discloses a thread scheduling method, a thread scheduling apparatus, and a multi-core processor system for scheduling threads on processor cores. The method comprises: when a thread context switch occurs on a first processor core, determining the type of the thread currently running on a second processor core that has a correspondence with the first processor core; if the second processor core is currently running a cache-sensitive thread, searching the set of ready-state threads awaiting execution corresponding to the first processor core for a cache-insensitive thread, or, if the second processor core is currently running a cache-insensitive thread, searching that set for a cache-sensitive thread; and, when a thread of the required type is found in the set of ready-state threads awaiting execution corresponding to the first processor core, switching the currently running thread to the thread found.

Description

Thread scheduling method, thread scheduling apparatus, and multi-core processor system
Technical field
The present invention relates to the field of computers, and in particular to a thread scheduling method, a thread scheduling apparatus, and a multi-core processor system.
Background art
A thread is an entity within a process; it owns no system resources of its own, only the data structures essential for execution. Threads can be created and destroyed, enabling concurrent execution within a program. A thread is generally in one of three basic states: ready, blocked, or running.
At present, in a multi-core processor system, all processor cores share access to memory, I/O, and external interrupts. Hardware resources in the system, such as the memory controller and the last-level cache (LLC, Last Level Cache), can be shared by multiple processor cores.
When a prior-art multi-core processor system runs an application program, it mostly runs it thread by thread. However, the inventors found in the course of their research that current thread scheduling determines the thread to switch to according to thread priority, ignoring the resource contention or waste produced on the shared resources of the multi-core processor system, which degrades system performance.
Summary of the invention
Embodiments of the present invention provide a thread scheduling method, a thread scheduling apparatus, and a multi-core processor system for scheduling threads in a multi-core processor system, capable of effectively improving the utilization of shared resources and easing the processor cores' competition for shared resources, thereby improving the performance of the multi-core processor system.
A thread scheduling method in an embodiment of the present invention includes:

when a thread context switch occurs on a first processor core, determining the type of the thread currently running on a second processor core that has a correspondence with the first processor core;

if the second processor core is currently running a cache-sensitive thread, searching the set of ready-state threads awaiting execution corresponding to the first processor core for a cache-insensitive thread, or, if the second processor core is currently running a cache-insensitive thread, searching that set for a cache-sensitive thread; and

when a thread of the required type is found in the set of ready-state threads awaiting execution corresponding to the first processor core, switching the currently running thread to the thread found.
Another thread scheduling method in an embodiment of the present invention includes:

when a thread context switch occurs on a first processor core, adding the cache-memory (cache) access rate, during the current timeslice, of the thread currently running on the first processor core to the first processor core's total cache access rate, and incrementing an accumulation count by one;

obtaining the total cache access rate and accumulation count of a second processor core that has a correspondence with the first processor core;

calculating the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count, calculating the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count, and summing the two average cache access rates as a first parameter value;

scanning the set of ready-state threads awaiting execution corresponding to the first processor core, and computing, as a second parameter value, the sum of the currently scanned thread's cache access rate during its last timeslice and the cache access rate, during its last timeslice, of the thread currently running on the second processor core; and

if the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switching the currently running thread to the currently scanned thread.
A thread scheduling apparatus in an embodiment of the present invention includes:

a determining unit, configured to determine, when a thread context switch occurs on a first processor core, the type of the thread currently running on a second processor core that has a correspondence with the first processor core;

a searching unit, configured to: if the second processor core is currently running a cache-sensitive thread, search the set of ready-state threads awaiting execution corresponding to the first processor core for a cache-insensitive thread; or, if the second processor core is currently running a cache-insensitive thread, search that set for a cache-sensitive thread; and

a switching unit, configured to switch the currently running thread to the thread found when a thread of the required type is found in the set of ready-state threads awaiting execution corresponding to the first processor core.
Another thread scheduling apparatus in an embodiment of the present invention includes:

a first summing unit, configured to add, when a thread context switch occurs on a first processor core, the cache-memory (cache) access rate of the thread currently running on the first processor core to the first processor core's total cache access rate, and to increment an accumulation count by one;

a first acquiring unit, configured to obtain the total cache access rate and accumulation count of a second processor core that has a correspondence with the first processor core;

a first computing unit, configured to calculate the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count, calculate the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count, and sum the two averages as a first parameter value;

a first scanning-and-computing unit, configured to scan the set of ready-state threads awaiting execution corresponding to the first processor core and compute, as a second parameter value, the sum of the currently scanned thread's cache access rate during its last timeslice and the cache access rate, during its last timeslice, of the thread currently running on the second processor core; and

a first processing unit, configured to switch the currently running thread to the currently scanned thread when the difference between the first parameter value and the second parameter value is greater than or equal to a preset value.
A multi-core processor system in an embodiment of the present invention includes:

a first processor core, a second processor core, and a shared hardware resource, the first processor core and the second processor core both accessing the shared hardware resource;

the first processor core is configured to: when a thread context switch occurs on the first processor core, determine the type of the thread currently running on the second processor core, which has a correspondence with the first processor core; if the second processor core is currently running a cache-sensitive thread, search the set of ready-state threads awaiting execution corresponding to the first processor core for a cache-insensitive thread, or, if the second processor core is currently running a cache-insensitive thread, search that set for a cache-sensitive thread; and, when a thread of the required type is found in that set, switch the currently running thread to the thread found;

or,

the first processor core is configured to: when a thread context switch occurs on the first processor core, add the cache-memory (cache) access rate, during the current timeslice, of the thread currently running on the first processor core to the first processor core's total cache access rate and increment an accumulation count by one; obtain the total cache access rate and accumulation count of the second processor core, which has a correspondence with the first processor core; calculate the average cache access rate of each core from that core's total cache access rate and accumulation count, and sum the two averages as a first parameter value; scan the set of ready-state threads awaiting execution corresponding to the first processor core and compute, as a second parameter value, the sum of the currently scanned thread's cache access rate during its last timeslice and the cache access rate, during its last timeslice, of the thread currently running on the second processor core; and, when the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently running thread to the currently scanned thread.
As can be seen from the above technical solutions, the embodiments of the present invention have the following advantage:

When a thread context switch occurs on a first processor core, the second processor core that has a correspondence with the first processor core is determined. If the second processor core is currently running a cache-sensitive thread, a cache-insensitive thread is searched for in the set of ready-state threads awaiting execution corresponding to the first processor core; if the second processor core is currently running a cache-insensitive thread, a cache-sensitive thread is searched for instead. The first processor core then switches to the found thread of the required type. The thread scheduling apparatus in the embodiments of the present invention thus lets threads of different cache-behaviour types run in a coordinated way, avoiding the resource contention or waste produced when the first and second processor cores run threads of the same type, effectively easing the processor cores' competition for shared resources, improving shared-resource utilization, and improving the performance of the multi-core processor system.
Brief description of the drawings
Fig. 1 is a schematic diagram of a thread scheduling method in an embodiment of the present invention;
Fig. 2 is another schematic diagram of a thread scheduling method in an embodiment of the present invention;
Fig. 3 is another schematic diagram of a thread scheduling method in an embodiment of the present invention;
Fig. 4 is a schematic diagram of a thread scheduling apparatus in an embodiment of the present invention;
Fig. 5 is another schematic diagram of a thread scheduling apparatus in an embodiment of the present invention;
Fig. 6 is another schematic diagram of a thread scheduling apparatus in an embodiment of the present invention;
Fig. 7 is a schematic diagram of a multi-core processor system in an embodiment of the present invention;
Fig. 8-a is a schematic diagram of a physical structure of a multi-core processor system in an embodiment of the present invention;
Fig. 8-b is another schematic diagram of a physical structure of a multi-core processor system in an embodiment of the present invention;
Fig. 8-c is another schematic diagram of a physical structure of a multi-core processor system in an embodiment of the present invention.
Detailed description of the invention
Embodiments of the present invention provide a thread scheduling method, a thread scheduling apparatus, and a multi-core processor system for scheduling the threads that run on the processor cores sharing a hardware resource in a multi-core processor system. They can effectively ease the competition for the shared hardware resource among the processor cores that share it, thereby improving shared-resource utilization and the performance of the multi-core processor system.
In the embodiments of the present invention, after a thread is created from the Executable and Linkable Format (ELF) file corresponding to a processor core, the type of the thread in that ELF file needs to be determined by simulation experiments, specifically as follows:
1) Suppose there are n threads, numbered 1 to n. Any two threads are selected and run simultaneously; when thread i runs simultaneously with thread j, the performance loss of thread j while co-running with thread i is denoted d_ij. After every thread has been run simultaneously with every other thread, the n x n matrix D = (d_ij) is obtained.

In matrix D, row i represents the degree to which threads 1 to n are affected by thread i, so the 2-norm of the i-th row vector can serve as the intensity index of thread i; column i represents the degree to which thread i is affected by threads 1 to n, so the 2-norm of the i-th column vector can serve as the sensitivity index of thread i.
2) The intensity index and sensitivity index of threads 1 to n are computed as follows (reconstructed from the norm definitions above):

intensity index of thread i = sqrt( sum over j of d_ij^2 ) (the 2-norm of row i of D);
sensitivity index of thread i = sqrt( sum over j of d_ji^2 ) (the 2-norm of column i of D), where i ∈ (1, n).

Using these formulas, the intensity index and sensitivity index of each of threads 1 to n can be calculated.
3) The cache-sensitivity value H of each thread is calculated from its intensity index and sensitivity index; the specific formula is:

Hi = tan(sensitivity index of thread i / intensity index of thread i), where i ∈ (1, n);

If |Hi - 1| <= a preset value, thread i is determined to be a cache relatively-sensitive thread.

If |Hi - 1| > the preset value, thread i is either a cache-sensitive thread or a cache-insensitive thread, and its type must be determined further, as follows: if the intensity index of thread i is greater than or equal to the mean intensity index of the n threads, thread i is determined to be a cache-sensitive thread; if the intensity index of thread i is less than the mean intensity index of the n threads, thread i is determined to be a cache-insensitive thread.
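The three classification steps above can be sketched in Python as a minimal illustration, not the patent's implementation: the intensity and sensitivity indices are the row and column 2-norms of D, and H uses the tan(...) form given in the text. The function name, the `eps` threshold, and the label strings are illustrative assumptions.

```python
import math

def classify_threads(D, eps=0.1):
    """Classify n threads from the pairwise-degradation matrix D.

    D[i][j] is d_ij: the performance loss of thread j when co-running
    with thread i.  Row 2-norms give the intensity index, column 2-norms
    give the sensitivity index.  `eps` stands in for the preset value.
    """
    n = len(D)
    intensity = [math.sqrt(sum(D[i][j] ** 2 for j in range(n))) for i in range(n)]
    sensitivity = [math.sqrt(sum(D[i][j] ** 2 for i in range(n))) for j in range(n)]
    mean_intensity = sum(intensity) / n
    labels = []
    for i in range(n):
        h = math.tan(sensitivity[i] / intensity[i])  # H_i as written in the text
        if abs(h - 1) <= eps:
            labels.append("relatively-sensitive")
        elif intensity[i] >= mean_intensity:
            labels.append("sensitive")
        else:
            labels.append("insensitive")
    return labels
```

For example, a thread whose row norm is large but whose column norm is small (it hurts others but is barely hurt itself) comes out "sensitive" under this rule, while the converse comes out "insensitive".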
After the types of the n threads have been determined by the above method, a type identifier can be set for each thread and saved in the ELF file corresponding to the thread, so that when a thread in the ELF file runs, the type identifier of the running thread can be saved in the current-running-thread descriptor of the corresponding processor core; that is, the current-running-thread descriptor holds the type identifier of the thread the processor core is currently running.
In addition, in the embodiments of the present invention, the processor cores in the multi-core processor system that share the same shared resource also need to be grouped, specifically as follows:

If the number of processor cores sharing the same shared resource is even, the cores are grouped in pairs in order of their identity codes (ID, Identity), and a correspondence is established between the two processor cores in each group.

If the number of processor cores sharing the same cache is odd, the cores are likewise grouped in pairs in order of their IDs, leaving one processor core ungrouped; after grouping, a correspondence is established between the two processor cores in each group. The correspondence can be established either by a concrete method that computes the corresponding processor core from a core's ID, or by building a processor-core grouping table. It should be noted that, in the embodiments of the present invention, when a thread context switch occurs on the ungrouped processor core, it is handled by a prior-art thread scheduling method.
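The grouping rule above can be sketched as follows; `pair_cores` and its return shape are illustrative assumptions, not names from the patent.

```python
def pair_cores(core_ids):
    """Group cores sharing one resource into pairs by ascending ID.

    Returns (partner_map, leftover): partner_map maps each paired core
    to its partner; with an odd core count the last core is left
    unpaired and falls back to prior-art scheduling.
    """
    ids = sorted(core_ids)
    partner = {}
    for a, b in zip(ids[0::2], ids[1::2]):  # consecutive IDs form a group
        partner[a], partner[b] = b, a
    leftover = ids[-1] if len(ids) % 2 else None
    return partner, leftover
```

With consecutive IDs starting at 0, the same correspondence can also be computed arithmetically (an even ID's partner is ID + 1, an odd ID's partner is ID - 1, i.e. `id ^ 1`), matching the table-free variant described in the text.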
The embodiments of the present invention target resources shared by multiple cores on a multi-core computer architecture platform. In general, a multi-core processor system has many system resources shared by multiple cores, such as the LLC. When a group of processor cores sharing the same LLC simultaneously runs cache-sensitive threads, LLC contention arises and system performance suffers; when such a group simultaneously runs cache-insensitive threads, the LLC resource is wasted. The embodiments of the present invention therefore use a scheduling method based on thread type, so that a group of processor cores sharing the same resource runs a cache-sensitive thread and a cache-insensitive thread separately, thereby avoiding contention for and waste of the shared resource, improving shared-resource utilization, and improving system performance.
It should be noted that, in the embodiments of the present invention, a processor core in the multi-core processor system may be a central processing unit (CPU, Central Processing Unit), a microprocessor (MPU, Micro Processor Unit), a digital signal processor (DSP, Digital Signal Processor), or a graphics processing unit (GPU, Graphics Processing Unit).
The thread scheduling method of the embodiments of the present invention is described concretely below. Referring to Fig. 1, which shows one embodiment of a thread scheduling method, it should be understood that the executing entity of the method may be a processor core in a multi-core processor system; the embodiments take the first processor core as the executing entity for illustration. The method of this embodiment of the present invention includes:
101. When a thread context switch occurs on a first processor core, determining the type of the thread currently running on a second processor core that has a correspondence with the first processor core.

In the embodiments of the present invention, while the multi-core processor's cores are running threads, if a thread context switch occurs on some CPU among the processor cores sharing the same shared resource, that CPU handles its own thread switch.

For a clearer description of the technical solution, the processor core on which the thread context switch occurs is called the first processor core, and the processor core that has a correspondence with it is called the second processor core. Therefore, when a thread context switch occurs on the first processor core, the first processor core determines the second processor core that has a correspondence with it.
102. If the second processor core is currently running a cache-sensitive thread, searching the set of ready-state threads awaiting execution corresponding to the first processor core for a cache-insensitive thread; or, if the second processor core is currently running a cache-insensitive thread, searching that set for a cache-sensitive thread.

In the embodiments of the present invention, the thread currently running on the second processor core may be any one of a cache relatively-sensitive thread, a cache-sensitive thread, or a cache-insensitive thread. When the second processor core is currently running a cache-sensitive thread, the first processor core searches its corresponding set of ready-state threads awaiting execution for a cache-insensitive thread; when the second processor core is currently running a cache-insensitive thread, the first processor core instead searches that set for a cache-sensitive thread.

It should be noted that, in the embodiments of the present invention, the set of ready-state threads awaiting execution is a preset number of priority queues in the run queue corresponding to the processor core, or a set of a preset number of threads, or a set organized as a linked list, or threads organized as a red-black tree.

It should also be noted that, in the embodiments of the present invention, when the thread currently running on the second processor core is a cache relatively-sensitive thread, the first processor core completes the thread switch by a prior-art method, which is not described again here.
103. When a thread of the required type is found in the set of ready-state threads awaiting execution corresponding to the first processor core, switching the currently running thread to the thread found.

In the embodiments of the present invention, the first processor core searches its corresponding set of ready-state threads awaiting execution for a thread of the required type; if one is found, the first processor core switches the currently running thread to it, completing the thread switch, so that while a sensitive thread runs on the second processor core, an insensitive thread runs on the corresponding first processor core, and while an insensitive thread runs on the second processor core, a sensitive thread runs on the corresponding first processor core.

In the embodiments of the present invention, when a thread context switch occurs on the first processor core, the type of thread the first processor core will run is determined by the type of the thread currently running on the corresponding second processor core, and a thread of that type is searched for among the ready-state threads awaiting execution corresponding to the first processor core. This effectively avoids the resource contention or waste that the first and second processor cores would produce on the same cache, eases resource contention, improves shared-resource utilization, and improves system performance.
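Steps 101 to 103 can be sketched as a small selection routine; the thread records, field names, and label strings are illustrative assumptions.

```python
def pick_complement(peer_type, ready_set):
    """Given the type of the thread running on the partner core, scan the
    ready set for a thread of the complementary cache-behaviour type.

    Returns the thread to switch to, or None when the partner runs a
    relatively-sensitive thread (prior-art scheduling applies) or no
    thread of the required type exists in the ready set.
    """
    if peer_type == "sensitive":
        wanted = "insensitive"
    elif peer_type == "insensitive":
        wanted = "sensitive"
    else:  # relatively-sensitive: fall back to prior-art scheduling
        return None
    for thread in ready_set:  # stop at the first thread of the required type
        if thread["type"] == wanted:
            return thread
    return None
```

A `None` result corresponds to the case where no thread of the required type is found, which the second embodiment below handles with a rate-based comparison.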
For a better understanding of the technical solution of the present invention, refer to Fig. 2, which shows another embodiment of a thread scheduling method in the embodiments of the present invention. It should be understood that the executing entity of the method may be a processor core in a multi-core processor system; the embodiments take the first processor core as the executing entity for illustration. The method of this embodiment of the present invention includes:

201. When a thread context switch occurs on a first processor core, determining the type of the thread currently running on a second processor core that has a correspondence with the first processor core.

In the embodiments of the present invention, the first processor core can determine the second processor core from its own ID and a preset computation method, where the preset computation method is related to how the processor cores were grouped. For example, if the processor core IDs are 0, 1, 2, and 3, the cores with IDs 0 and 1 form one group and the cores with IDs 2 and 3 form another. The preset computation method may then be: when the ID of the first processor core is even, the processor core whose ID equals the first processor core's ID plus one is taken as the second processor core; when the ID of the first processor core is odd, the processor core whose ID equals the first processor core's ID minus one is taken as the second processor core. Alternatively, the system may build a processor-core grouping table when grouping the cores, so that the second processor core can be determined by looking up the table with the first processor core's ID. In the embodiments of the present invention, there are multiple ways to determine the second processor core, and no limitation is imposed.
202. Adding the cache access rate, during the current timeslice, of the thread currently run by the first processor core to the first processor core's total cache access rate, and incrementing an accumulation count by one.

In the embodiments of the present invention, if the first processor core is about to switch away from the currently running thread, it adds that thread's cache access rate during the current timeslice to its total cache access rate and increments the accumulation count by one. Here, the thread's cache access rate during the current timeslice is the ratio of the number of cache accesses made while the first processor core ran the current thread in the current timeslice to the number of instructions executed while running it; the first processor core's total cache access rate is the accumulated cache access rate of the threads it has run since system startup, and each accumulation increments the accumulation count by one.
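The accumulation in step 202 can be sketched as follows; the class and method names are illustrative assumptions, and the per-timeslice rate is the accesses-to-instructions ratio described above.

```python
class CoreStats:
    """Per-core accumulator: on every context switch, add the outgoing
    thread's cache access rate for its current timeslice (cache accesses
    divided by instructions executed) to a running total and increment
    the accumulation count.  The average used later is total / count."""

    def __init__(self):
        self.total_rate = 0.0
        self.count = 0

    def record_timeslice(self, cache_accesses, instructions):
        self.total_rate += cache_accesses / instructions
        self.count += 1

    def average_rate(self):
        return self.total_rate / self.count if self.count else 0.0
```

In a real kernel the two raw counts would come from hardware performance counters; here they are plain arguments.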
203. If the second processor core is currently running a cache-sensitive thread, searching the set of ready-state threads awaiting execution corresponding to the first processor core for a cache-insensitive thread; or, if the second processor core is currently running a cache-insensitive thread, searching that set for a cache-sensitive thread. If such a thread is found, proceeding to step 204; if not, proceeding to step 205.

In the embodiments of the present invention, the type identifier of the thread a processor core is currently running is saved in that core's current-running-thread descriptor. The first processor core can therefore obtain, from the second processor core's current-running-thread descriptor, the type identifier of the thread the second processor core is currently running, and thereby determine its type, where the thread types include: cache-sensitive, cache relatively-sensitive, and cache-insensitive.

In the embodiments of the present invention, the first processor core searches its corresponding set of ready-state threads awaiting execution for a thread of the required type according to the type of the thread the second processor core is currently running: when the second processor core is currently running a cache-sensitive thread, a cache-insensitive thread is searched for in the set; when the second processor core is currently running a cache-insensitive thread, a cache-sensitive thread is searched for instead.
204. When a thread of the required type is found in the set of to-be-run threads in the ready state corresponding to the first processor core, the first processor core switches the currently running thread to the found thread, and then continues with step 209.

In the embodiment of the present invention, if the first processor core finds a thread of the required type in its corresponding set of to-be-run threads in the ready state, it switches the currently running thread to the found thread.

It should be noted that the first processor core searches for a thread of the required type as follows: it scans the corresponding set of to-be-run threads in the ready state, obtains the type identifier of the currently scanned thread from the ELF file in which that thread resides, and determines the type of the currently scanned thread from that type identifier. If the currently scanned thread is of the required type, scanning stops and step 204 is executed, switching the currently running thread to the found thread; if the currently scanned thread is not of the required type, the next thread is scanned.
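The required-type scan of steps 203 and 204 might look like the following Python sketch. The list-of-pairs ready queue is an assumed stand-in for the set of to-be-run threads; in the patent the type identifier would be read from the thread's ELF file rather than passed in directly.

```python
from enum import Enum

class ThreadType(Enum):
    CACHE_SENSITIVE = 0
    CACHE_INSENSITIVE = 2

def find_required_type(ready_queue, peer_type):
    """Scan the ready queue for a thread whose type complements the peer core's.

    ready_queue: list of (thread_id, ThreadType) pairs in scan order.
    Returns the id of the first matching thread, or None (fall through to
    the rate-based selection of step 205).
    """
    required = (ThreadType.CACHE_INSENSITIVE
                if peer_type is ThreadType.CACHE_SENSITIVE
                else ThreadType.CACHE_SENSITIVE)
    for thread_id, thread_type in ready_queue:
        if thread_type is required:
            return thread_id  # stop scanning: step 204 switches to this thread
    return None

queue = [(1, ThreadType.CACHE_SENSITIVE), (2, ThreadType.CACHE_INSENSITIVE)]
print(find_required_type(queue, ThreadType.CACHE_SENSITIVE))  # -> 2
```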
205. If no thread of the required type is found in the set of to-be-run threads in the ready state corresponding to the first processor core, calculate the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count value; calculate the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count value; and sum the average cache access rate of the first processor core and the average cache access rate of the second processor core as the first parameter value.

In the embodiment of the present invention, if no thread of the required type is found in the set of to-be-run threads in the ready state corresponding to the first processor core, the first processor core calculates its average cache access rate from its total cache access rate and accumulation count value, calculates the second processor core's average cache access rate from the second processor core's total cache access rate and accumulation count value, and sums the two average cache access rates as the first parameter value. Specifically: the first processor core's total cache access rate is divided by the first processor core's accumulation count value to obtain the first processor core's average cache access rate; likewise, the second processor core's total cache access rate is divided by the second processor core's accumulation count value to obtain the second processor core's average cache access rate; finally, the two average cache access rates are added to obtain the first parameter value.
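The first-parameter computation of step 205 reduces to simple arithmetic. The following Python fragment is a sketch under the assumption that each core's total rate and accumulation count are maintained as described above; the function names are illustrative.

```python
def average_access_rate(total_rate, count):
    # Average cache access rate = accumulated total rate / accumulation count
    return total_rate / count

def first_parameter(total1, count1, total2, count2):
    # Step 205: sum of the two cores' average cache access rates
    return average_access_rate(total1, count1) + average_access_rate(total2, count2)

# Core 1 accumulated 4.0 over 8 switches, core 2 accumulated 3.0 over 6
print(first_parameter(4.0, 8, 3.0, 6))  # -> 1.0 (0.5 + 0.5)
```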
206. Scan the set of to-be-run threads in the ready state corresponding to the first processor core, and calculate the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core, as the second parameter value.

207. When the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently running thread to the currently scanned thread.

208. When the difference between the first parameter value and the second parameter value is less than the preset value, scan the next thread and return to step 206.
In the embodiment of the present invention, the first processor core scans its corresponding set of to-be-run threads in the ready state and calculates, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core.

The first processor core then calculates the difference between the first parameter value and the second parameter value. If this difference is greater than or equal to the preset value, it switches the currently running thread to the currently scanned thread; if this difference is less than the preset value, it scans the next thread and returns to step 206, i.e. again calculates, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core.

It should be noted that if the number of threads scanned reaches a preset number, or a preset number of priority queues has been scanned, without a switchable thread being found, the first processor core switches threads by a method of the prior art, which is not limited here.
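Steps 206 to 208 can be summarized as a scan loop. The sketch below is illustrative Python; it interprets "difference" as the absolute difference between the two parameter values, which the patent does not state explicitly, and omits the preset scan-count limit noted above.

```python
def pick_thread_by_rate(ready_rates, peer_rate, first_param, threshold):
    """Scan ready threads (steps 206-208).

    ready_rates: list of (thread_id, previous_timeslice_rate) pairs.
    peer_rate:   previous-timeslice rate of the peer core's running thread.
    Returns the id of the first thread whose second parameter value differs
    from first_param by at least threshold, or None if the scan is exhausted.
    """
    for thread_id, rate in ready_rates:
        second_param = rate + peer_rate          # step 206
        if abs(first_param - second_param) >= threshold:
            return thread_id                     # step 207: switch to it
    return None                                  # step 208 exhausted the scan

ready = [(10, 0.48), (11, 0.10)]
# first_param 1.0, peer rate 0.50: thread 10 gives |1.0-0.98| < 0.2, skip;
# thread 11 gives |1.0-0.60| >= 0.2, so it is selected.
print(pick_thread_by_rate(ready, 0.50, 1.0, 0.2))  # -> 11
```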
209. After the thread context switch of the first processor core has occurred, save the type identifier of the currently running thread in the current running-thread descriptor of the first processor core.

In the embodiment of the present invention, after the context switch of the first processor core has occurred, the type identifier of the thread saved in the current running-thread descriptor needs to be updated, i.e. the first processor core saves the type identifier of the currently running thread in the current running-thread descriptor of the first processor core.

In the embodiment of the present invention, the thread to which the first processor core switches is selected according to the type of the thread currently running on the second processor core corresponding to the first processor core, and, when no thread of the required type is found, is further determined from the cache access rates of the threads and of the processor cores. This effectively prevents two processor cores that have a corresponding relationship from running threads of the same type, alleviates contention for shared resources, improves resource utilization, and improves the performance of the multi-core processor system.
In the embodiment of the present invention, the thread to which the first processor core will switch can also be determined directly from the cache access rates of the processor cores and of the threads. Referring to Fig. 3, an embodiment of a thread scheduling method in the embodiment of the present invention includes:
301. When a thread context switch occurs on the first processor core, add the cache access rate, in the current timeslice, of the thread currently running on the first processor core to the first processor core's total cache access rate, and increment the accumulation count value by one.

In the embodiment of the present invention, when a thread context switch occurs on the first processor core, the first processor core adds the cache access rate, in the current timeslice, of the thread it is currently running to its total cache access rate, and increments the accumulation count value by one. Here, the cache access rate of the currently running thread in the current timeslice is the ratio of the number of cache accesses made by the first processor core while running the current thread in the current timeslice to the number of instructions executed while running that thread; the first processor core's total cache access rate is the accumulated value, up to the current timeslice, of the cache access rates of the threads run by the first processor core, and each time a value is accumulated, the accumulation count value is incremented by one.
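The accumulation of step 301 can be sketched as a small Python accumulator. The inputs `cache_accesses` and `instructions` are assumed names for values that would in practice come from hardware performance counters; the class itself is illustrative, not part of the patent.

```python
class CoreStats:
    """Per-core accumulator for cache access rates (step 301 sketch)."""

    def __init__(self):
        self.total_rate = 0.0   # accumulated total cache access rate
        self.count = 0          # accumulation count value

    def on_context_switch(self, cache_accesses, instructions):
        # Per-timeslice rate = cache accesses / instructions executed
        rate = cache_accesses / instructions
        self.total_rate += rate
        self.count += 1
        return rate

core = CoreStats()
core.on_context_switch(cache_accesses=200, instructions=1000)  # rate 0.2
core.on_context_switch(cache_accesses=400, instructions=1000)  # rate 0.4
print(round(core.total_rate, 2), core.count)  # -> 0.6 2
```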
302. Obtain the total cache access rate and accumulation count value of the second processor core that has a corresponding relationship with the first processor core.

In the embodiment of the present invention, the first processor core determines the second processor core from the ID of the first processor core and a preset computation method, or determines the second processor core by looking up a processor-core grouping table according to the ID of the first processor core. After the second processor core is confirmed, the first processor core obtains the second processor core's total cache access rate and accumulation count value from the second processor core.
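The two ways of identifying the partner core in step 302 can be illustrated as follows. Both the XOR pairing formula and the grouping table here are assumed examples; any preset computation or table that pairs cores sharing a cache would serve equally.

```python
# Assumed grouping table: cores that share a cache, e.g. adjacent L2 pairs
PARTNER_TABLE = {0: 1, 1: 0, 2: 3, 3: 2}

def partner_by_formula(core_id):
    # Preset computation method (assumed): pair adjacent core IDs
    return core_id ^ 1

def partner_by_table(core_id):
    # Processor-core grouping table lookup
    return PARTNER_TABLE[core_id]

print(partner_by_formula(2), partner_by_table(2))  # -> 3 3
```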
303. Calculate the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count value; calculate the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count value; and sum the average cache access rate of the first processor core and the average cache access rate of the second processor core as the first parameter value.

In the embodiment of the present invention, the first processor core calculates its average cache access rate from its total cache access rate and accumulation count value, calculates the second processor core's average cache access rate from the second processor core's total cache access rate and accumulation count value, and sums the two average cache access rates as the first parameter value. Specifically: the first processor core's total cache access rate is divided by the first processor core's accumulation count value to obtain the first processor core's average cache access rate; likewise, the second processor core's total cache access rate is divided by the second processor core's accumulation count value to obtain the second processor core's average cache access rate; finally, the two average cache access rates are added to obtain the first parameter value.
304. Scan the set of to-be-run threads in the ready state corresponding to the first processor core, and calculate the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core, as the second parameter value.

305. If the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently running thread to the currently scanned thread.

In the embodiment of the present invention, the first processor core scans its corresponding set of to-be-run threads in the ready state and calculates, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core. The first processor core then calculates the difference between the first parameter value and the second parameter value; if this difference is greater than or equal to the preset value, the thread currently running on the first processor core is switched to the currently scanned thread.
Preferably, in the embodiment of the present invention, the following steps can also be carried out:

306. If the difference between the first parameter value and the second parameter value is less than the preset value, scan the next thread and return to step 304.

In the embodiment of the present invention, when the difference between the first parameter value and the second parameter value is less than the preset value, the first processor core scans the next thread and returns to the content executed in step 304, i.e. again calculates, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core.
307. After the thread switch of the first processor core is complete, save the type identifier of the currently running thread in the current running-thread descriptor of the first processor core.

In the embodiment of the present invention, after the context switch of the first processor core has occurred, the type identifier of the thread saved in the current running-thread descriptor needs to be updated, i.e. the first processor core saves the type identifier of the currently running thread in the current running-thread descriptor of the first processor core.

In the embodiment of the present invention, when a thread switch occurs on the first processor core, the thread to be switched to is determined from the processor cores' total cache access rates and the threads' cache access rates in their previous timeslices, and the switch is completed. This effectively avoids the shared-resource contention and waste produced when the two processor cores in the same group run threads, effectively improves the utilization of shared resources, and improves the performance of the multi-core processor system.
Referring to Fig. 4, an embodiment of a thread scheduling device in the embodiment of the present invention includes:

a determining unit 401, configured to, when a thread context switch occurs on the first processor core, determine the type of the thread currently running on the second processor core that has a corresponding relationship with the first processor core;

a searching unit 402, configured to, if the thread currently running on the second processor core is a cache-sensitive thread, search the set of to-be-run threads in the ready state corresponding to the first processor core for a cache-insensitive thread; or, if the thread currently running on the second processor core is a cache-insensitive thread, search the set of to-be-run threads in the ready state corresponding to the first processor core for a cache-sensitive thread;

a switching unit 403, configured to, if a thread of the required type is found in the set of to-be-run threads in the ready state corresponding to the first processor core, switch the currently running thread to the found thread.

In the embodiment of the present invention, when a thread context switch occurs on the first processor core, the determining unit 401 in the first processor core determines the type of the thread currently running on the second processor core that has a corresponding relationship with the first processor core. If the thread currently running on the second processor core is a cache-sensitive thread, the searching unit 402 searches the set of to-be-run threads in the ready state corresponding to the first processor core for a cache-insensitive thread; or, if the thread currently running on the second processor core is a cache-insensitive thread, the searching unit 402 searches the set of to-be-run threads in the ready state corresponding to the first processor core for a cache-sensitive thread. If the searching unit 402 finds a thread of the required type in the set of to-be-run threads in the ready state corresponding to the first processor core, the switching unit 403 switches the currently running thread to the found thread.
In one implementation of the thread scheduling device of the embodiment of the present invention, its physical form can be a processor core; the processor core can be a central processing unit (CPU, Central Processing Unit), a microprocessor (MPU, Micro Processor Unit), a digital signal processor (DSP, Digital Signal Processor), or a graphics processing unit (GPU, Graphics Processing Unit).
It can be seen that, with the thread scheduling device of the embodiment of the present invention, when a thread context switch occurs on the first processor core, the type of the thread the first processor core will run is determined from the type of the thread currently running on the second processor core corresponding to the first processor core, and a thread of that type is found to complete the thread switch. This effectively avoids the resource contention or waste produced by the first processor core and the second processor core on the same shared resource, effectively alleviates resource contention, improves the utilization of shared resources, and improves the performance of the system.
For a better understanding of the device of the present invention, refer to Fig. 5, another embodiment of a thread scheduling device in the embodiment of the present invention, which includes:

the determining unit 401, the searching unit 402 and the switching unit 403, similar to the content described in the embodiment of Fig. 4, which is not repeated here.

The determining unit 401 includes:

a processor core determining unit 501, configured to determine, from the identity ID of the first processor core and a preset computation method, the second processor core that has a corresponding relationship with the first processor core, or configured to determine, by looking up a processor-core grouping table according to the ID of the first processor core, the second processor core that has a corresponding relationship with the first processor core;

a thread determining unit 502, configured to obtain the type of the thread currently running on the second processor core from the current running-thread descriptor of the second processor core, the thread types including: cache-sensitive, moderately cache-sensitive, and cache-insensitive.
In the embodiment of the present invention, the thread scheduling device also includes:

a summing unit 503, configured to add the cache access rate, in the current timeslice, of the thread currently run by the first processor core to the first processor core's total cache access rate, and increment the accumulation count value by one;

an updating unit 504, configured to, after the thread switch of the first processor core is complete, save the type identifier of the currently running thread in the current running-thread descriptor of the first processor core;

a calculating unit 505, configured to, if no thread of the required type is found in the set of to-be-run threads in the ready state corresponding to the first processor core, calculate the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count value, calculate the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count value, and sum the average cache access rate of the first processor core and the average cache access rate of the second processor core as the first parameter value;

a scanning-calculating unit 506, configured to scan the set of to-be-run threads in the ready state corresponding to the first processor core, and calculate, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core;

a processing unit 507, configured to, when the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently running thread to the currently scanned thread, and, when the difference between the first parameter value and the second parameter value is less than the preset value, scan the next thread and return to the scanning-calculating unit 506.
In the embodiment of the present invention, when a thread context switch occurs on the first processor core, the processor core determining unit 501 in the determining unit 401 determines, from the identity ID of the first processor core and a preset computation method, the second processor core that has a corresponding relationship with the first processor core, or determines it by looking up a processor-core grouping table according to the ID of the first processor core; the thread determining unit 502 in the determining unit 401 obtains the type of the thread currently running on the second processor core from the current running-thread descriptor of the second processor core; and the summing unit 503 adds the cache access rate, in the current timeslice, of the currently running thread to the first processor core's total cache access rate and increments the accumulation count value by one. If the thread currently running on the second processor core is a cache-sensitive thread, the searching unit 402 searches the set of to-be-run threads in the ready state corresponding to the first processor core for a cache-insensitive thread; or, if the thread currently running on the second processor core is a cache-insensitive thread, the searching unit 402 searches that set for a cache-sensitive thread. If the searching unit 402 finds a thread of the required type in the set of to-be-run threads in the ready state corresponding to the first processor core, the switching unit 403 switches the currently running thread to the found thread. If the searching unit 402 does not find a thread of the required type, the calculating unit 505 calculates the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count value, calculates the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count value, and sums the two average cache access rates as the first parameter value; the scanning-calculating unit 506 then scans the set of to-be-run threads in the ready state corresponding to the first processor core and calculates, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core. When the difference between the first parameter value and the second parameter value is greater than or equal to the preset value, the processing unit 507 switches the currently running thread to the currently scanned thread; when the difference is less than the preset value, it scans the next thread and returns to the scanning-calculating unit 506. Finally, after the thread switch of the first processor core is complete, the updating unit 504 saves the type identifier of the currently running thread in the current running-thread descriptor of the first processor core.
In one implementation of the thread scheduling device of the embodiment of the present invention, its physical form can be a processor core; the processor core can be a central processing unit (CPU, Central Processing Unit), a microprocessor (MPU, Micro Processor Unit), a digital signal processor (DSP, Digital Signal Processor), or a graphics processing unit (GPU, Graphics Processing Unit).
It can be seen that, with the thread scheduling device of the embodiment of the present invention, the thread to which the first processor core switches is selected according to the type of the thread currently running on the second processor core corresponding to the first processor core, and, when no thread of the required type is found, is further determined from the cache access rates of the threads and of the processor cores. This effectively prevents two processor cores that have a corresponding relationship from running threads of the same type, alleviates contention for shared resources, improves resource utilization, and improves the performance of the multi-core processor system.
Referring to Fig. 6, another embodiment of a thread scheduling device in the embodiment of the present invention includes:

a first summing unit 601, configured to, when a thread context switch occurs on the first processor core, add the cache access rate, in the current timeslice, of the thread currently run by the first processor core to the first processor core's total cache access rate, and increment the accumulation count value by one;

a first obtaining unit 602, configured to obtain the total cache access rate and accumulation count value of the second processor core that has a corresponding relationship with the first processor core;

a first calculating unit 603, configured to calculate the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count value, calculate the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count value, and sum the average cache access rate of the first processor core and the average cache access rate of the second processor core as the first parameter value;

a first scanning-calculating unit 604, configured to scan the set of to-be-run threads in the ready state corresponding to the first processor core, and calculate, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core;

a first processing unit 605, configured to, when the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently running thread to the currently scanned thread.

Preferably, in the embodiment of the present invention, the thread scheduling device can also include:

a second processing unit 606, configured to, when the difference between the first parameter value and the second parameter value is less than the preset value, scan the next thread and return to the first scanning-calculating unit 604;

a first updating unit 607, configured to, after the thread switch of the first processor core is complete, save the type identifier of the currently running thread in the current running-thread descriptor of the first processor core.

Preferably, in the embodiment of the present invention, the first obtaining unit 602 specifically includes:

a core determining unit 608, configured to determine, from the identity ID of the first processor core and a preset computation method, the second processor core that has a corresponding relationship with the first processor core, or to determine, by looking up a processor-core grouping table according to the ID of the first processor core, the second processor core that has a corresponding relationship with the first processor core;

a value obtaining unit 609, configured to obtain the second processor core's total cache access rate and accumulation count value from the second processor core.
In the embodiment of the present invention, when a thread context switch occurs on the first processor core, the first summing unit 601 adds the cache access rate, in the current timeslice, of the thread currently run by the first processor core to the first processor core's total cache access rate, and increments the accumulation count value by one. The first obtaining unit 602 obtains the total cache access rate and accumulation count value of the second processor core that has a corresponding relationship with the first processor core, specifically: the core determining unit 608 determines, from the identity ID of the first processor core and a preset computation method, the second processor core that has a corresponding relationship with the first processor core, or determines it by looking up a processor-core grouping table according to the ID of the first processor core; the value obtaining unit 609 then obtains the second processor core's total cache access rate and accumulation count value from the second processor core. Then the first calculating unit 603 calculates the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count value, calculates the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count value, and sums the two average cache access rates as the first parameter value; and the first scanning-calculating unit 604 scans the set of to-be-run threads in the ready state corresponding to the first processor core and calculates, as the second parameter value, the sum of the currently scanned thread's cache access rate in its previous timeslice and the cache access rate, in its previous timeslice, of the thread currently running on the second processor core. If the difference between the first parameter value and the second parameter value is greater than or equal to the preset value, the first processing unit 605 switches the currently running thread to the currently scanned thread; if the difference is less than the preset value, the second processing unit 606 scans the next thread and returns to the first scanning-calculating unit 604. Finally, after the thread switch of the first processor core is complete, the first updating unit 607 saves the type identifier of the currently running thread in the current running-thread descriptor of the first processor core.
In one implementation of the thread scheduling device of the embodiment of the present invention, its physical form can be a processor core; the processor core can be a central processing unit (CPU, Central Processing Unit), a microprocessor (MPU, Micro Processor Unit), a digital signal processor (DSP, Digital Signal Processor), or a graphics processing unit (GPU, Graphics Processing Unit).
It can be seen that, with the thread scheduling device of the embodiment of the present invention, when a thread switch occurs on the first processor core, the thread to be switched to is determined from the processor cores' total cache access rates and the threads' cache access rates, and the switch is completed. This effectively avoids the shared-resource contention and waste produced when the two processor cores in the same group run threads, effectively improves the utilization of shared resources, and improves the performance of the multi-core processor system.
Referring to Fig. 7, a schematic diagram of the logical architecture of the multi-core processor system of the embodiment of the present invention, the multi-core processor system of the embodiment of the present invention may include:
a first processor core 701 and a second processor core 702, and a shared hardware resource 703;
the first processor core 701 and the second processor core 702 access the shared hardware resource 703;
The first processor core 701 is configured to: when a thread context switch occurs on the first processor core, determine the type of the thread currently running on the second processor core that has a correspondence with the first processor core; if the second processor core is currently running a cache-sensitive thread, search the set of to-be-run threads in the ready state corresponding to the first processor core for a cache-insensitive thread, or, if the second processor core is currently running a cache-insensitive thread, search the set of to-be-run threads in the ready state corresponding to the first processor core for a cache-sensitive thread; and when a thread of the desired type is found in the set of to-be-run threads in the ready state corresponding to the first processor core, switch the currently running thread to the found thread;
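The complementary-type lookup described above can be sketched as follows. This is a simplified illustration under the assumption that threads are pre-classified into exactly two types; the names are hypothetical.

```python
CACHE_SENSITIVE = "sensitive"
CACHE_INSENSITIVE = "insensitive"

def complementary_type(peer_type):
    # The desired type is the opposite of what the peer core is running.
    return CACHE_INSENSITIVE if peer_type == CACHE_SENSITIVE else CACHE_SENSITIVE

def find_thread(ready_threads, peer_type):
    """Search the ready set for a thread of the complementary type."""
    wanted = complementary_type(peer_type)
    for t in ready_threads:
        if t["type"] == wanted:
            return t     # switch the currently running thread to this one
    return None          # no thread of the desired type; no switch occurs
```

Pairing one cache-sensitive and one cache-insensitive thread on two cores that share an LLC is what avoids both contention (two sensitive threads) and waste (two insensitive threads).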
Or,
The first processor core 701 is configured to: when a thread context switch occurs on the first processor core, add the cache memory (cache) access rate, in the current time slice, of the thread currently run by the first processor core to the total cache access rate, and increment the accumulation count value by one; obtain the total cache access rate and accumulation count value of the second processor core that has a correspondence with the first processor core; compute the average cache access rate of the first processor core from the first processor core's total cache access rate and accumulation count value, compute the average cache access rate of the second processor core from the second processor core's total cache access rate and accumulation count value, and sum the average cache access rate of the first processor core and the average cache access rate of the second processor core as a first parameter value; scan the set of to-be-run threads in the ready state corresponding to the first processor core, and compute the sum of the cache access rate of the currently scanned thread in the last time slice and the cache access rate, in the last time slice, of the thread currently run by the second processor core, as a second parameter value; and when the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently run thread to the currently scanned thread.
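The per-core accounting described above (total cache access rate, accumulation count, averages, and the first parameter value) can be sketched as follows. All names are hypothetical; the patent does not prescribe a data layout.

```python
class CoreStats:
    """Running totals one core maintains for the scheduling decision."""

    def __init__(self):
        self.total_rate = 0.0   # total cache access rate
        self.count = 0          # accumulation count value

    def on_context_switch(self, outgoing_thread_rate):
        # On each context switch, add the outgoing thread's access rate
        # for its time slice and increment the count by one.
        self.total_rate += outgoing_thread_rate
        self.count += 1

    def average(self):
        # Average cache access rate = total rate / accumulation count.
        return self.total_rate / self.count if self.count else 0.0

def first_parameter(core1, core2):
    # First parameter value: sum of the two cores' average access rates.
    return core1.average() + core2.average()
```

Because only a running sum and a counter are kept per core, the bookkeeping on each context switch is constant-time.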
In the embodiment of the present invention, the shared hardware resource 703 includes: a shared storage device and/or a shared hardware cache;
It should be noted that, in the embodiment of the present invention, the multi-core processor system is described with a first processor core and a second processor core for ease of explanation, and the functions of the processor cores in the multi-core processor system are illustrated from the standpoint of the first processor core. It should be understood that the functions of the second processor core mirror those of the first processor core, merely described from the standpoint of the second processor core, and are not repeated here. It should also be understood that the multi-core processor system of the embodiment of the present invention is illustrated with the first processor core and the second processor core as representatives; it may include multiple processor cores, and these processor cores may belong to the same processor or to different processors;
For the multi-core processor system of the embodiment of the present invention shown in Fig. 7, the actual physical deployment can be understood as follows: the multi-core processor system includes one processor, and this processor includes the first processor core and the second processor core; or, the multi-core processor system includes two processors, where one processor includes the first processor core and the other processor includes the second processor core.
It should be noted that, in the embodiment of the present invention, when the first processor core and the second processor core belong respectively to different processors, the first processor core and the second processor core can access a shared storage device;
when the first processor core and the second processor core belong to the same processor, the first processor core and the second processor core can access a shared storage device and/or a shared cache memory.
In practical applications, the multi-core processor system may include one or more processors (Figs. 8-a, 8-b, and 8-c below illustrate two processors, but this is not limiting; the system may also include a single processor containing multiple processor cores), where each processor includes one or more processor cores (two processor cores are illustrated in Figs. 8-a, 8-b, and 8-c). Optionally, each processor may further include a shared hardware cache (as shown in Figs. 8-a and 8-c, for example an LLC: last-level cache). The processors access a storage device through an interconnection network; the storage device here can be shared by multiple processor cores, and there may be one or more storage devices (one storage device is illustrated in Figs. 8-a, 8-b, and 8-c, but this is not limiting).
It should be noted that, in the embodiment of the present invention, the processors access the shared storage device through an interconnection network; this interconnection network can be a bus or an interconnect chip, and the shared storage device can be internal storage, such as memory, or external storage, such as a disk.
In the embodiment of the present invention, the shared hardware resource contained in the multi-core processor system can be a shared storage device, a shared hardware cache, or both a shared storage device and a shared hardware cache, where the shared storage device is outside the processor and is connected to the processor cores through a bus, and the shared hardware cache is inside the processor.
Referring to Fig. 8-a, a schematic diagram of a physical structure of the multi-core processor system in the embodiment of the present invention, where the multi-core processor system contains a shared hardware cache.
Referring to Fig. 8-b, a schematic diagram of a physical structure of the multi-core processor system in the embodiment of the present invention, where the multi-core processor system contains a shared storage device.
Referring to Fig. 8-c, a schematic diagram of a physical structure of the multi-core processor system in the embodiment of the present invention, where the multi-core processor system contains a shared hardware cache and a shared storage device.
It should be understood that, in one implementation, the processor core of the embodiment of the present invention can include a scheduling logic unit (as shown in Figs. 8-a, 8-b, and 8-c); the scheduling logic unit here can be implemented in software, in hardware, or in a combination of software and hardware. If the scheduling logic unit is implemented in software, it can be understood that after the processor core accesses memory through the interconnection network and loads and executes a section of scheduler program code stored in that memory, it has the functions of the processor core of the embodiment of the present invention. It should be understood that an operating system runs on the processor core of the embodiment of the present invention; this operating system can specifically be a Linux system, a Unix system, or a system such as Windows that manages and controls machine hardware and software resources; the aforementioned scheduler program runs on the operating system, and the scheduler program takes the form of a thread.
It should be noted that, in the embodiment of the present invention, the thread scheduling devices shown in Fig. 4, Fig. 5, and Fig. 6 may, in one implementation, take the physical form of a processor core and be realized by the scheduling logic unit contained in the processor core (illustrated by a box in Figs. 8-a, 8-b, and 8-c); this scheduling logic unit can be implemented in software, in hardware, or in a combination of the two. Or, in another implementation, the thread scheduling devices shown in Fig. 4, Fig. 5, and Fig. 6 correspond to the scheduling logic unit contained in the processor core (illustrated by a box in Figs. 8-a, 8-b, and 8-c).
In summary, the embodiment of the present invention is a scheduling method based on thread type. In a multi-core processor system, multiple processor cores in the same processor share a hardware cache, such as an LLC, while processor cores in different processors share a storage device. In the prior art, when multiple processor cores in the same processor share the same LLC, running cache-sensitive threads at the same time produces LLC contention, and running cache-insensitive threads at the same time produces LLC waste. In the multi-core processor system provided by the embodiment of the present invention, the thread scheduling device can, according to the type of the thread run by the processor core sharing the same resource with this processor core, select a thread to run from the set of to-be-run threads in the ready state corresponding to this processor core, so that threads of different types can run on the same group of processor cores. The method alleviates shared-resource contention, avoids shared-resource waste, and improves the utilization of shared resources, so that system performance is well improved.
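The correspondence between cores in the "same group" above (cores that share an LLC) can be represented either by a preset computation on the core ID or by a grouping table, as claim 2 later describes. A sketch under the assumption of two cores per group, with hypothetical names:

```python
# Hypothetical topology: cores 2k and 2k+1 share one LLC.

def peer_by_computation(core_id):
    # Preset computation method: flip the lowest bit of the core ID.
    return core_id ^ 1

# Equivalent processor core grouping table for a four-core system.
GROUP_TABLE = {0: 1, 1: 0, 2: 3, 3: 2}

def peer_by_table(core_id):
    # Table lookup: return the peer core in the same group.
    return GROUP_TABLE[core_id]
```

The computation form needs no storage but fixes the grouping scheme; the table form costs a small lookup structure but accommodates arbitrary topologies.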
It should be noted that the embodiment of the present invention is not limited to the LLC and the memory controller as contended resources; it is also applicable to other contended resources in a multi-core processor system.
The embodiment of the present invention is not limited to computers; it is applicable to any other device with coordinated scheduling under resource contention.
The embodiment of the present invention is not limited to sequential scheduling for the purpose of improving performance; it is also applicable to other scenarios that use sequential scheduling as a method or means.
One of ordinary skill in the art will appreciate that all or part of the steps for implementing the methods of the above embodiments can be completed by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disc, or the like.
The thread scheduling method, thread scheduling device, and multi-core processor system provided by the present invention have been described in detail above. For one of ordinary skill in the art, changes may be made in the specific implementations and the scope of application according to the idea of the embodiments of the present invention. In summary, the content of this specification should not be construed as a limitation of the present invention.

Claims (10)

1. A thread scheduling method, characterised by comprising:
when a thread context switch occurs on a first processor core, adding the cache memory (cache) access rate, in the current time slice, of the thread currently run by the first processor core to the total cache access rate of the first processor core, and incrementing an accumulation count value by one;
obtaining the total cache access rate and accumulation count value of a second processor core that has a correspondence with the first processor core, wherein the correspondence is based on the first processor core and the second processor core accessing shared hardware;
computing the average cache access rate of the first processor core from the total cache access rate and accumulation count value of the first processor core, computing the average cache access rate of the second processor core from the total cache access rate and accumulation count value of the second processor core, and summing the average cache access rate of the first processor core and the average cache access rate of the second processor core as a first parameter value;
scanning the set of to-be-run threads in the ready state corresponding to the first processor core, and computing the sum of the cache access rate of the currently scanned thread in the last time slice and the cache access rate, in the last time slice, of the thread currently run by the second processor core, as a second parameter value;
if the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switching the currently run thread to the currently scanned thread.
2. The method according to claim 1, characterised in that said obtaining the total cache access rate and accumulation count value of the second processor core that has a correspondence with the first processor core comprises:
determining, according to the identity (ID) of the first processor core and a preset computation method, the second processor core that has a correspondence with the first processor core, or, determining, by looking up a processor core grouping table according to the ID of the first processor core, the second processor core that has a correspondence with the first processor core;
obtaining the total cache access rate and accumulation count value of the second processor core from the second processor core.
3. The method according to claim 1, characterised in that said computing the average cache access rate of the first processor core from the total cache access rate and accumulation count value of the first processor core, computing the average cache access rate of the second processor core from the total cache access rate and accumulation count value of the second processor core, and summing the average cache access rate of the first processor core and the average cache access rate of the second processor core as the first parameter value comprises:
dividing the total cache access rate of the first processor core by the accumulation count value of the first processor core to obtain the average cache access rate of the first processor core;
dividing the total cache access rate of the second processor core by the accumulation count value of the second processor core to obtain the average cache access rate of the second processor core;
adding the average cache access rate of the first processor core to the average cache access rate of the second processor core to obtain the first parameter value.
4. The method according to any one of claims 1 to 3, characterised in that the method further comprises:
if the difference between the first parameter value and the second parameter value is less than the preset value, scanning the next thread, and returning to the step of computing the sum of the cache access rate of the currently scanned thread in the last time slice and the cache access rate, in the last time slice, of the thread currently run by the second processor core, as the second parameter value.
5. The method according to claim 4, characterised in that:
after the first processor core completes the thread switch, the type identifier of the currently run thread is saved into the running-thread descriptor of the first processor core.
6. A thread scheduling device, characterised by comprising:
a first summing unit, configured to, when a thread context switch occurs on a first processor core, add the cache memory (cache) access rate of the thread currently run by the first processor core to the total cache access rate of the first processor core, and increment an accumulation count value by one;
a first obtaining unit, configured to obtain the total cache access rate and accumulation count value of a second processor core that has a correspondence with the first processor core, wherein the correspondence is based on the first processor core and the second processor core accessing shared hardware;
a first computing unit, configured to compute the average cache access rate of the first processor core from the total cache access rate and accumulation count value of the first processor core, compute the average cache access rate of the second processor core from the total cache access rate and accumulation count value of the second processor core, and sum the average cache access rate of the first processor core and the average cache access rate of the second processor core as a first parameter value;
a first scanning-and-computing unit, configured to scan the set of to-be-run threads in the ready state corresponding to the first processor core, and compute the sum of the cache access rate of the currently scanned thread in the last time slice and the cache access rate, in the last time slice, of the thread currently run by the second processor core, as a second parameter value;
a first processing unit, configured to, if the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently run thread to the currently scanned thread.
7. The device according to claim 6, characterised in that the first obtaining unit comprises:
a core determining unit, configured to determine, according to the identity (ID) of the first processor core and a preset computation method, the second processor core that has a correspondence with the first processor core, or, to determine, by looking up a processor core grouping table according to the ID of the first processor core, the second processor core that has a correspondence with the first processor core;
a value obtaining unit, configured to obtain the total cache access rate and accumulation count value of the second processor core from the second processor core.
8. The device according to claim 6 or 7, characterised in that the device further comprises:
a second processing unit, configured to, if the difference between the first parameter value and the second parameter value is less than the preset value, scan the next thread and return to the first scanning-and-computing unit;
a first updating unit, configured to, after the first processor core completes the thread switch, save the type identifier of the currently run thread into the running-thread descriptor of the first processor core.
9. A multi-core processor system, characterised by comprising:
a first processor core, a second processor core, and a shared hardware resource;
the first processor core and the second processor core access the shared hardware resource;
the first processor core is configured to: when a thread context switch occurs on the first processor core, add the cache memory (cache) access rate, in the current time slice, of the thread currently run by the first processor core to the total cache access rate, and increment an accumulation count value by one; obtain the total cache access rate and accumulation count value of the second processor core that has a correspondence with the first processor core; compute the average cache access rate of the first processor core from the total cache access rate and accumulation count value of the first processor core, compute the average cache access rate of the second processor core from the total cache access rate and accumulation count value of the second processor core, and sum the average cache access rate of the first processor core and the average cache access rate of the second processor core as a first parameter value; scan the set of to-be-run threads in the ready state corresponding to the first processor core, and compute the sum of the cache access rate of the currently scanned thread in the last time slice and the cache access rate, in the last time slice, of the thread currently run by the second processor core, as a second parameter value; and if the difference between the first parameter value and the second parameter value is greater than or equal to a preset value, switch the currently run thread to the currently scanned thread.
10. The system according to claim 9, characterised in that the shared hardware resource comprises: a shared storage device and/or a shared hardware cache;
when the first processor core and the second processor core belong respectively to different processors, the first processor core and the second processor core access the shared storage device;
or,
when the first processor core and the second processor core belong to the same processor, the first processor core and the second processor core access the shared storage device and/or the shared hardware cache.
CN201310134356.XA 2011-11-16 2011-11-16 A thread scheduling method, thread scheduling device and multi-core processor system Active CN103197977B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310134356.XA CN103197977B (en) 2011-11-16 2011-11-16 A thread scheduling method, thread scheduling device and multi-core processor system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310134356.XA CN103197977B (en) 2011-11-16 2011-11-16 A thread scheduling method, thread scheduling device and multi-core processor system
CN201110362773.0A CN102495762B (en) 2011-11-16 2011-11-16 Thread scheduling method, thread scheduling device and multi-core processor system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201110362773.0A Division CN102495762B (en) 2011-11-16 2011-11-16 Thread scheduling method, thread scheduling device and multi-core processor system

Publications (2)

Publication Number Publication Date
CN103197977A CN103197977A (en) 2013-07-10
CN103197977B true CN103197977B (en) 2016-09-28

Family

ID=48720565

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310134356.XA Active CN103197977B (en) A thread scheduling method, thread scheduling device and multi-core processor system

Country Status (1)

Country Link
CN (1) CN103197977B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1848095A (en) * 2004-12-29 2006-10-18 英特尔公司 Fair sharing of a cache in a multi-core/multi-threaded processor by dynamically partitioning of the cache
CN101256515A (en) * 2008-03-11 2008-09-03 浙江大学 Method for implementing load equalization of multicore processor operating system
CN101286138A (en) * 2008-06-03 2008-10-15 浙江大学 Method for multithread sharing multi-core processor secondary buffer memory based on data classification
CN101685408A (en) * 2008-09-24 2010-03-31 国际商业机器公司 Method and device for accessing shared data structure by multiple threads in parallel


Also Published As

Publication number Publication date
CN103197977A (en) 2013-07-10


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant