CN101751295A - Method for realizing inter-core thread migration under multi-core architecture - Google Patents
Abstract
The invention belongs to the field of multi-core memory-hierarchy design and provides a method for realizing inter-core thread migration under a multi-core architecture. The method comprises the following steps: partitioning each Cache data block into equal divisions, setting up a fault mapping table and a companion mapping table, performing the inter-core thread migration, and completing the migration of all Cache data blocks to the accessing core, thereby migrating the whole thread. The benefit of the method is that, in a multi-core environment, thread migration is realized by combining the fault mapping and the companion mapping of the Cache data blocks. Retaining the Cache data blocks replaced out of the accessing core, rather than discarding them, improves the Cache hit ratio. The method reduces Cache access latency and, compared with previously proposed methods that copy Cache data blocks, makes effective use of Cache capacity while preserving the uniqueness of each data block in the Cache (only one copy of a memory line resides in the Cache).
Description
Technical field
The present invention relates to the field of multi-core memory-hierarchy design, and in particular to a method for realizing inter-core thread migration under a multi-core architecture.
Background technology
With the continuous progress of technology, Cache capacity keeps growing, especially that of the last-level Cache. Variations in the manufacturing process tend to increase the error rate of the SRAM cells that make up a Cache. Many remedies have been proposed; error-correcting codes (ECC) are one of them. ECC substitutes redundant cells for faulty ones, but once the redundant cells are used up, the remaining faulty cells can only be abandoned. This approach usually repairs only a single erroneous cell, so correcting a large number of errors with it is impractical: repairing multiple faulty cells with ECC requires considerable area overhead and computational complexity. Other approaches also exist, such as shrinking the usable Cache capacity by disabling faulty cells, word-disable strategies, bit-fix strategies, and so on.
Latency is a key factor in thread migration, and many methods have been proposed to alleviate memory-access latency, particularly for the last-level Cache. Methods that reduce access latency include data migration and data replication, but data migration is effective mainly in the single-core case. In the multi-core case, data replication neither uses Cache capacity effectively nor guarantees the uniqueness of a data block (only one copy of a memory line in the Cache). In a chip multiprocessor (CMP) with a distributed shared L2 non-uniform-access Cache, the access latency is especially pronounced: banks near the accessing core have the shortest access time (lowest latency), while banks far from the accessing core have the longest access time (highest latency).
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for realizing inter-core thread migration under a multi-core architecture.
To solve the above technical problem, the method provided by the invention comprises the following steps:
(1) Partitioning the Cache data block:
Each Cache data block is divided into k equal divisions div; the size of each division div is n, so if the size of the Cache data block is c, then c = nk;
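The partitioning rule above is simple arithmetic; a minimal sketch (function and variable names are illustrative, not from the patent):

```python
# Sketch of step (1): partitioning a Cache data block of size c into
# k equal divisions, each of size n, so that c = n * k.

def partition_block(block_size: int, k: int) -> int:
    """Return the division size n such that block_size = n * k."""
    if block_size % k != 0:
        raise ValueError("block size must divide evenly into k divisions")
    return block_size // k

# The embodiment's example: a 32K block split into 4 divisions of 8K each.
print(partition_block(32 * 1024, 4))  # 8192
```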
(2) Setting up the fault mapping table and the companion mapping table:
Each Cache data block is given a fault mapping: a 0 indicates that a division div is fault-free and can be used to store data, while a 1 indicates that the division div is faulty and cannot be used to store data. Each Cache data block is also given a companion bit b indicating whether the block has a companion block: b = 0 means the block has no companion block, and b = 1 means it does.
If, within one Cache set, the XOR of the fault mapping of one Cache data block with the fault mapping of another Cache data block is all 1s (i.e. the two mappings are complementary), the two blocks stand in a companion relation; that is, the other Cache data block is the companion block of this one.
A companion-block determiner shared by all cores is provided, used to judge whether the companion bit b is 0; each Cache set is provided with a finder, used to locate the companion block.
Each Cache set is provided with a companion mapping table; each Cache data block needs log2(n) bits to represent its companion index, where n here denotes the associativity of the Cache. If, within a Cache set, one faulty Cache data block can be paired with another faulty Cache data block so as to form one Cache data block usable for storing information, the two faulty blocks exchange their indices.
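The companion test in step (2) can be sketched in software as follows. This is a minimal illustration, assuming the complementary-fault-map rule stated above (XOR of the two k-bit fault maps equals all 1s, as in the 1010/0101 example of the embodiment); all names are illustrative:

```python
# Sketch of the companion test: two faulty blocks in the same Cache set are
# companions when their k-bit fault maps are complementary, i.e. their XOR
# is all ones (bit 1 = faulty division, bit 0 = usable division).

def are_companions(fault_a: int, fault_b: int, k: int) -> bool:
    all_ones = (1 << k) - 1
    return (fault_a ^ fault_b) == all_ones

def find_companion(fault_maps: list[int], idx: int, k: int):
    """Return the index of a companion for block idx within one set, or None."""
    for j, fm in enumerate(fault_maps):
        if j != idx and are_companions(fault_maps[idx], fm, k):
            return j
    return None

# Worked example from the description: 1010 and 0101 XOR to 1111.
maps = [0b1010, 0b0011, 0b0101]
print(find_companion(maps, 0, 4))  # 2
```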
(3) To reduce memory-access latency, inter-core thread migration is performed. The process consists mainly of the following steps:
The migration core is the core from which the thread is to be migrated, and the accessing core is the core that accesses the thread;
Step 1: The companion-block determiner judges whether the companion bit b of the Cache data block to be migrated on the migration core is 0. If b = 0, go to Step 2; if not (i.e. b = 1), go to Step 5;
Step 2: The companion-block determiner judges whether the companion bit b of the Cache data block to be replaced out of the accessing core is 0. If b = 0, go to Step 3; if b = 1, go to Step 4;
Step 3: The thread-migration controller directly migrates the Cache data block onto the location of the block replaced out of the accessing core, and at the same time fills the replaced block into the location vacated by the migrated block on the migration core;
Step 4: The finder of the Cache set first uses the companion index of the Cache data block on the accessing core to locate its companion block. Then, according to the fault mapping and companion mapping of the pair of blocks replaced out of the accessing core, the thread-migration controller migrates the Cache data block into the valid positions of the two replaced blocks, and fills the valid data of the two replaced blocks into the location vacated by the migrated block on the migration core;
Step 5: The finder of the Cache set first locates the companion block according to the companion index of the Cache data block on the migration core, while the companion-block determiner judges the companion bit b of the block to be replaced out of the accessing core. If b = 0, the thread-migration controller migrates the valid data of the two companion blocks to the accessing core, and then, according to the fault mapping and companion mapping of the two migrated blocks, fills the block replaced out of the accessing core into the two vacated locations on the migration core. If b = 1, the controller migrates the valid data of the two companion blocks into the valid positions of the two blocks on the accessing core, and then fills the valid data of the two replaced blocks into the two vacated locations on the migration core.
The above five steps are executed repeatedly until all Cache data blocks have been migrated to the accessing core, thereby completing the migration of the whole thread.
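The five-step procedure above reduces to a case analysis on the two companion bits. In the patent this decision is made in hardware by the shared companion-block determiner; the following is only an illustrative software sketch with hypothetical names:

```python
# Sketch of the five-step dispatch in step (3), driven by the companion bit
# of the migrating block (migration core) and of the evicted block
# (accessing core).

def migration_case(b_migrating: int, b_evicted: int) -> str:
    if b_migrating == 0 and b_evicted == 0:
        return "step3: direct swap of two whole blocks"
    if b_migrating == 0 and b_evicted == 1:
        return "step4: place block into the valid slots of the evicted pair"
    # b_migrating == 1: the migrating block is spread over a companion pair
    if b_evicted == 0:
        return "step5a: gather valid data of the pair into the whole block's slot"
    return "step5b: move valid data between the two companion pairs"

for bm in (0, 1):
    for be in (0, 1):
        print(bm, be, "->", migration_case(bm, be))
```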
Compared with the background art, the beneficial effects of the present invention are:
In a multi-core environment, thread migration is realized by combining the fault mapping and the companion mapping of Cache data blocks. Retaining the Cache data blocks replaced out of the accessing core, rather than discarding them, improves the Cache hit ratio. The method reduces Cache access latency and, compared with previously proposed methods that copy Cache data blocks, makes effective use of Cache capacity while preserving the uniqueness of each data block in the Cache.
Description of drawings
Fig. 1 shows the process of migrating a Cache data block.
Embodiment
The detailed process and an example of this method are as follows:
The method for realizing inter-core thread migration under a multi-core architecture comprises the following steps:
(1) Partitioning the Cache data block:
Each Cache data block is divided into k equal divisions div; the size of each division div is n, so if the size of the Cache data block is c, then c = nk. Suppose the Cache data block size is 32K and it is divided into 4 divisions; then each division is 8K in size.
(2) The fault mapping table and the companion mapping table:
Each Cache data block is equipped with a fault mapping: if the fault bit of a division div is 0, the division can store data; if the fault bit is 1, it cannot. For example, if the fault mapping of a Cache data block is 1010, the second and fourth divisions can be used to store data, while the first and third cannot. Each Cache data block is also given a companion bit b indicating whether the block has a companion block. Within a Cache set, the fault mapping of a block is XORed with the fault mappings of all other blocks in that set; if there exists a block for which the XOR result is all 1s, b is set to 1, indicating that this block has a companion block; if no result is all 1s, b is set to 0, indicating that it has none.
If, within one Cache set, the XOR of the fault mappings of two Cache data blocks is all 1s, the two blocks stand in a companion relation, i.e. each is the companion block of the other. For example, if one block's fault mapping is 1010 and another block in the same set has the fault mapping 0101, the XOR of the two mappings is 1111, so the second Cache data block is the companion block of the first.
A companion-block determiner shared by all cores is provided, used to judge whether the companion bit b is 0; each Cache set is provided with a finder, used to locate the companion block.
Each Cache set is provided with a companion mapping table; each Cache data block needs log2(n) bits to represent its companion index, where n here denotes the associativity of the Cache. If, within a Cache set, a faulty Cache data block can be paired with another faulty block (i.e. each is the other's companion block) so as to form one Cache data block usable for storing information, the two faulty blocks exchange their indices.
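The index exchange described above can be sketched with a per-set table in which each entry records the way of its partner. This data structure is purely illustrative (the patent only states that paired blocks exchange indices):

```python
# Sketch of companion-index pairing: when two faulty blocks in one set are
# found to be complementary, their companion-mapping-table entries are
# exchanged so that each points at its partner's way within the set.

def pair_blocks(companion_index: list[int], i: int, j: int) -> None:
    """Record blocks i and j of one set as companions of each other."""
    companion_index[i], companion_index[j] = j, i

# Four-way set; initially every block indexes itself (no companion).
comp = [0, 1, 2, 3]
pair_blocks(comp, 0, 2)   # ways 0 and 2 hold complementary faulty blocks
print(comp)  # [2, 1, 0, 3]
```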
(3) To reduce memory-access latency, inter-core thread migration is performed. The process consists mainly of the following steps:
The migration core is the core from which the thread is to be migrated, and the accessing core is the core that accesses the thread;
Step 1: The companion-block determiner judges whether the companion bit b of the Cache data block to be migrated on the migration core is 0. If b = 0, go to Step 2; if not (i.e. b = 1), go to Step 5;
Step 2: The companion-block determiner judges whether the companion bit b of the Cache data block to be replaced out of the accessing core is 0. If b = 0, go to Step 3; if b = 1, go to Step 4;
Step 3: The thread-migration controller directly migrates the Cache data block onto the location of the block replaced out of the accessing core, i.e. a fault-free block is moved onto a fault-free block location; at the same time, the block replaced out of the accessing core is filled into the location vacated by the migrated block on the migration core;
Step 4: The finder of the Cache set first locates the companion block via the companion mapping. Then, according to the fault mapping and companion mapping of the blocks replaced out of the accessing core, the thread-migration controller migrates the Cache data block into the valid positions of the two replaced blocks, and fills the valid data of the two replaced blocks into the location vacated by the migrated block on the migration core. This amounts, for example, to moving a fault-free Cache data block onto the valid positions of two faulty Cache data blocks;
Step 5: The finder of the Cache set first locates the companion block according to the companion index of the block on the migration core, while the companion-block determiner judges the companion bit b of the block to be replaced out of the accessing core. If b = 0, the thread-migration controller migrates the valid data of the two companion blocks to the accessing core, and then, according to the fault mapping and companion mapping of the two migrated blocks, fills the block replaced out of the accessing core into the two vacated locations on the migration core; this amounts to moving two faulty Cache data blocks onto a fault-free one. If b = 1, the controller migrates the valid data of the two companion blocks into the valid positions of the two blocks on the accessing core, and then fills the valid data of the two replaced blocks into the two vacated locations on the migration core; this amounts to moving two faulty Cache data blocks onto two faulty Cache data blocks.
The above five steps are executed repeatedly until all Cache data blocks have been migrated to the accessing core, thereby completing the migration of the whole thread.
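The data placement used in Steps 4 and 5 above — writing each division of a whole block into whichever member of a companion pair has a usable slot for it — can be sketched as follows. All names are illustrative, and the MSB-first bit order matches the 1010 example in the description:

```python
# Sketch of filling a companion pair: division i of a whole block goes to
# block a if a's fault bit for i is 0 (usable), otherwise to companion b,
# whose complementary fault map guarantees that slot is usable.

def scatter_to_pair(divisions, fault_a, fault_b, k):
    """Distribute k divisions across companion blocks a and b.
    Returns (contents_a, contents_b); None marks a faulty slot."""
    a = [None] * k
    b = [None] * k
    for i in range(k):
        bit = (fault_a >> (k - 1 - i)) & 1  # MSB-first, as in the 1010 example
        if bit == 0:
            a[i] = divisions[i]   # division i fits in block a
        else:
            b[i] = divisions[i]   # a is faulty here, use companion b
    return a, b

a, b = scatter_to_pair(["d0", "d1", "d2", "d3"], 0b1010, 0b0101, 4)
print(a)  # [None, 'd1', None, 'd3']
print(b)  # ['d0', None, 'd2', None]
```

Gathering the pair back into one whole block (Step 5 with b = 0) is the inverse operation: take each division from whichever block holds it.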
Claims (1)
1. A method for realizing inter-core thread migration under a multi-core architecture, comprising the following steps:
(1) Partitioning the Cache data block:
Each Cache data block is divided into k equal divisions div; the size of each division div is n, so if the size of the Cache data block is c, then c = nk;
(2) Setting up the fault mapping table and the companion mapping table:
Each Cache data block is provided with a fault mapping table and a companion bit b; the fault mapping table indicates whether each division div of the Cache data block can be used to store data, and the companion bit b indicates whether the block has a companion block. If, within one Cache set, the XOR of the fault mappings of two Cache data blocks is all 1s, the two blocks stand in a companion relation, i.e. the other Cache data block is the companion block of this one;
A companion-block determiner shared by all cores is provided, used to judge whether the companion bit b is 0;
Each Cache set is provided with a finder and a companion mapping table; the finder is used to locate the companion block, and the companion mapping table records the position of each block's companion block. If, within a Cache set, a faulty Cache data block can be paired with another faulty block so as to form one Cache data block usable for storing information, the two faulty blocks exchange their indices;
(3) To reduce memory-access latency, inter-core thread migration is performed, the process consisting mainly of the following steps:
The migration core is the core from which the thread is to be migrated, and the accessing core is the core that accesses the thread;
Step 1: the companion-block determiner judges whether the companion bit b of the Cache data block to be migrated on the migration core is 0; if b = 0, go to Step 2; if b = 1, go to Step 5;
Step 2: the companion-block determiner judges whether the companion bit b of the Cache data block to be replaced out of the accessing core is 0; if b = 0, go to Step 3; if b = 1, go to Step 4;
Step 3: the thread-migration controller directly migrates the Cache data block onto the location of the block replaced out of the accessing core, and at the same time fills the replaced block into the location vacated by the migrated block on the migration core;
Step 4: the finder of the Cache set first uses the companion index of the Cache data block on the accessing core to locate its companion block; then, according to the fault mapping and companion mapping of the pair of blocks replaced out of the accessing core, the thread-migration controller migrates the Cache data block into the valid positions of the two replaced blocks, and fills the valid data of the two replaced blocks into the location vacated by the migrated block on the migration core;
Step 5: the finder of the Cache set first locates the companion block according to the companion index of the Cache data block on the migration core, while the companion-block determiner judges the companion bit b of the block to be replaced out of the accessing core; if b = 0, the thread-migration controller migrates the valid data of the two companion blocks to the accessing core, and then, according to the fault mapping and companion mapping of the two migrated blocks, fills the block replaced out of the accessing core into the two vacated locations on the migration core; if b = 1, the controller migrates the valid data of the two companion blocks into the valid positions of the two blocks on the accessing core, and then fills the valid data of the two replaced blocks into the two vacated locations on the migration core;
The above five steps are executed repeatedly until all Cache data blocks have been migrated to the accessing core, thereby completing the migration of the whole thread.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN200910157107A (CN101751295B) | 2009-12-22 | 2009-12-22 | Method for realizing inter-core thread migration under multi-core architecture
Publications (2)
Publication Number | Publication Date
---|---
CN101751295A | 2010-06-23
CN101751295B | 2012-08-29
Legal Events
Date | Code | Title
---|---|---
| C06 / PB01 | Publication
| C10 / SE01 | Entry into substantive examination
| C14 / GR01 | Grant of patent (granted publication date: 2012-08-29)
| C17 / CF01 | Termination of patent right due to non-payment of annual fee (termination date: 2012-12-22)