CN103729309A - Directory-based Cache coherence method - Google Patents

Directory-based Cache coherence method

Info

Publication number
CN103729309A
CN103729309A (application CN201410017448.4A)
Authority
CN
China
Prior art keywords
cache
directory
memory
directory entry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410017448.4A
Other languages
Chinese (zh)
Other versions
CN103729309B (en)
Inventor
韩东涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201410017448.4A priority Critical patent/CN103729309B/en
Publication of CN103729309A publication Critical patent/CN103729309A/en
Application granted granted Critical
Publication of CN103729309B publication Critical patent/CN103729309B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a directory-based Cache coherence method. The method is implemented as follows: on the basis of a memory Cache, a two-level directory storage structure combining two protocols, a limited directory and a full-map directory, is set up; replacement between the memory level and the memory-Cache level is governed by a sharer-count-weighted pseudo-least-recently-used algorithm. Compared with the prior art, the method solves the excessive storage overhead of the full-map directory, the directory-entry overflow limitation of the limited directory, and the poor time efficiency of the chained directory; it is highly practical and easy to popularize.

Description

Directory-based Cache coherence method
Technical field
The present invention relates to the field of computer technology, and more specifically to a directory-based Cache coherence method.
Background art
In a multistage interconnection network, the cache directory records where copies of each cache line reside, so as to support cache coherence. Directory schemes differ mainly in how the directory information is maintained and what information the directory stores. The earliest directory scheme kept a single centralized record of all cache directories; such a central directory can supply all the information needed to guarantee coherence. Its capacity is therefore very large, and it must be searched associatively, much like the directory of a single cache. In a large-scale multiprocessor system, a central directory suffers from two drawbacks: contention and long search time. The distributed directory scheme was proposed by Censier and Feautrier. In a distributed directory, each memory module maintains its own directory, which records the state and presence information of every memory block; the state information is local, while the presence information indicates which caches hold a copy of the block. Directory schemes fall into three classes: full-map directories, limited directories, and chained directories. A full-map directory incurs excessive storage overhead, a limited directory is constrained by directory-entry overflow, and a chained directory suffers from poor time efficiency. The present invention therefore provides an improved directory-based Cache coherence method that combines a limited directory with a full-map directory to solve the above problems.
Summary of the invention
The technical task of the present invention is to remedy the deficiencies of the prior art by providing an improved directory-based Cache coherence method that is simple to operate and easy to implement.
The technical solution of the present invention is realized as follows. The directory-based Cache coherence method is implemented by the following process:
First, a two-level directory storage structure is set up, namely a full-map directory and a limited directory. The full-map directory stores data associated with every block in global memory, so that every cache in the system can simultaneously hold a copy of any data block; each of its directory entries contains N pointers, where N is the number of processors in the system. The limited directory differs from the full-map directory in that each of its directory entries contains only a fixed number of pointers.
Second, a sharer-count-weighted pseudo-least-recently-used algorithm is used between the memory level and the memory-Cache level: assuming each memory-level directory entry holds Q pointers, on replacement the algorithm is applied only to cache lines whose sharer count is less than Q; when the sharer counts of all cache lines in the memory Cache are greater than Q, the cache line with the fewest sharers is evicted from the memory Cache and the corresponding invalidation is performed.
In the two-level directory storage structure, each directory entry implemented with the full-map method has one presence bit per processor and one dirty bit: a presence bit indicates whether the block is present in the corresponding processor's cache; if the dirty bit is "1", then one and only one presence bit is "1", and only that processor may write to the block. In addition, each cache line has two state bits: one indicates whether the block is valid, and the other indicates whether the valid block may be written.
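As an illustrative sketch (not part of the patent), the two kinds of directory entry could be modeled as below. The values of N and Q and all class and method names are assumptions made for this example:

```python
# Illustrative model of the two-level directory entries described above.
# N (processors) and Q (pointers per limited entry) are assumed values;
# the class and method names are inventions for this sketch.

N = 8  # number of processors in the system (assumed)
Q = 4  # fixed number of pointers in a limited directory entry (assumed)

class FullMapEntry:
    """Full-map entry: one presence bit per processor, plus a dirty bit."""
    def __init__(self):
        self.presence = [False] * N  # one presence bit per processor's cache
        self.dirty = False           # "1" implies exactly one presence bit set

    def add_sharer(self, cpu):
        self.presence[cpu] = True

    def sharer_count(self):
        return sum(self.presence)

    def may_write(self, cpu):
        # Only the single owner may write when the dirty bit is set.
        return self.dirty and self.sharer_count() == 1 and self.presence[cpu]

class LimitedEntry:
    """Limited entry: at most Q sharer pointers, regardless of system size."""
    def __init__(self):
        self.pointers = []

    def add_sharer(self, cpu):
        if cpu in self.pointers:
            return True
        if len(self.pointers) < Q:
            self.pointers.append(cpu)
            return True
        return False  # entry would overflow

e = LimitedEntry()
added = [e.add_sharer(cpu) for cpu in range(5)]
print(added)  # [True, True, True, True, False]: the fifth sharer overflows
```

The full-map entry grows with the processor count N, while the limited entry stays at Q pointers, which is exactly the trade-off the two-level structure exploits.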
The sharer-count-weighted pseudo-least-recently-used algorithm proceeds as follows:
1) if the cache line is in the memory Cache, go to 2); otherwise go to 5);
2) read the data from the memory Cache;
3) if the requester is a new sharer, go to 4); otherwise go to 14);
4) update the Cache directory entry, then go to 14);
5) read the data from memory;
6) if the requester is a new sharer, go to 7); otherwise go to 9);
7) if the memory directory entry overflows, go to 8); otherwise go to 9);
8) record the overflowing entry;
9) if the Cache has a free directory entry, go to 10); otherwise go to 11);
10) add the data to the Cache and update the Cache directory entry according to the memory directory entry, then jump to 14);
11) if some directory entry in the Cache has a sharer count less than Q, go to 12); otherwise go to 13);
12) among the cache lines whose sharer count is less than Q, use the LRU algorithm to select one block to evict from the Cache, then go to 10);
13) select the cache line with the fewest sharers and perform the corresponding invalidation, then go to 10);
14) done.
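The numbered steps above can be sketched as one Python function. The data layout, the helper names, and the constants Q and CACHE_SLOTS are assumptions for illustration; the patent itself specifies only the control flow:

```python
# Hypothetical sketch of steps 1)-14) above. The dict/set layout and the
# constants Q and CACHE_SLOTS are assumptions, not taken from the patent.

Q = 2            # pointers per memory-level (limited) directory entry (assumed)
CACHE_SLOTS = 2  # directory entries the memory Cache can hold (assumed)

def access(line, node, cache_dir, mem_dir, lru, overflow_log):
    """One lookup of `line` by `node`, following steps 1)-14).

    cache_dir: memory-Cache full-map directory, line -> set of sharer nodes
    mem_dir:   memory-level limited directory,  line -> list of <= Q sharers
    lru:       lines of cache_dir from least to most recently used
    """
    if line in cache_dir:                          # 1) line in the memory Cache?
        # 2) read the data from the memory Cache (data path elided)
        if node not in cache_dir[line]:            # 3) a new sharer?
            cache_dir[line].add(node)              # 4) update the Cache entry
        lru.remove(line)
        lru.append(line)
        return                                     # 14) done
    # 5) read the data from memory (data path elided)
    sharers = mem_dir.setdefault(line, [])
    if node not in sharers:                        # 6) a new sharer?
        if len(sharers) >= Q:                      # 7) memory entry overflows?
            overflow_log.append(line)              # 8) record the overflowing entry
        else:
            sharers.append(node)
    if len(cache_dir) >= CACHE_SLOTS:              # 9) no free Cache entry
        few = [l for l in lru if len(cache_dir[l]) < Q]
        if few:                                    # 11) some entry has < Q sharers
            victim = few[0]                        # 12) LRU among those lines
        else:
            # 13) evict the line with the fewest sharers (invalidation elided)
            victim = min(lru, key=lambda l: len(cache_dir[l]))
        del cache_dir[victim]
        lru.remove(victim)
    # 10) add the line to the Cache, copying the memory directory entry
    cache_dir[line] = set(sharers) | {node}
    lru.append(line)                               # 14) done

cache_dir, mem_dir, lru, log = {}, {}, [], []
for line, node in [('A', 0), ('B', 1), ('C', 2)]:
    access(line, node, cache_dir, mem_dir, lru, log)
print(sorted(cache_dir))  # ['B', 'C'] ('A' was evicted as the pseudo-LRU victim)
```

Note that lines whose sharer count has reached Q are skipped by the pseudo-LRU pass in step 12), so heavily shared lines tend to stay in the memory Cache, as the description requires.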
Compared with the prior art, the beneficial effects of the present invention are as follows:
The directory-based Cache coherence method of the present invention combines a limited directory with a full-map directory into a two-level-directory Cache coherence method, solving the excessive storage overhead of the full-map directory, the directory-entry overflow limitation of the limited directory, and the poor time efficiency of the chained directory. A limited directory is used at the memory level; because a single limited-directory entry is small, the memory level, which needs a very large number of entries, saves considerable storage. At the same time, a full-map directory is used at the memory-Cache level; because the Cache capacity is small, the total space taken is modest even though a single full-map entry is large. The scheme not only solves the directory-entry overflow problem of the limited directory, but also keeps the most frequently used data and their directory entries in the memory-Cache level with its full-map directory, so the access speed of this two-level storage structure is comparable to that of its first-level store, the memory Cache; the method is practical and easy to popularize.
Brief description of the drawings
Figure 1 is a schematic diagram of the two-level directory storage structure.
Figure 2 is a flowchart of the sharer-count-weighted pseudo-least-recently-used algorithm of the present invention.
Detailed description
The directory-based Cache coherence method of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Figure 1, the present invention proposes a directory-based Cache coherence method whose implementation process is as follows.
A two-level directory storage structure is set up that combines a limited directory with a full-map directory, yielding a two-level-directory Cache coherence method that avoids the excessive storage overhead of a full-map directory. The full-map directory stores data associated with every block in global memory, so that every cache in the system can simultaneously hold a copy of any data block; each of its directory entries contains N pointers, where N is the number of processors in the system. The limited directory differs from the full-map directory in that, however large the system, each of its directory entries contains only a fixed number of pointers.
Between the memory level and the memory-Cache level, a replacement algorithm, the sharer-count-weighted pseudo-least-recently-used (pseudo-LRU) algorithm, is used. It guarantees that cache lines whose sharer count exceeds the memory level's limited-directory pointer count remain in the memory Cache, thereby maintaining cached-data coherence with relatively little storage space.
In the above two-level directory storage structure, each directory entry implemented with the full-map method has one presence bit per processor and one dirty bit: a presence bit records whether the block is present in the corresponding processor's cache; if the dirty bit is "1", then one and only one presence bit is "1", and only that processor may write to the block. Each cache line has two state bits: one indicates whether the block is valid, and the other indicates whether the valid block may be written. A cache coherence method must keep the state bits of the memory directory consistent with those of the caches. The limited directory scheme alleviates the problem of an oversized directory: if the number of simultaneously cached copies of any data block is bounded, the directory size cannot exceed a fixed constant.
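To make the storage argument concrete, here is a hypothetical back-of-the-envelope comparison (the processor count and pointer count are assumptions, not values from the patent). It shows why the limited directory suits the large memory level while the full-map directory is affordable only in the small memory Cache:

```python
# Back-of-the-envelope directory overhead per entry, full-map vs limited.
# All numbers are assumptions chosen for illustration only.
import math

N = 256  # processors in the system (assumed)
Q = 4    # pointers per limited directory entry (assumed)

full_map_bits = N                        # one presence bit per processor
pointer_bits = math.ceil(math.log2(N))   # bits needed to name one processor
limited_bits = Q * pointer_bits          # Q fixed-size pointers

print(full_map_bits, limited_bits)  # 256 32
```

With these assumed numbers a limited entry is 8x smaller than a full-map entry, which is why the memory level, with its huge entry count, uses the limited format.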
Under the sharer-count-weighted pseudo-LRU algorithm, cache lines whose sharer count exceeds the memory level's limited-directory pointer count are guaranteed to stay in the memory Cache. Suppose each memory-level directory entry holds Q pointers; on replacement, the LRU replacement algorithm is applied only to cache lines whose sharer count is less than Q. Only when the sharer counts of all cache lines in the memory Cache are greater than Q is the cache line with the fewest sharers evicted from the memory Cache and the corresponding invalidation performed. Directory-entry overflow of this kind can be avoided by sizing the memory Cache appropriately.
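The victim-selection rule stated in this paragraph can be sketched as follows; the function name, the toy sharer counts, and the value of Q are assumptions for illustration:

```python
# Victim selection under the sharer-count-weighted pseudo-LRU policy:
# among lines with fewer than Q sharers, evict the least recently used;
# only if every line has more than Q sharers, evict (and invalidate)
# the line with the fewest sharers.

Q = 4  # pointers per memory-level directory entry (assumed)

def pick_victim(lines_lru_first, sharer_count):
    """lines_lru_first lists resident lines, least recently used first.
    Returns (victim, needs_invalidation)."""
    eligible = [l for l in lines_lru_first if sharer_count[l] < Q]
    if eligible:
        return eligible[0], False       # ordinary pseudo-LRU replacement
    victim = min(lines_lru_first, key=lambda l: sharer_count[l])
    return victim, True                 # eviction plus invalidation

print(pick_victim(['A', 'B', 'C'], {'A': 6, 'B': 3, 'C': 5}))  # ('B', False)
print(pick_victim(['A', 'B', 'C'], {'A': 6, 'B': 5, 'C': 7}))  # ('B', True)
```

In the first call only B has fewer than Q sharers, so it is replaced normally; in the second, every line exceeds Q, so the least-shared line is evicted with an invalidation.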
As shown in Figure 2, the sharer-count-weighted pseudo-least-recently-used algorithm proceeds as follows:
1) If the cache line is in the memory Cache, go to 2); otherwise go to 5).
2) Read the data from the memory Cache.
3) If the requester is a new sharer, go to 4); otherwise go to 14).
4) Update the Cache directory entry, then go to 14).
5) Read the data from memory.
6) If the requester is a new sharer, go to 7); otherwise go to 9).
7) If the memory directory entry overflows, go to 8); otherwise go to 9).
8) Record the overflowing entry.
9) If the Cache has a free directory entry, go to 10); otherwise go to 11).
10) Add the data to the Cache and update the Cache directory entry according to the memory directory entry, then jump to 14).
11) If some directory entry in the Cache has a sharer count less than Q, go to 12); otherwise go to 13).
12) Among the cache lines whose sharer count is less than Q, use the LRU algorithm to select one block to evict from the Cache, then go to 10).
13) Select the cache line with the fewest sharers and perform the corresponding invalidation, then go to 10).
14) Done.
This completes the improved directory-based Cache coherence method. With respect to system performance, the sizes of the memory Cache and of its cache lines should be set reasonably, according to the granularity of the tasks, the timing constraints of the application, and the proportion of read and write operations, and a suitable replacement algorithm chosen, so that the memory Cache achieves a high hit rate. When a memory-level directory entry overflows, its information is copied into the memory-Cache level. Under the sharer-count-weighted replacement algorithm, that directory information remains in the memory Cache until its sharer count drops below the memory level's limited-directory pointer count Q, at which point it may be evicted. Thus the method not only solves the directory-entry overflow problem of the limited directory, but also keeps the most frequently used data and their directory entries in the memory-Cache level, which uses the full-map directory, so the access speed of this two-level storage structure is comparable to that of its first-level store, the memory Cache.
Of course, the present invention may be embodied in various other forms; those of ordinary skill in the art may make various corresponding changes and modifications in accordance with the present invention without departing from its spirit and essence, and all such changes and modifications shall fall within the scope of protection of the claims of the present invention.

Claims (3)

1. A directory-based Cache coherence method, characterized in that its implementation process is:
First, a two-level directory storage structure is set up, namely a full-map directory and a limited directory, wherein the full-map directory stores data associated with every block in global memory, so that every cache in the system can simultaneously hold a copy of any data block, each directory entry containing N pointers, N being the number of processors in the system; the limited directory differs from the full-map directory in that each of its directory entries contains only a fixed number of pointers;
Second, a sharer-count-weighted pseudo-least-recently-used algorithm is used between the memory level and the memory-Cache level: assuming each memory-level directory entry holds Q pointers, on replacement the algorithm is applied only to cache lines whose sharer count is less than Q; when the sharer counts of all cache lines in the memory Cache are greater than Q, the cache line with the fewest sharers is evicted from the memory Cache and the corresponding invalidation is performed.
2. The directory-based Cache coherence method according to claim 1, characterized in that: in the two-level directory storage structure, each directory entry implemented with the full-map method has one presence bit per processor and one dirty bit; a presence bit indicates whether the block is present in the corresponding processor's cache; if the dirty bit is "1", then one and only one presence bit is "1", and only that processor may write to the block; in addition, each cache line has two state bits: one indicates whether the block is valid, and the other indicates whether the valid block may be written.
3. The directory-based Cache coherence method according to claim 2, characterized in that the sharer-count-weighted pseudo-least-recently-used algorithm proceeds as follows:
1) if the cache line is in the memory Cache, go to 2); otherwise go to 5);
2) read the data from the memory Cache;
3) if the requester is a new sharer, go to 4); otherwise go to 14);
4) update the Cache directory entry, then go to 14);
5) read the data from memory;
6) if the requester is a new sharer, go to 7); otherwise go to 9);
7) if the memory directory entry overflows, go to 8); otherwise go to 9);
8) record the overflowing entry;
9) if the Cache has a free directory entry, go to 10); otherwise go to 11);
10) add the data to the Cache and update the Cache directory entry according to the memory directory entry, then jump to 14);
11) if some directory entry in the Cache has a sharer count less than Q, go to 12); otherwise go to 13);
12) among the cache lines whose sharer count is less than Q, use the LRU algorithm to select one block to evict from the Cache, then go to 10);
13) select the cache line with the fewest sharers and perform the corresponding invalidation, then go to 10);
14) done.
CN201410017448.4A 2014-01-15 2014-01-15 Directory-based Cache coherence method Active CN103729309B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410017448.4A CN103729309B (en) 2014-01-15 2014-01-15 Directory-based Cache coherence method


Publications (2)

Publication Number Publication Date
CN103729309A true CN103729309A (en) 2014-04-16
CN103729309B CN103729309B (en) 2017-06-30

Family

ID=50453390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410017448.4A Active CN103729309B (en) 2014-01-15 2014-01-15 Directory-based Cache coherence method

Country Status (1)

Country Link
CN (1) CN103729309B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133785A (en) * 2014-07-30 2014-11-05 浪潮集团有限公司 Method for achieving cache coherency of double-control storage server with mixed catalogs
CN104360982A (en) * 2014-11-21 2015-02-18 浪潮(北京)电子信息产业有限公司 Implementation method and system for host system directory structure based on reconfigurable chip technology
WO2016049807A1 (en) * 2014-09-29 2016-04-07 华为技术有限公司 Cache directory processing method and directory controller of multi-core processor system
CN106095725A (en) * 2016-05-31 2016-11-09 浪潮(北京)电子信息产业有限公司 Coherence directory construction method and system, and multiprocessor computer system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6016529A (en) * 1997-11-26 2000-01-18 Digital Equipment Corporation Memory allocation technique for maintaining an even distribution of cache page addresses within a data structure
CN102063407A (en) * 2010-12-24 2011-05-18 清华大学 Network sacrifice Cache for multi-core processor and data request method based on Cache
CN102708190A (en) * 2012-05-15 2012-10-03 浪潮电子信息产业股份有限公司 Directory cache method for node control chip in cache coherent non-uniform memory access (CC-NUMA) system
CN103049422A (en) * 2012-12-17 2013-04-17 浪潮电子信息产业股份有限公司 Method for building multi-processor node system with multiple cache consistency domains
US20130097385A1 (en) * 2011-10-18 2013-04-18 Advanced Micro Devices, Inc. Dual-granularity state tracking for directory-based cache coherence


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104133785A (en) * 2014-07-30 2014-11-05 浪潮集团有限公司 Method for achieving cache coherency of double-control storage server with mixed catalogs
CN104133785B (en) * 2014-07-30 2017-03-08 浪潮集团有限公司 Cache coherence implementation method for a dual-controller storage server using hybrid directories
WO2016049807A1 (en) * 2014-09-29 2016-04-07 华为技术有限公司 Cache directory processing method and directory controller of multi-core processor system
US10216634B2 (en) 2014-09-29 2019-02-26 Huawei Technologies Co., Ltd. Cache directory processing method for multi-core processor system, and directory controller
CN104360982A (en) * 2014-11-21 2015-02-18 浪潮(北京)电子信息产业有限公司 Implementation method and system for host system directory structure based on reconfigurable chip technology
WO2016078205A1 (en) * 2014-11-21 2016-05-26 浪潮(北京)电子信息产业有限公司 Directory structure implementation method and system for host system
CN104360982B (en) * 2014-11-21 2017-11-10 浪潮(北京)电子信息产业有限公司 Host-system directory structure implementation method and system based on reconfigurable chip technology
US9892042B2 (en) 2014-11-21 2018-02-13 Inspur (Beijing) Electronic Information Industry Co., Ltd. Method and system for implementing directory structure of host system
CN106095725A (en) * 2016-05-31 2016-11-09 浪潮(北京)电子信息产业有限公司 Coherence directory construction method and system, and multiprocessor computer system

Also Published As

Publication number Publication date
CN103729309B (en) 2017-06-30

Similar Documents

Publication Publication Date Title
TWI627536B (en) System and method for a shared cache with adaptive partitioning
CN103246616B Globally shared cache replacement method based on access frequency over long and short cycles
CN104115134B Method and system for managing access to complex data storage devices
CN107391391B Method, system and solid-state drive for implementing data copies in the FTL of a solid-state drive
US8443149B2 (en) Evicting data from a cache via a batch file
KR100978156B1 (en) Method, apparatus, system and computer readable recording medium for line swapping scheme to reduce back invalidations in a snoop filter
CN104317736B Multi-level cache implementation method for a distributed file system
CN105550155B (en) Snoop filter for multicomputer system and related snoop filtering method
CN103150136B (en) Implementation method of least recently used (LRU) policy in solid state drive (SSD)-based high-capacity cache
US10078649B2 (en) Pre-caching of relational database management system based on data retrieval patterns
US20110320720A1 (en) Cache Line Replacement In A Symmetric Multiprocessing Computer
KR20080021623A (en) Managing memory pages
JP2018520420A (en) Cache architecture and algorithm for hybrid object storage devices
CN105095113B Cache management method and system
CN107341114B (en) Directory management method, node controller and system
CN103729309A (en) Method for cataloging Cache consistency
CN104461932B (en) Directory cache management method for big data application
US7401185B2 (en) Buffered indexing to manage hierarchical tables
CN104778132A (en) Multi-core processor directory cache replacement method
US20170364442A1 (en) Method for accessing data visitor directory in multi-core system and device
CN107003932B (en) Cache directory processing method and directory controller of multi-core processor system
CN104182281A (en) Method for implementing register caches of GPGPU (general purpose graphics processing units)
US10331560B2 (en) Cache coherence in multi-compute-engine systems
US11182307B2 (en) Demoting data elements from cache using ghost cache statistics
CN111209227B (en) Data processing system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant