WO2017000124A1 - Method and device for merging entries in a directory - Google Patents

Method and device for merging entries in a directory

Info

Publication number
WO2017000124A1
WO2017000124A1 (PCT/CN2015/082672)
Authority
WO
WIPO (PCT)
Prior art keywords
entry
sharer
label
cache
directory
Prior art date
Application number
PCT/CN2015/082672
Other languages
English (en)
French (fr)
Inventor
方磊
顾雄礼
蔡卫光
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2015/082672 priority Critical patent/WO2017000124A1/zh
Priority to CN201580079604.2A priority patent/CN107533512B/zh
Publication of WO2017000124A1 publication Critical patent/WO2017000124A1/zh
Priority to US15/839,665 priority patent/US20180101475A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols
    • G06F12/0817Cache consistency protocols using directory methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols
    • G06F12/0817Cache consistency protocols using directory methods
    • G06F12/082Associative directories
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0808Multiuser, multiprocessor or multiprocessing cache systems with cache invalidating means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/12Replacement control
    • G06F12/121Replacement control using replacement algorithms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/084Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1016Performance improvement
    • G06F2212/1021Hit rate improvement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1041Resource optimization
    • G06F2212/1044Space efficiency improvement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/62Details of cache specific to multiprocessor cache arrangements

Definitions

  • The present invention relates to the field of computers, and in particular, to a method and device for merging entries in a directory and a method and device for accessing a directory.
  • In a multi-core processor, each processor core generally has a private cache. Each private cache can be accessed only by the processor core in which it resides, and a shared cache is also provided on the multi-core processor; the shared cache can be accessed by every processor core. After a processor core reads a cache block from the shared cache, it can create a copy of the cache block in its private cache and read and write that copy, so a cache block stored in the shared cache may have multiple copies in different private caches. When the copy of a cache block in any private cache is modified, the other private caches need to be notified to invalidate their copies of that cache block, so as to keep the same cache block consistent throughout the system.
  • Commonly used coherence protocols include directory-based coherence protocols, in which entries are used to record where each cache block is stored in each private cache.
  • When consistency processing is performed on the copies of a cache block, the entry corresponding to that cache block must be obtained in order to learn where the cache block is stored in each private cache.
  • Because the storage space of the directory is limited, the number of entries that can be stored is also limited; generally, a separate entry cannot be provided for every cache block.
  • The embodiments of the invention provide a method for merging entries in a directory, which can effectively improve the efficiency of directory use and reduce the impact on the cache system caused by entry replacement.
  • A first aspect of the embodiments of the present invention provides a method for merging entries in a directory, where the directory includes multiple entries, each entry includes an entry label and a sharer number, and the entry label is used to indicate a cache block. The method includes: determining N entries to be merged, where the cache block indicated by the entry label of each of the N entries belongs to a merge range, the merge range indicates 2^a cache blocks, and N and a are both positive integers; and merging the N entries into a first entry, where the entry label of the first entry indicates the 2^a cache blocks and the sharer number of the first entry includes the sharer number of each of the N entries.
  • In one implementation, before the merging of the N entries into the first entry, the method further includes determining whether a merge condition is met; the N entries are merged when the merge condition is met. The merge condition includes any one of the following: the directory does not include a second entry whose entry label is the same as the entry label of the first entry and indicates the 2^a cache blocks; or the directory includes the second entry and the entry label of any one of the N entries indicates two or more cache blocks, where the entry label of the second entry is the same as the entry label of the first entry and indicates the 2^a cache blocks; or the directory includes the second entry, the entry label of each of the N entries indicates one cache block, and N is greater than a preset threshold, where the entry label of the second entry is the same as the entry label of the first entry and indicates the 2^a cache blocks.
  • In one implementation, the method further includes deleting the second entry.
  • In one implementation, before deleting the second entry, the method further includes: determining whether the sharer number of the second entry is the same as the sharer number of the first entry; if the sharer number of the second entry is different from the sharer number of the first entry, before performing the action of deleting the second entry, obtaining a redundant sharer number in the second entry, where the redundant sharer number is a sharer number in the second entry that differs from the sharer numbers of the first entry; querying the 2^a cache blocks in the cache device corresponding to the redundant sharer number and determining a first cache block cached in that cache device; and performing one of the following two actions: invalidating the first cache block cached in the cache device corresponding to the redundant sharer number, or generating a third entry whose entry label indicates the first cache block and whose sharer number is the redundant sharer number.
  • In one implementation, the method further includes: determining whether the second entry includes two or more sharer numbers; and when the second entry includes two or more sharer numbers, merging the second entry and the first entry into a fourth entry, where the entry label of the fourth entry indicates the 2^a cache blocks and the sharer number of the fourth entry includes the sharer numbers of the second entry and the sharer numbers of the first entry.
  • In one implementation, the method further includes deleting the second entry if the second entry includes only one sharer number.
  • In one implementation, before deleting the second entry, the method further includes: determining whether the sharer number of the second entry is the same as the sharer number of the first entry; if the sharer number of the second entry is different from the sharer number of the first entry, before performing the action of deleting the second entry, obtaining a redundant sharer number in the second entry, where the redundant sharer number is a sharer number in the second entry that differs from the sharer numbers of the first entry; querying the 2^a cache blocks in the cache device corresponding to the redundant sharer number and determining a second cache block cached in that cache device; and performing one of the following two actions: invalidating the second cache block cached in the cache device corresponding to the redundant sharer number, or generating a fifth entry whose entry label indicates the second cache block and whose sharer number is the redundant sharer number.
  • A second aspect of the embodiments of the present invention provides a method for accessing a directory, where the directory includes multiple entries, each entry includes an entry label and a sharer number, the entry label is used to indicate a cache block, the multiple entries include a first entry, the entry label of the first entry indicates 2^a cache blocks, and a is a positive integer. The method includes: receiving a directory access request, where the directory access request carries the label of a cache block to be accessed; querying the directory according to the label of the cache block to be accessed to obtain a set of entries corresponding to the cache block to be accessed, where the set of entries includes all entries in the directory whose entry labels indicate the cache block to be accessed; and determining a query entry from the set of entries, where the query entry is the entry in the set whose entry label indicates the fewest cache blocks.
  • In one implementation, each entry in the directory further includes a management scope flag, where the management scope flag is used to indicate the number of cache blocks indicated by the entry label of that entry; determining the query entry from the set of entries includes determining the query entry according to the management scope flag of each entry in the set.
  • In one implementation, the directory access request further includes a visitor number, where the visitor number indicates the cache device that issued the directory access request; the method further includes: if the sharer number of the query entry is different from the visitor number, generating a first new entry, where the entry label of the first new entry indicates the cache block to be accessed and the sharer number of the first new entry is the visitor number.
  • In one implementation, the directory access request further includes a visitor number indicating the cache device that issued the directory access request, and an access type indicating whether the directory access request is a read request or a write request; the method further includes: if the access type indicates that the directory access request is a read request and the sharer number of the query entry does not include the visitor number, adding the visitor number to the sharer numbers of the query entry.
  • In one implementation, the method further includes: if the access type indicates that the directory access request is a write request, generating a second new entry, where the entry label of the second new entry indicates the cache block to be accessed and the sharer number of the second new entry is the visitor number, and notifying the cache devices corresponding to the sharer numbers of the query entry other than the visitor number to invalidate the cache block to be accessed.
  • A third aspect of the embodiments of the present invention provides an apparatus for merging entries in a directory, where the directory includes multiple entries, each entry includes an entry label and a sharer number, and the entry label is used to indicate a cache block. The apparatus includes: a confirmation module, configured to determine N entries to be merged, where the cache block indicated by the entry label of each of the N entries belongs to a merge range, the merge range indicates 2^a cache blocks, and N and a are both positive integers; and a processing module, configured to merge the N entries into a first entry, where the entry label of the first entry indicates the 2^a cache blocks and the sharer number of the first entry includes the sharer numbers of the N entries.
  • In one implementation, the processing module is further configured to determine, before merging the N entries into the first entry, whether a merge condition is met, and to merge the N entries when the merge condition is met. The merge condition includes any one of the following: the directory does not include a second entry whose entry label is the same as the entry label of the first entry and indicates the 2^a cache blocks; or the directory includes the second entry and the entry label of any one of the N entries indicates two or more cache blocks, where the entry label of the second entry is the same as the entry label of the first entry and indicates the 2^a cache blocks; or the directory includes the second entry, the entry label of each of the N entries indicates one cache block, and N is greater than a preset threshold, where the entry label of the second entry is the same as the entry label of the first entry and indicates the 2^a cache blocks.
  • In one implementation, the processing module is further configured to delete the second entry.
  • In one implementation, the processing module is further configured to: determine, before deleting the second entry, whether the sharer number of the second entry is the same as the sharer number of the first entry; if the sharer number of the second entry is the same as the sharer number of the first entry, perform the action of deleting the second entry; if the sharer number of the second entry is different from the sharer number of the first entry, before performing the action of deleting the second entry, obtain a redundant sharer number in the second entry, where the redundant sharer number is a sharer number in the second entry that differs from the sharer numbers of the first entry, query the 2^a cache blocks in the cache device corresponding to the redundant sharer number, and determine a first cache block cached in that cache device; and perform one of the following two actions: invalidate the first cache block cached in the cache device corresponding to the redundant sharer number, or generate a third entry whose entry label indicates the first cache block and whose sharer number is the redundant sharer number.
  • In one implementation, the processing module is further configured to: when the directory includes the second entry and the first entry has been obtained by merging, determine whether the second entry includes two or more sharer numbers, and when the second entry includes two or more sharer numbers, merge the second entry and the first entry into a fourth entry, where the entry label of the fourth entry indicates the 2^a cache blocks and the sharer number of the fourth entry includes the sharer numbers of the second entry and the sharer numbers of the first entry.
  • In one implementation, the processing module is further configured to delete the second entry if the second entry includes only one sharer number.
  • In one implementation, the processing module is further configured to: determine, before deleting the second entry, whether the sharer number of the second entry is the same as the sharer number of the first entry; if the sharer number of the second entry is different from the sharer number of the first entry, before performing the action of deleting the second entry, obtain a redundant sharer number in the second entry, where the redundant sharer number is a sharer number in the second entry that differs from the sharer numbers of the first entry, query the 2^a cache blocks in the cache device corresponding to the redundant sharer number, and determine a second cache block cached in that cache device; and perform one of the following two actions: invalidate the second cache block cached in the cache device corresponding to the redundant sharer number, or generate a fifth entry whose entry label indicates the second cache block and whose sharer number is the redundant sharer number.
  • In one implementation, the apparatus is an application specific integrated circuit (ASIC) or a field programmable gate array (FPGA).
  • A fourth aspect of the embodiments of the present invention provides an apparatus for accessing a directory, where the directory includes multiple entries, each entry includes an entry label and a sharer number, the entry label is used to indicate a cache block, the multiple entries include a first entry, the entry label of the first entry indicates 2^a cache blocks, and a is a positive integer. The apparatus includes: a receiving module, configured to receive a directory access request, where the directory access request carries the label of a cache block to be accessed; and a processing module, configured to query the directory according to the label of the cache block to be accessed to obtain a set of entries corresponding to the cache block to be accessed, where the set of entries includes all entries in the directory whose entry labels indicate the cache block to be accessed, and to determine a query entry from the set of entries, where the query entry is the entry in the set whose entry label indicates the fewest cache blocks.
  • In one implementation, each entry in the directory further includes a management scope flag, where the management scope flag is used to indicate the number of cache blocks indicated by the entry label of that entry; the determining, by the processing module, of the query entry from the set of entries includes determining the query entry according to the management scope flag of each entry in the set.
  • In one implementation, the directory access request further includes a visitor number indicating the cache device that issued the directory access request; the processing module is further configured to: if the sharer number of the query entry is different from the visitor number, generate a first new entry, where the entry label of the first new entry indicates the cache block to be accessed and the sharer number of the first new entry is the visitor number.
  • In one implementation, the directory access request further includes a visitor number indicating the cache device that issued the directory access request, and an access type indicating whether the directory access request is a read request or a write request; the processing module is further configured to: if the access type indicates that the directory access request is a read request and the sharer number of the query entry does not include the visitor number, add the visitor number to the sharer numbers of the query entry.
  • In one implementation, the processing module is further configured to: if the access type indicates that the directory access request is a write request, generate a second new entry, where the entry label of the second new entry indicates the cache block to be accessed and the sharer number of the second new entry is the visitor number, and notify the cache devices corresponding to the sharer numbers of the query entry other than the visitor number to invalidate the cache block to be accessed.
  • In one implementation, the apparatus is an application specific integrated circuit (ASIC) or a field programmable gate array (FPGA).
  • A fifth aspect of the embodiments of the present invention provides a directory, including: a block entry, where the block entry includes a first entry label and a first sharer number; an area entry, where the area entry includes a second entry label and a second sharer number; and a super-area entry, where the super-area entry includes a third entry label and a third sharer number. The first entry label indicates one cache block, the second entry label indicates 2^n cache blocks, the third entry label indicates 2^(n+m) cache blocks, and n and m are both positive integers.
  • In one implementation, the block entry further includes a first management scope flag, the area entry further includes a second management scope flag, and the super-area entry further includes a third management scope flag, where the first management scope flag is used to indicate the number of cache blocks indicated by the first entry label, the second management scope flag is used to indicate the number of cache blocks indicated by the second entry label, and the third management scope flag is used to indicate the number of cache blocks indicated by the third entry label.
  • A sixth aspect of the embodiments of the present invention provides a storage medium for storing the directory according to the fifth aspect or the first implementation of the fifth aspect of the embodiments of the present invention.
  • A seventh aspect of the embodiments of the present invention provides a directory cache, including the storage medium according to the sixth aspect, the apparatus for merging entries in a directory according to the third aspect or any implementation of the third aspect, the apparatus for accessing a directory according to the fourth aspect or any implementation of the fourth aspect, and a bus; the storage medium, the apparatus for merging entries in the directory, and the apparatus for accessing the directory establish communication connections through the bus.
  • With the foregoing methods, apparatuses, directory, and storage medium, the entries of the directory can be effectively merged, the storage space of the directory is saved, and the extra overhead of invalidating the cache blocks managed by some entries after the directory storage space reaches its upper limit is avoided as far as possible.
  • FIG. 1 is a schematic diagram of a shared cache architecture of a multi-core processor according to an embodiment of the present invention;
  • FIG. 2 is a schematic structural diagram of a directory to which Embodiment 1 of the present invention is applied;
  • FIG. 3 is a schematic structural diagram of another directory to which Embodiment 1 of the present invention is applied;
  • FIG. 4 is a schematic flowchart of a method for merging entries according to Method Embodiment 1 of the present invention;
  • FIG. 5 is a schematic structural diagram of a device for merging entries in a directory according to Device Embodiment 2 of the present invention;
  • FIG. 6 is a schematic flowchart of a method for accessing a directory according to Method Embodiment 2 of the present invention;
  • FIG. 7 is a schematic structural diagram of a device for accessing a directory according to Device Embodiment 3 of the present invention;
  • FIG. 8 is a schematic structural diagram of a directory cache device according to Device Embodiment 4 of the present invention.
  • The term "processor core" generally refers to one or more processing units that perform the data processing tasks of a multi-core processor chip, and may also be referred to as a processor core or processing core; it may also be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), or a field programmable gate array (FPGA).
  • The term "directory cache" (Directory Cache) generally refers to a device in a directory-based cache coherence (Directory Based Cache Coherence) system that has the functions of storing the directory and processing entries.
  • the term "entry”, or “directory entry” or “directory entry” generally refers to a directory-based cache coherency system stored in the directory high speed An entry in the cache (English full name: Entry).
  • Each entry records one or more entry labels, each entry label indicates a cache block, and each entry also records one or more sharer numbers, each sharer number indicating a processor node or Private cache.
  • The term "tag" refers to the tag of a cache block.
  • Each cache block in the shared cache architecture has a unique tag, through which the cache block can be identified.
  • For example, suppose the address of a cache block in a cache device has 12 bits: the 12-bit address can be indexed to a cache line (Cache Line), and a cache block can then be located through its 4-bit tag.
  • the term "sharer number" is used to indicate a cache device.
  • the sharer number in an entry is also the cache device corresponding to the cache block indicated by the entry label of the entry. In fact, due to the entry.
  • the correspondence between the cache block and the cache device recorded in the medium is not completely accurate. For example, there are entries 0000 to 1111-1, 2, but in fact, the cache devices No. 1 and No. 2 do not necessarily store all of the tags 0000 to 1111. Cache block.
  • the term "the cache block indicated by the entry label belongs to the scope” indicates that the cache block belongs to 2 a consecutive cache blocks, that is, the label of the cache block belongs to the label of 2 a consecutive cache blocks, where
  • the "belonging" includes the endpoints of the two ends of the scope, for example, the cache block A belongs to the merge range 0000 to 1111, that is, the label of the cache block A may be any one of 0000 to 1111 and may be 0000 or 1111.
  • As shown in FIG. 1, a multi-core processor generally has multiple nodes, such as node 0 and node 1 in FIG. 1. Each node includes a processor or processor core, and each node has its own cache device, that is, a private cache; the cache device of a node can be accessed only by that node's processor, which reads and writes data from it.
  • The data in each cache device is derived from the shared data cache, and the shared data cache can be accessed by every cache device. Therefore, cache block A stored in the shared data cache may be read by node 0, node 1, and node N and stored in their cache devices, so cache block A has three copies in the entire shared cache architecture.
  • The directory cache is used in the shared cache architecture to ensure consistency between the copies of each cache block. Entries are recorded in the directory cache, and an entry records the tags of one or more cache blocks and the sharer numbers of those cache blocks, that is, the numbers of the nodes where copies of the cache blocks are located.
  • When a node reads or writes a cache block in its cache device, it needs to access the entry recorded in the directory cache to obtain the numbers of the nodes where copies of that cache block are located, so as to keep the copies of the cache block to be read or written consistent.
  • In addition to storing entries, the directory cache generally has a certain processing capability for entries; for example, it searches for the corresponding entry according to a directory access request, processes entries when the storage capacity for entries in the directory cache reaches its upper limit, and notifies the corresponding cache devices when an entry is modified.
  • Embodiment 1 provides a directory for the foregoing shared cache architecture.
  • The directory 100 includes: a block entry 102, which includes an entry label 1022 and a sharer number 1024; an area entry 104, which includes an entry label 1042 and a sharer number 1044; and a super-area entry 106, which includes an entry label 1062 and a sharer number 1064.
  • The entry label 1022 indicates one cache block, the entry label 1042 indicates 2^n cache blocks, the entry label 1062 indicates 2^(n+m) cache blocks, and n and m are positive integers; that is, the number of cache blocks managed by the super-area entry 106 may be 2^m times that managed by the area entry 104.
  • In practice, the super-area entry 106 may include multiple kinds of super-area entries, for example a super-area entry whose entry label indicates 2^(n+1) cache blocks, a super-area entry whose entry label indicates 2^(n+2) cache blocks, a super-area entry whose entry label indicates 2^(n+3) cache blocks, and so on; assume that the super-area entry managing the largest number of cache blocks manages 2^(n+L) cache blocks.
  • The sharer number 1024, the sharer number 1044, and the sharer number 1064 may each include the numbers of one or more cache devices, and each cache device number indicates a node or cache device in the architecture. The block entry 102 manages cache block A in FIG. 1: the entry label 1022 is the tag of cache block A, and the sharer number 1024 includes the number of node 0, the number of node 1, and the number of node N.
  • Optionally, the block entry 102 further includes a management scope flag 1026, the area entry 104 further includes a management scope flag 1046, and the super-area entry 106 further includes a management scope flag 1066; the management scope flag 1026 is used to indicate the number of cache blocks indicated by the entry label 1022, the management scope flag 1046 is used to indicate the number of cache blocks indicated by the entry label 1042, and the management scope flag 1066 is used to indicate the number of cache blocks indicated by the entry label 1062.
  • Optionally, the entry label of an entry may not indicate the cache blocks directly but instead indicate a start address, with the management scope flag then determining the range of cache blocks managed by the entry. For example, if an entry has an entry label of 0011 and a management scope flag of 0010, and the management scope flag indicates that the entry manages four cache blocks, the entry manages the four cache blocks whose tags are 0011, 0100, 0101, and 0110. Alternatively, the entry label may indicate only the high bits of the cache block tags; if the management scope flag indicates that the entry manages four cache blocks and the high bits of their tags are 00, the tags of the four cache blocks are 0000, 0001, 0010, and 0011.
  • The bit width of the entry label should be the same for every entry, and the bit width of the management scope flag should also be the same for every entry. Taking the previous example in which the largest super-area entry manages 2^(n+L) cache blocks, the management scope flag requires at least log2(L+2)+1 bits.
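  • The following Python sketch models an entry as described above (entry label as a start tag, management scope flag as the number of managed cache blocks, and a set of sharer numbers) under the "start address" interpretation; the class name, field names, and the choice of storing the managed-block count directly are illustrative assumptions rather than the patent's required encoding.

```python
from dataclasses import dataclass, field

@dataclass
class DirectoryEntry:
    label: int                                   # entry label: tag of the first managed cache block
    scope: int = 1                               # management scope: number of managed cache blocks
    sharers: set = field(default_factory=set)    # sharer numbers (cache device numbers)

    def managed_tags(self) -> range:
        """Tags of all cache blocks managed by this entry."""
        return range(self.label, self.label + self.scope)

# Example from the text: entry label 0011 with a scope of four cache blocks
# manages the blocks tagged 0011, 0100, 0101 and 0110.
e = DirectoryEntry(label=0b0011, scope=4, sharers={1, 2})
assert list(e.managed_tags()) == [0b0011, 0b0100, 0b0101, 0b0110]
```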
  • The foregoing provides a directory that offers entries with different management ranges of cache blocks: entries that manage a large number of cache blocks save the storage space of the directory, entries with a small management range improve the accuracy of the directory, and combining entries with multiple management ranges improves the use efficiency of the directory and saves the overhead of the directory-based coherence protocol.
  • An embodiment of the present invention provides a storage medium for storing any of the directories provided in Embodiment 1.
  • The storage medium may be a RAM, a ROM, an EEPROM, a disk storage medium, a solid state drive (Solid State Drive), or another storage medium.
  • The foregoing provides a storage medium for storing a directory. The directory stored in the storage medium provides entries with different management ranges of cache blocks: entries with a large management range save the storage space of the directory, entries with a small management range improve the accuracy of the directory, and the combination of entries with multiple management ranges improves the use efficiency of the directory, saves the overhead of directory-based coherence, and improves the efficiency of a multi-core processor chip that uses the storage medium.
  • Method Embodiment 1 provides a method for merging entries in a directory. The method can be applied to the directory of any of the alternatives in Embodiment 1, that is, each entry in the directory includes an entry label and a sharer number, and the entry label is used to indicate a cache block.
  • A schematic flowchart of the method is shown in FIG. 4 and includes the following steps:
  • Step 202: Determine N entries to be merged, where the cache block indicated by the entry label of each of the N entries belongs to a merge range, the merge range indicates 2^a cache blocks, and N and a are both positive integers.
  • Step 204: Merge the N entries into a first entry, where the entry label of the first entry indicates the 2^a cache blocks and the sharer number of the first entry includes the sharer number of each of the N entries.
  • Optionally, the sharer number of the first entry merged in step 204 is simply the common sharer number of the N entries. For example, the directory includes four entries, the entry label of each entry indicates one or more cache blocks, the tags of all the cache blocks indicated by the four entries are in the range 0000 to 0011 (a is equal to 2), the tags of the cache blocks managed by the four entries are 0000, 0001, 0010, and 0011, and the sharer numbers of the four entries are the same, namely sharer number 1, indicating the cache device numbered 1. The four entries are merged into the first entry, whose entry label indicates 0000 to 0011 and whose sharer number indicates cache device 1.
  • Notation: 0000-1 denotes an entry whose entry label is 0000, that is, it indicates the cache block with tag 0000, and whose sharer number indicates cache device 1; 0000 to 1111-1,2 denotes an entry whose entry label indicates the 16 cache blocks with tags 0000 through 1111 and whose sharer numbers indicate cache device 1 and cache device 2. Entries elsewhere in this document follow the same notation.
  • The minimum value of a is determined by the management range of the area entries in the directory: if the area entries in the directory are set to manage the tags of eight cache blocks, the minimum value of a is 3 and the first entry is an area entry; a may also take an integer greater than 3, in which case the first entry is a super-area entry. The sharer number of an entry may also indicate two or more cache devices. For example, if a is 4 and the four entries are 0000-1,2; 0001-1,2; 0010-1,2; and 1000 to 1111-1,2, the four entries are merged into the first entry 0000 to 1111-1,2.
  • Optionally, the sharer number of the first entry includes the sharer number of each of the N entries. For example, the directory includes four entries, the entry label of each entry indicates one or more cache blocks, a is taken as 2, and the tags of all the cache blocks indicated by the four entries are in the range 0000 to 0011; the four entries 0000-1, 0001-2, 0010-1, and 0011-3 are merged into 0000 to 0011-1,2,3.
  • As before, the minimum value of a is determined by the management range of the area entries in the directory: if the area entries are set to manage the tags of eight cache blocks, the minimum value of a is 3 and the first entry is an area entry; a may also take an integer greater than 3, in which case the first entry is a super-area entry. For example, if a is 4 and the four entries are 0000-1,2; 0001-1,3; 0010-1; and 1000 to 1111-2,4, the four entries are merged into the first entry 0000 to 1111-1,2,3,4.
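  • A minimal Python sketch of steps 202 and 204, using the illustrative DirectoryEntry structure sketched earlier: it collects every entry whose managed blocks fall entirely inside a 2^a-block merge range and replaces them with one first entry whose sharer numbers are the union of their sharer numbers. It assumes no conflicting second entry exists; the conflict handling described below is omitted here.

```python
def merge_entries(directory: list, range_start: int, a: int):
    """Merge all entries fully contained in the 2**a-block range starting at
    range_start into a single first entry (steps 202/204, simplified)."""
    size = 1 << a
    to_merge = [e for e in directory
                if range_start <= e.label and e.label + e.scope <= range_start + size]
    if not to_merge:
        return None
    first = DirectoryEntry(label=range_start, scope=size,
                           sharers=set().union(*(e.sharers for e in to_merge)))
    for e in to_merge:
        directory.remove(e)
    directory.append(first)
    return first

# Example from the text: 0000-1, 0001-2, 0010-1 and 0011-3 merge into 0000 to 0011-1,2,3.
directory = [DirectoryEntry(0b0000, 1, {1}), DirectoryEntry(0b0001, 1, {2}),
             DirectoryEntry(0b0010, 1, {1}), DirectoryEntry(0b0011, 1, {3})]
first = merge_entries(directory, 0b0000, 2)
assert first.sharers == {1, 2, 3} and first.scope == 4
```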
  • Steps 202 and 204 are generally performed when the storage space for entries in the directory has reached its upper limit but a new entry needs to be generated: merging the entries in the directory saves entry storage space and frees space for the new entry, which avoids the burden placed on the system by deleting an existing entry in order to generate a new one.
  • Step 202 need not be initiated only when the storage space for entries in the directory reaches its upper limit; the merging of entries in the directory may also be initiated periodically or according to another preset rule.
  • Optionally, if each entry in the directory further includes a management scope flag, the management scope flag of the first entry is also generated during the merge; this management scope flag indicates that the entry label of the first entry indicates 2^a cache blocks.
  • For example, a is 4 and the four entries are 0000-1-1,2; 0001-1-1,2; 0010-1-1,2; and 1000-8-1,2, where 0000-1-1,2 denotes an entry whose entry label is 0000 and whose management scope flag is 1, so the entry manages only the cache block with tag 0000 and its sharer numbers indicate cache device 1 and cache device 2, and 1000-8-1,2 denotes an entry whose entry label is 1000 and whose management scope flag is 8. Entries with management scope flags elsewhere in this document follow the same notation. The first entry obtained by merging the four entries 0000-1,2; 0001-1,2; 0010-1,2; and 1000 to 1111-1,2 may then be written as 0000-16-1,2.
  • Optionally, before step 204, the method further includes determining whether a merge condition is met, and merging the N entries when the merge condition is met. The merge condition includes any one of the following: condition A, the directory does not include a second entry whose entry label indicates the aforementioned 2^a consecutive cache blocks, and the N entries are merged into the first entry; condition B, the directory includes the second entry and the entry label of at least one of the N entries indicates two or more cache blocks; condition C, the directory includes the second entry, the entry label of each of the N entries indicates one cache block, and N is greater than a preset threshold.
  • If the directory already contains a second entry whose entry label is the same as that of the first entry, it must be decided whether to replace the second entry with the first entry. Taking the merge of 0000-1,2; 0001-1,2; 0010-1,2; and 1000 to 1111-1,2 into 0000 to 1111-1,2 as an example, if the directory already contains a second entry whose entry label is 0000 to 1111, the first entry conflicts with the second entry, and two entries with the same entry label cannot coexist in the directory; in this case it must be determined whether to replace the second entry with the first entry. The merge producing the first entry is carried out in the following two cases. Case 1: at least one of the N entries is an area entry or a super-area entry; the second entry is then replaced with the first entry. Case 2: every one of the N entries is a block entry and N is greater than the preset threshold; the second entry is then replaced with the first entry. If neither case holds, the second entry is retained, the N entries are not merged into the first entry, and the merge is abandoned. In general, when the entries to be merged include area or super-area entries, or when the number of block entries to be merged is large, the benefit of merging is more pronounced, so the second entry is replaced with the first entry only under these conditions.
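  • The following Python sketch expresses the merge condition just described; `second_entry` is the conflicting entry with the same entry label as the first entry (or None if the directory contains no such entry), and the threshold value is an illustrative assumption.

```python
def merge_allowed(to_merge: list, second_entry, threshold: int = 3) -> bool:
    """Return True if one of the three merge conditions holds."""
    if second_entry is None:
        return True                                   # condition A: no conflicting second entry
    if any(e.scope >= 2 for e in to_merge):
        return True                                   # condition B: an area or super-area entry is merged
    if all(e.scope == 1 for e in to_merge) and len(to_merge) > threshold:
        return True                                   # condition C: many block entries are merged
    return False
```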
  • Optionally, the method further includes deleting the second entry. If the second entry conflicts with the first entry and the merge into the first entry is performed, the second entry must be deleted so that two entries with the same entry label do not coexist in the directory.
  • Optionally, before deleting the second entry in step 204, the method further includes: determining whether the sharer number of the second entry is the same as the sharer number of the first entry; if the sharer number of the second entry is different from the sharer number of the first entry, before performing the action of deleting the second entry, obtaining a redundant sharer number in the second entry, where the redundant sharer number is a sharer number in the second entry that differs from the sharer numbers of the first entry; querying the 2^a cache blocks in the cache device corresponding to the redundant sharer number and determining a first cache block cached in that cache device; and performing one of the following two actions: invalidating the first cache block cached in the cache device corresponding to the redundant sharer number, or generating a third entry whose entry label indicates the first cache block and whose sharer number is the redundant sharer number.
  • 0000-1, 2,0001-1, 2, 0010-1, 2, 1000 to 1111-1, 2 are combined into 0000 to 1111-1, 2 as an example, if the directory already exists
  • the second entry is 0000 to 1111-1, 2, and 3.
  • the redundant sharer number is the number of the cache device 3, and the first cache block is obtained.
  • the label of the first cache block, that is, the label stored in the cache device 3 is in the range of 0000 to 1111.
  • the cache block such as the label of the first cache block is 0011, generates a third entry 0011-3, that is, a new block entry is used to maintain the first cache block.
  • Alternatively, the cache blocks in the range 0000 to 1111 cached in the cache device corresponding to the redundant sharer number may simply be invalidated. The reason is that when the second entry is deleted, any sharer number of the second entry that is not covered by the sharer numbers of the first entry would leave its cached copies in the range 0000 to 1111 unmanaged; these cache blocks must therefore either be protected by newly generated block entries or be invalidated, because otherwise no entry in the directory would manage them.
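  • A hedged Python sketch of the safeguard described above: before the second entry is deleted, each redundant sharer (recorded in the second entry but not in the first entry) either has its copies within the merge range invalidated or receives a new block entry per cached block. `cache_lookup` and `notify_invalidate` are hypothetical helpers standing in for querying a cache device and sending it an invalidation, respectively.

```python
def resolve_redundant_sharers(directory: list, first, second,
                              cache_lookup, notify_invalidate=None):
    """Handle sharers present only in the second entry, then delete it."""
    for sharer in second.sharers - first.sharers:          # redundant sharer numbers
        for tag in cache_lookup(sharer, first.managed_tags()):
            if notify_invalidate is not None:
                notify_invalidate(sharer, tag)              # option 1: invalidate the copy
            else:                                           # option 2: protect it with a new block entry
                directory.append(DirectoryEntry(label=tag, scope=1, sharers={sharer}))
    directory.remove(second)
```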
  • Optionally, when the directory includes the second entry and the first entry has been obtained by merging, the method further includes determining whether the second entry includes two or more sharer numbers; when the second entry includes two or more sharer numbers, the second entry and the first entry are merged into a fourth entry, where the entry label of the fourth entry indicates the 2^a cache blocks and the sharer number of the fourth entry includes the sharer numbers of the second entry and the sharer numbers of the first entry.
  • If the second entry includes only one sharer number, the second entry is deleted.
  • Optionally, before deleting the second entry in this case, the same safeguard as above is applied: it is determined whether the sharer number of the second entry is the same as that of the first entry; if not, the redundant sharer number is obtained before the second entry is deleted, the 2^a cache blocks are queried in the cache device corresponding to the redundant sharer number to determine the cache block cached there (for example, for the second entry 0000 to 1111-1,2,3 the redundant sharer number is the number of cache device 3, and the cache block stored in cache device 3 with a tag in the range 0000 to 1111 is found to be 0011), and either that cache block is invalidated or a new block entry such as 0011-3 is generated for it. This ensures that no cache block is left unmanaged after the second entry is deleted, since there may otherwise be no entry in the directory that manages it.
  • The foregoing provides a method for merging entries in a directory. The method can effectively merge entries whose cache blocks fall within a given range, thereby saving the storage space of the directory and avoiding the extra expenditure of invalidating the cache blocks managed by some entries when the directory storage space reaches its upper limit; the merge conditions also properly weigh the impact of merging on the use efficiency of the directory.
  • Device Embodiment 2 provides a device 400 for merging entries in a directory of a cache coherence system. The device 400 is specifically used for merging entries in a directory of any of the alternatives in Embodiment 1; a schematic structural diagram of the device 400 is shown in FIG. 5 and includes:
  • a confirmation module 402, configured to determine the N entries to be merged, where the cache block indicated by the entry label of each of the N entries belongs to the merge range, the merge range indicates 2^a cache blocks, and N and a are both positive integers; and
  • a processing module 404, configured to merge the N entries into a first entry, where the entry label of the first entry indicates the 2^a cache blocks and the sharer number of the first entry includes the sharer numbers of the N entries.
  • The confirmation module 402 specifically performs step 202 of Method Embodiment 1 and the various alternatives of step 202, and the processing module 404 specifically performs step 204 of Method Embodiment 1 and the various alternatives of step 204.
  • The confirmation module 402 establishes a communication connection with the processing module 404, and the confirmation module 402 and the processing module 404 establish communication connections with the storage medium that stores the directory provided in Embodiment 1.
  • An example of a working scenario of the device 400: the confirmation module 402 first accesses the directory stored in the storage medium and confirms that the directory includes N entries meeting the foregoing conditions, and the processing module 404 then merges the N entries in the directory into the first entry.
  • The work of the device 400 need not be initiated only when the storage space for entries in the directory reaches its upper limit; the merging of entries in the directory may also be initiated periodically or according to other preset rules.
  • The device 400 can be an ASIC or an FPGA. Because the cache devices operate at a high frequency, operations on the directory are generally also implemented by a hardware device in order to keep up with their operating frequency.
  • The foregoing provides a device for merging entries in a directory. The device can effectively merge entries whose cache blocks fall within a given range, thereby saving the storage space of the directory and avoiding the extra expenditure of invalidating the cache blocks managed by some entries when the directory storage space reaches its upper limit; the merge conditions also properly weigh the impact of merging on the use efficiency of the directory.
  • Method Embodiment 2 provides a method for accessing a directory of any of the alternatives in Embodiment 1, where the directory includes multiple entries, each entry includes an entry label and a sharer number, the entry label is used to indicate a cache block, the multiple entries include a first entry, the entry label of the first entry indicates 2^a cache blocks, and a is a positive integer. A schematic flowchart of the method is shown in FIG. 6 and includes the following steps:
  • Step 602: Receive a directory access request, where the directory access request carries the label of a cache block to be accessed.
  • Step 604: Query the directory according to the label of the cache block to be accessed and obtain a set of entries corresponding to the cache block to be accessed, where the set of entries includes all entries in the directory whose entry labels indicate the cache block to be accessed.
  • Step 606: Determine a query entry from the set of entries, where the query entry is the entry in the set whose entry label indicates the fewest cache blocks.
  • For example, the directory access request carries the label 0011 of the cache block to be accessed. The directory is searched and all entries indicating the cache block with tag 0011 are obtained, for example an entry whose entry label covers the range 0000 to 0011 and an entry whose entry label covers the range 0000 to 1000; these entries constitute the set of entries, and the query entry is then determined from this set.
  • Optionally, each entry in the directory further includes a management scope flag, where the management scope flag is used to indicate the number of cache blocks indicated by the entry label; in this case, determining the query entry from the set of entries in step 606 includes determining the query entry according to the management scope flag of each entry in the set.
  • For example, the label of the cache block to be accessed is 0011, and the obtained set of entries includes an entry whose entry label is 0011, an entry whose entry label covers the range 0000 to 0011, an entry whose entry label covers the range 0000 to 1000, and so on. According to the management scope flag of each entry, the entry in the set whose entry label indicates the fewest cache blocks is determined: the management scope flag of the entry whose label is 0011 indicates that the entry manages only one cache block, the management scope flag of the entry covering 0000 to 0011 indicates that the entry manages four cache blocks, and the entry covering 0000 to 1000 manages 16 cache blocks, so the entry whose label is 0011 is the query entry.
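  • A minimal Python sketch of steps 604 and 606, reusing the illustrative DirectoryEntry structure from above: it gathers every entry whose managed range contains the requested tag and picks the one with the smallest management scope as the query entry.

```python
def find_query_entry(directory: list, tag: int):
    """Return the entry managing `tag` that indicates the fewest cache blocks."""
    candidates = [e for e in directory if tag in e.managed_tags()]
    if not candidates:
        return None                      # no entry in the directory covers this tag
    return min(candidates, key=lambda e: e.scope)

# Example from the text: for tag 0011, the one-block entry beats the 4- and 16-block entries.
directory = [DirectoryEntry(0b0011, 1, {3}),
             DirectoryEntry(0b0000, 4, {1, 2}),
             DirectoryEntry(0b0000, 16, {1})]
assert find_query_entry(directory, 0b0011).scope == 1
```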
  • Optionally, the directory access request received in step 602 further includes a visitor number, that is, the number of the node that issued the directory access request, or the number of that node's cache device.
  • The method then further includes Step 608: if the sharer number of the query entry is different from the visitor number, generate a first new entry, where the entry label of the first new entry indicates the cache block to be accessed and the sharer number of the first new entry is the visitor number. For example, if the query entry is 0011-1-3 and the visitor number is 2, an entry 0011-1-2 is generated to manage the operations of cache device 2 on cache block 0011. If the sharer number of the query entry is the same as the visitor number, cache block 0011 can be operated on directly according to the query entry.
  • Optionally, the directory access request includes the foregoing visitor number and further includes an access type, where the access type is used to indicate whether the directory access request is a read request or a write request.
  • Step 608 then further includes: if the access type indicates that the directory access request is a read request and the sharer number of the query entry does not include the visitor number, adding the visitor number to the sharer numbers of the query entry. For example, if the query entry is 0011-1-3,4, the visitor number is 2, and the current directory access request is a read request, the query entry can be modified to 0011-1-2,3,4; because a read request does not modify cache block 0011, there is no need to invalidate the other copies of cache block 0011. If the sharer number of the query entry already includes the visitor number, the directory access request does not need to modify the directory, and cache block 0011 can be operated on directly according to the query entry.
  • Optionally, if the access type indicates that the directory access request is a write request, step 608 further includes: generating a second new entry, where the entry label of the second new entry indicates the cache block to be accessed and the sharer number of the second new entry is the visitor number, and notifying the cache devices corresponding to the sharer numbers of the query entry other than the visitor number to invalidate the cache block to be accessed.
  • That is, a separate block entry must be generated to manage the copy of cache block 0011 in the cache device corresponding to the visitor number. For example, if the visitor number is 2 and the query entry is 0011-1-2,3,4, a second new entry 0011-1-2 is generated, and cache block 0011 is invalidated in the cache devices numbered 3 and 4, so as to achieve consistency of cache block 0011 in the entire cache system.
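  • A hedged Python sketch of the read/write handling described in the last few paragraphs, simplified rather than the patent's exact control flow: a read request whose visitor is not yet recorded simply adds the visitor number to the query entry, while a write request creates a new one-block entry owned by the visitor and asks the other sharers to invalidate their copies. `notify_invalidate` is a hypothetical callback.

```python
def handle_access(directory: list, query, tag: int, visitor: int,
                  is_write: bool, notify_invalidate):
    """Update the directory for a read or write access to the block `tag`."""
    if not is_write:
        query.sharers.add(visitor)                    # read: record the visitor as a sharer
        return query
    second_new = DirectoryEntry(label=tag, scope=1, sharers={visitor})
    directory.append(second_new)                      # write: visitor becomes the only recorded sharer
    for sharer in query.sharers - {visitor}:
        notify_invalidate(sharer, tag)                # other copies must be invalidated
    return second_new
```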
  • The foregoing provides a method for accessing a directory. With this method, entries of multiple granularities and different management ranges can be accessed, the entry with the smallest management range is selected as the access object, and appropriate actions are taken according to the actual access type and the content of the accessed entry, so that the directory is accessed flexibly and efficiently and the use efficiency of the directory is improved.
  • Device Embodiment 3 provides a device 800 for accessing a directory, where the directory includes multiple entries, each entry includes an entry label and a sharer number, the entry label is used to indicate a cache block, the multiple entries include a first entry, the entry label of the first entry indicates 2^a cache blocks, and a is a positive integer. A schematic structural diagram of the device 800 is shown in FIG. 7 and includes:
  • the receiving module 802 is configured to receive a directory access request, where the directory access request carries a label of the cache block to be accessed.
  • the processing module 804 is configured to: query the directory according to the label of the cache block to be accessed and obtain a set of entries corresponding to the cache block to be accessed, where the set includes all entries in the directory whose entry labels indicate the cache block to be accessed; and determine a query entry from the set of entries, where the query entry is the entry in the set whose entry label indicates the fewest cache blocks.
  • the receiving module 802 specifically performs step 602 of the second method embodiment and the alternatives of step 602.
  • the processing module 804 specifically performs step 604, step 606, and step 608 of the second method embodiment, and the alternatives of those steps.
  • the receiving module 802 establishes a communication connection with the processing module 804, and both the receiving module 802 and the processing module 804 establish communication connections with the storage medium, described in the first device embodiment, that stores the directory provided in the first embodiment.
  • an example working scenario of the device 800 is as follows: when a cache block in a cache device needs to be accessed, a directory access request is first sent to the receiving module 802 of the device 800; after obtaining the directory access request from the receiving module 802, the processing module 804 obtains the query entry, and the access operation on the cache block can then be completed according to the content recorded in the query entry.
  • device 800 can be an ASIC or an FPGA.
  • in general, to match the high operating frequency of the cache device, access to the directory is also implemented by a hardware device.
  • the foregoing provides a device for accessing a directory; the device can access entries of multiple granularities and different management scopes, select the entry with the smallest management scope as the access object, and take appropriate operations according to the actual access type and the content of the accessed entry, so that the directory is accessed flexibly and efficiently and the utilization efficiency of the directory is improved.
  • the fourth device embodiment provides a directory cache device 1000, which includes the device 400 described in any alternative of the second device embodiment, the device 800 described in any alternative of the third device embodiment, and the storage medium 1004 described in the first device embodiment; the device 400, the device 800, and the storage medium 1004 establish communication connections through the bus 1002, and a schematic structural diagram is shown in FIG. 8.
  • an example working scenario of the device 1000 is as follows: when a cache block in a cache device needs to be accessed, the device 800 accesses the storage medium 1004 storing the directory according to the received directory access request to obtain the corresponding entry; if a new entry needs to be added to the directory during the access and the storage space of the directory in the storage medium 1004 has reached its upper limit, the device 400 merges entries in the directory to save directory storage space and free up room for the new entry.
  • the foregoing provides a directory cache device that can access entries of multiple granularities and different management scopes, select the entry with the smallest management scope as the access object, and take appropriate operations according to the actual access type and the content of the accessed entry; it accesses the directory flexibly and efficiently, can also merge entries in the directory, improves the storage efficiency of the directory, avoids the loss to the cache system caused by deleting entries from the directory, and improves the utilization efficiency of the directory.
  • the device in the second device embodiment is the device that performs the method described in the first method embodiment, so the two may reference each other; the device in the third device embodiment is the device that performs the method described in the second method embodiment, so the two may likewise reference each other.
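
The multi-granularity lookup described in the list above can be illustrated with a small software sketch. The following C++ fragment is only a hypothetical model (none of its names — DirEntry, find_query_entry, on_access — come from the patent, and a real directory cache would realize this logic in hardware such as an ASIC or FPGA): it stores each entry as an entry label, a management-scope value giving the number of cache blocks covered, and a set of sharer numbers, selects the query entry as the matching entry with the smallest scope, and applies the read/write handling sketched in steps 606 and 608.

```cpp
// Hypothetical software model of the multi-granularity directory lookup
// described above; names and structure are illustrative only.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <optional>
#include <set>
#include <vector>

struct DirEntry {
    uint32_t label;           // label of the first cache block covered
    uint32_t scope;           // management-scope flag: number of blocks covered
    std::set<int> sharers;    // sharer numbers (cache-device / node IDs)

    bool covers(uint32_t block) const {
        return block >= label && block < label + scope;
    }
};

// Steps 604/606: collect all entries covering the block and pick the one
// with the smallest management scope as the query entry.
std::optional<std::size_t> find_query_entry(const std::vector<DirEntry>& dir, uint32_t block) {
    std::optional<std::size_t> best;
    for (std::size_t i = 0; i < dir.size(); ++i) {
        if (dir[i].covers(block) && (!best || dir[i].scope < dir[*best].scope)) {
            best = i;
        }
    }
    return best;
}

enum class Access { Read, Write };

// Step 608 (sketch): a read adds the visitor to the query entry's sharers;
// a write creates a single-block entry for the visitor and invalidates the
// other sharers' copies of that block.
void on_access(std::vector<DirEntry>& dir, uint32_t block, int visitor, Access type) {
    auto idx = find_query_entry(dir, block);
    if (!idx) {                                      // no matching entry: create a block entry
        dir.push_back({block, 1, {visitor}});
        return;
    }
    DirEntry& q = dir[*idx];
    if (type == Access::Read) {
        q.sharers.insert(visitor);                   // read: just record the new sharer
        return;
    }
    for (int s : q.sharers) {
        if (s != visitor) {
            std::cout << "invalidate block " << block << " in cache device " << s << "\n";
        }
    }
    dir.push_back({block, 1, {visitor}});            // write: new block entry owned by visitor
}

int main() {
    // Entry 0011-1-{3,4} plus a coarser region entry covering 0000..0111 shared by device 1.
    std::vector<DirEntry> dir = {{0b0011, 1, {3, 4}}, {0b0000, 8, {1}}};
    on_access(dir, 0b0011, 2, Access::Read);         // query entry is the single-block entry
    on_access(dir, 0b0011, 2, Access::Write);        // invalidates copies in devices 3 and 4
    std::cout << "entries in directory: " << dir.size() << "\n";
}
```

Running the sketch with the entries 0011-1-{3,4} and 0000-8-{1} mirrors the worked example above: a read by device 2 adds it as a sharer of the query entry, and a subsequent write generates the block entry 0011-1-2 and invalidates the copies held by devices 3 and 4.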

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of the present invention disclose a method for combining entries, including: determining N entries to be combined, where the cache block indicated by the entry label of each of the N entries falls within a merge range, and the merge range indicates 2^a cache blocks; and combining the N entries into a first entry, where the entry label of the first entry indicates the 2^a cache blocks, and the sharer numbers of the first entry include the sharer number of each of the N entries. With this method, entries in a directory can be effectively combined and the utilization efficiency of the directory can be improved.
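
As a rough, hypothetical illustration of the combining step summarized in this abstract (the struct and function names below, such as DirectoryEntry and merge_range, are assumptions made for the sketch and are not part of the patent), the following self-contained C++ fragment collects the entries whose labels fall inside a merge range of 2^a cache blocks and replaces them with a single entry whose sharer set is the union of the collected entries' sharer sets:

```cpp
// Hypothetical, self-contained C++ sketch of the entry-combining step:
// entries whose labels fall inside a merge range of 2^a blocks are replaced
// by one coarser entry whose sharer set is the union of their sharer sets.
#include <cstdint>
#include <iostream>
#include <set>
#include <vector>

struct DirectoryEntry {
    uint32_t label;        // first cache-block label covered by the entry
    uint32_t scope;        // number of cache blocks covered (1 for a block entry)
    std::set<int> sharers; // sharer numbers
};

// Merge every entry fully contained in [base, base + 2^a) into a single entry.
DirectoryEntry merge_range(std::vector<DirectoryEntry>& dir, uint32_t base, uint32_t a) {
    const uint32_t span = 1u << a;
    DirectoryEntry merged{base, span, {}};
    std::vector<DirectoryEntry> kept;
    for (const auto& e : dir) {
        if (e.label >= base && e.label + e.scope <= base + span) {
            merged.sharers.insert(e.sharers.begin(), e.sharers.end()); // union of sharers
        } else {
            kept.push_back(e);                                         // untouched entries
        }
    }
    kept.push_back(merged);
    dir.swap(kept);
    return merged;
}

int main() {
    // Four block entries 0000-1, 0001-2, 0010-1 and 0011-3 (label-sharer).
    std::vector<DirectoryEntry> dir = {
        {0b0000, 1, {1}}, {0b0001, 1, {2}}, {0b0010, 1, {1}}, {0b0011, 1, {3}}};
    DirectoryEntry m = merge_range(dir, 0b0000, 2);  // merge range 0000..0011 (2^2 blocks)
    std::cout << "merged entry covers " << m.scope << " blocks, sharers:";
    for (int s : m.sharers) std::cout << ' ' << s;   // prints: 1 2 3
    std::cout << "\n";
}
```

In the example in main, the four block entries 0000-1, 0001-2, 0010-1 and 0011-3 are combined into a region entry covering 0000 to 0011 with sharers 1, 2 and 3, which mirrors the example given for the first method embodiment in the description.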

Description

目录中表项合并的方法以及设备 技术领域
本发明涉及计算机领域,尤其涉及一种目录中表项合并的方法以及设备,访问目录的方法以及设备。
背景技术
在多核处理器中,一般各个处理器核拥有私有的高速缓存,各私有高速缓存仅有其所在的处理器核能够访问,同时多核处理器上还设置有共享的高速缓存,该共享高速缓存可以被各个处理器核访问。各个处理器核从共享高速缓存中读取缓存块后,可以在其私有高速缓存中创建该缓存块的拷贝,并对其进行读写,因此共享高速缓存中存储的某一缓存块可能在多个私有高速缓存中创建有多个拷贝。当任一私有高速缓存中的该缓存块的拷贝被修改时,需要通知其他私有高速缓存以对该缓存块的其他拷贝进行无效处理,以实现同一缓存块在整个系统中的一致性。
常用的一致性协议包括基于目录的一致性协议,也即采用表项来记录每一缓存块的拷贝在各个私有高速缓存中的存储情况,当某一私有高速缓存需要对某一缓存块的拷贝进行操作时,需要获取该缓存块对应的表项以获得该缓存块的拷贝的在各个私有高速缓存的存储情况,以对该缓存块的拷贝进行一致性处理。然而,由于目录高速缓存的功耗和面积开销有限,因此存储的表项的数目也有限,一般无法为每一个缓存块设置一个表项,当目录高速缓存的表项的存储达到上限时,会引发表项的竞争,需要对表项进行替换,而被替换的表项记录的缓存块的各个拷贝均需要被无效掉,对整个缓存系统造成额外的通信开销,同时被替换的表项记录的缓存块的无效也会造成缓存块的命中率降低。
发明内容
本发明实施例提供了一种对目录中表项进行合并的方法,通过该方法可以 有效提升目录的使用效率,减少对表项进行替换造成的对缓存系统的影响。
本发明实施例的第一方面提供了一种目录中表项合并的方法,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块;所述方法包括:确定待合并的N个表项,其中,所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围,所述合并范围指示2a个缓存块,所述N和所述a均为正整数;合并所述N个表项为第一表项,其中,所述第一表项的表项标签指示所述2a个缓存块,所述第一表项的共享者编号包括所述N个表项中每个表项的共享者编号。
结合第一方面,在第一方面的第一种实现方式中,所述合并所述N个表项为第一表项之前还包括:确定是否满足合并条件,在满足合并条件时,合并所述N个表项;所述合并条件包括以下条件任意之一:所述目录中不包括第二表项,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,所述目录包括第二表项且所述N个表项中任一表项的表项标签指示2个以上缓存块的标签,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,所述目录包括第二表项且所述N个表项中的任一表项的表项标签均指示一个缓存块并所述N大于预设的阈值,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块。
结合第一方面的第一种实现方式,在第一方面的第二种实现方式中,在所述目录包含所述第二表项并进行合并得到所述第一表项之后,还包括:删除所述第二表项。
结合第一方面的第二种实现方式,在第一方面的第三种实现方式中,所述删除所述第二表项之前还包括:确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第 一缓存块;执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第一缓存块,或者生成第三表项,所述第三表项的表项标签指示所述第一缓存块,所述第三表项的共享者编号为所述冗余共享者编号。
结合第一方面的第一种实现方式,在第一方面的第四种实现方式中,在所述目录包含所述第二表项并进行合并得到所述第一表项之后,所述方法还包括:确定所述第二表项是否包括两个或两个以上的共享者编号,在所述第二表项包括两个或两个以上的共享者编号时,合并所述第二表项和所述第一表项为第四表项,所述第四表项的表项标签指示所述2a个缓存块,所述第四表项的共享者编号包括所述第二表项的共享者编号和所述第一表项的共享者编号。
结合第一方面的第四种实现方式,在第一方面的第五种实现方式中,还包括:若所述第二表项仅包括一个共享者编号,删除所述第二表项。
结合第一方面的第五种实现方式,在第一方面的第六种实现方式中,所述删除所述第二表项之前还包括:确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第二缓存块;执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第二缓存块,或者生成第五表项,所述第五表项的表项标签指示所述第二缓存块,所述第五表项的共享者编号为所述冗余共享者编号。
本发明实施例的第二方面提供了一种访问目录的方法,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块,所述多个表项中包含第一表项,所述第一表项的表项标签指示2a个缓存块,所述a为正整数;所述方法包括:接收目录访问请求,所述目录访问请求携带待访问的缓存块的标签;根据所述待访问的缓存块的标签,查询所述目录,获取一组所述待访问的缓存块对应的表项,所述一组表项包括所述目录中表项标签指示 所述待访问的缓存块的所有表项;从所述一组表项中确定查询表项,所述查询表项为所述一组表项中表项标签指示的缓存块最少的表项。
结合第二方面,在第二方面的第一种实现方式中所述目录中每个表项还包括管理范围标识位,所述管理范围标识位用于指示所述表项标签指示的缓存块的数量;所述从所述一组表项中确定查询表项包括:根据所述一组表项中各个表项的管理范围标志位,确定所述查询表项。
结合第二方面或第二方面的第一种实现方式,在第二方面的第二种实现方式中,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备;所述方法还包括:若所述查询表项的共享者编号与所述访问者编号不相同,生成第一新表项,所述第一新表项的表项标签指示所述待访问的缓存块,所述第一新表项的共享者编号为所述访问者编号。
结合第二方面或第二方面的第二种实现方式,在第二方面的第三种实现方式中,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备,所述目录访问请求还包括访问类型,所述访问类型用于指示所述目录访问请求为读请求或写请求;所述方法还包括:若所述访问请求类型指示所述目录访问请求为读请求,所述查询表项的共享者编号不包括所述访问者编号,将所述访问者编号添加至所述查询表项的共享者编号中。
结合第二方面的第三种实现方式,在第二方面的第四种实现方式中,所述方法还包括:若所述访问请求类型指示所述目录访问请求为写请求;生成第二新表项,所述第二新表项的表项标签指示所述待访问的缓存块,所述第二新表项的共享者编号为所述访问者编号,并通知所述查询表项中除所述访问者编号之外的其它共享者编号对应的缓存设备无效所述待访问的缓存块。
本发明实施例的第三方面提供了一种用于目录中表项合并的设备,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块;所述设备包括:确认模块,用于确定待合并的N个表项,其中,所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围,所述合并范围指示2a个缓存块,所述N和所述a均为正整数;处理模块,用于合并所述N个表项 为第一表项,其中,所述第一表项的表项标签指示所述2a个缓存块,所述第一表项的共享者编号为所述N个表项的共享者编号。
结合第三方面,在第三方面的第一种实现方式中,所述处理模块合并所述N个表项为所述第一表项之前还用于:确定是否满足合并条件,在满足合并条件时,合并所述N个表项;所述合并条件包括以下条件任意之一:所述目录中不包括第二表项,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,所述目录包括第二表项且所述N个表项中任一表项的表项标签指示2个以上缓存块的标签,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,所述目录包括第二表项且所述N个表项中的任一表项的表项标签均指示一个缓存块并所述N大于预设的阈值,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块。
结合第三方面的第一种实现方式,在第三方面的第二种实现方式中,在所述目录包含所述第二表项并进行合并得到所述第一表项之后,所述处理模块还用于:删除所述第二表项。
结合第三方面的第二种实现方式,在第三方面的第三种实现方式中,所述处理模块删除所述第二表项之前还用于:确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;若所述第二表项的共享者编号与所述第一表项的共享者编号相同,执行所述删除所述第二表项的动作;若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第一缓存块;执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第一缓存块,或者生成第三表项,所述第三表项的表项标签指示所述第一缓存块,所述第三表项的共享者编号为所述冗余共享者编号。
结合第三方面的第一种实现方式,在第三方面的第四种实现方式中,所述 处理模块还用于:在所述目录包含所述第二表项并进行合并得到所述第一表项之后,确定所述第二表项是否包括两个或两个以上的共享者编号,在所述第二表项包括两个或两个以上的共享者编号时,合并所述第二表项和所述第一表项为第四表项,所述第四表项的表项标签指示所述2a个缓存块,所述第四表项的共享者编号包括所述第二表项的共享者编号和所述第一表项的共享者编号。
结合第三方面的第四种实现方式,在第三方面的第五种实现方式中,所述处理模块还用于若所述第二表项仅包括一个共享者编号,删除所述第二表项。
结合第三方面的第五种实现方式,在第三方面的第六种实现方式中,所述处理模块还用于删除所述第二表项之前确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第二缓存块;执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第二缓存块,或者生成第五表项,所述第五表项的表项标签指示所述第二缓存块,所述第五表项的共享者编号为所述冗余共享者编号。
结合第三方面或第一方面的第一种或第二种或第三种或第四种或第五种实现方式,在第三方面的第六种实现方式中,所述设备为专用集成电路ASIC或现成可编程门阵列FPGA。
本发明实施例的第四方面提供了一种用于访问目录的设备,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块,所述多个表项中包含第一表项,所述第一表项的表项标签指示2a个缓存块,所述a为正整数;所述设备包括:接收模块,用于接收目录访问请求,所述目录访问请求携带待访问的缓存块的标签;处理模块,用于根据所述待访问的缓存块的标签,查询所述目录,获取一组所述待访问的缓存块对应的表项,所述一组表项包括所述目录中表项标签指示所述待访问的缓存块的所有表项;从所述 一组表项中确定查询表项,所述查询表项为所述一组表项中表项标签指示的缓存块最少的表项。
结合第四方面,在第四方面的第一种实现方式中,所述目录中每个表项还包括管理范围标识位,所述管理范围标识位用于指示所述表项标签指示的缓存块的数量;所述处理模块从所述一组表项中确定查询表项包括:根据所述一组表项中各个表项的管理范围标志位,确定所述查询表项。
结合第四方面或第四方面的第一种实现方式,在第四方面的第二种实现方式中,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备;所述处理模块,还用于若所述查询表项的共享者编号与所述访问者编号不相同,生成第一新表项,所述第一新表项的表项标签指示所述待访问的缓存块,所述第一新表项的共享者编号为所述访问者编号。
结合第四方面或第四方面的第一种实现方式,在第四方面的第三种实现方式中,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备,所述目录访问请求还包括访问类型,所述访问类型用于指示所述目录访问请求为读请求或写请求;所述处理模块,若所述访问请求类型指示所述目录访问请求为读请求,所述查询表项的共享者编号不包括所述访问者编号,将所述访问者编号添加至所述查询表项的共享者编号中。
结合第四方面的第三种实现方式,在第四方面的第四种实现方式中,还用于若所述访问请求类型指示所述目录访问请求为写请求;生成第二新表项,所述第二新表项的表项标签指示所述待访问的缓存块,所述第二新表项的共享者编号为所述访问者编号,并通知所述查询表项中除所述访问者编号之外的其它共享者编号对应的缓存设备无效所述待访问的缓存块。
结合第四方面或第四方面的第一种或第二种或第三种或第四种实现方式,在第四方面的第五种实现方式中,所述设备为专用集成电路ASIC或现成可编程门阵列FPGA。
本发明实施例的第五方面提供了一种目录,包括:块表项,所述块表项包括第一表项标签,第一共享者编号;区域表项,所述区域表项包括第二表项标 签,第二共享者编号;超区域表项,所述超区域表项包括第三表项标签,第三共享者编号;所述第一表项标签指示一缓存块,所述第二表项标签指示2n个缓存块,所述第三表项标签指示2n+m个缓存块,所述n和所述m均为正整数。
结合第五方面,在第五方面的第一种实现方式中,所述块表项还包括第一管理范围标志位,所述区域表项还包括第二管理范围标志位,所述超区域表项还包括第三管理范围标志位;所述第一管理范围标识位用于指示所述第一表项标签指示的缓存块的数量,所述第二管理范围标识位用于指示所述第二表项标签指示的缓存块的数量,所述第三管理范围标识位用于指示所述第三表项标签指示的缓存块的数量。
本发明实施例的第六方面提供了一种存储介质,用于存储本发明实施例的第五方面或第五方面的第一种实现方式所述的目录。
本发明实施例的第七方面提供了一种目录高速缓存,包括如本发明实施例的第六方面所述的存储介质、如本发明实施例的第二方面或第二方面任一种实现方式所述的用于目录中表项合并的设备、如本发明实施例的第四方面或第四方面任一种实现方式所述的用于访问目录的设备,总线;所述存储介质、所述用于目录中表项合并的设备、所述用于访问目录的设备之间通过所述总线建立通信连接。
通过以上提供的实施例,可以有效合并目录的表项,节约目录的存储空间,尽量避免了目录存储空间达到上限后必须对一部分表项管理的缓存块进行无效的额外支出。
附图说明
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例中所需要使用的附图作以简单地介绍,显而易见的,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本发明实施例所应用的多核处理器的共享缓存架构图;
图2为本发明实施例一所应用的目录的组成结构示意图;
图3为本发明实施例一所应用的又一目录的组成结构示意图;
图4为本发明方法实施例一所应用的表项合并的方法的流程示意图;
图5为本发明设备实施例二所应用的用于目录中表项合并的设备的组成结构示意图;
图6为本发明方法实施例二所应用的访问目录的方法的流程示意图;
图7为本发明设备实施例三所应用的用于访问目录的设备的组成结构示意图;
图8为本发明设备实施例四所应用的目录高速缓存设备的组成结构示意图。
具体实施方式
为使本发明实施例的目的、技术方案和优点更加清楚,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。
贯穿本说明书,术语“处理器核”一般指多核处理器芯片的一个或者多个执行数据处理任务的处理单元,也可为称之为处理器核心或处理核心,还可能是一种具有信号的处理能力的集成电路芯片,例如通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field Programmable Gate Array,FPGA)等。
贯穿本说明书,术语“目录高速缓存”(英文全称:Directory Cache),一般指代用于基于目录的缓存一致性(英文全称:Directory Based Cache Coherence)系统中,具有存储目录以及处理表项的功能的设备。
贯穿本说明书,术语“表项”,或称为“目录表项”或“目录条目”(英文全称:Directory Entry),一般指代基于目录的缓存一致性系统中存储于目录高速 缓存的一个条目(英文全称:Entry)。每一表项记录了一个或多个表项标签,每一表项标签指示一个缓存块,每一表项还记录了一个或多个共享者编号,每一共享者编号指示一个处理器节点或私有高速缓存。
贯穿本说明书,术语“表项标签”指代某一缓存块的标签,一般而言共享缓存架构中每一缓存块具有唯一的标签,通过该标签可以确定一缓存块。实际上,假设某一系统中的缓存块的标签有4位,该缓存块在缓存设备中的地址有12位,一般而言通过12位地址可以索引到某一缓存行(英文全称:Cache Line),再通过4位标签可以查找到一缓存块。
贯穿本说明书,术语“共享者编号”用于指示缓存设备,某一表项中的共享者编号也即指示该表项的表项标签指示的缓存块对应的缓存设备,实际上,由于表项中记录的缓存块和缓存设备的对应关系并不完全准确,例如存在表项0000至1111-1、2,但实际上1号和2号缓存设备并不一定存储了标签为0000至1111的全部缓存块。
贯穿本说明书,术语“表项标签所指示的缓存块属于合并范围”指示该缓存块的属于2a个连续缓存块,也即该缓存块的标签属于2a个连续缓存块的标签,此处的“属于”包括合并范围的两端的端点,例如缓存块A属于合并范围0000至1111,即缓存块A的标签可以为0000至1111中之任一且可以为0000或1111。
本发明实施例的多核处理器的共享缓存架构
图1描述了本发明实施例所提供的多核处理器的共享缓存架构的部分示意图。多核处理器一般有多个节点,如图1中的节点0,节点1,每个节点包括处理器或称为处理器核,以及各个节点的缓存设备,也即私有高速缓存,每个节点的缓存设备仅有该节点的处理器能否访问并对其中的数据进行读写。各个缓存设备中的数据来源于共享数据高速缓存,共享数据高速缓存可以被各缓存设备访问,因此如图1所示,共享数据高速缓存中存储的缓存块A,可能被节点0,节点1和节点N均读取过并存储于节点0,节点1和节点N的缓存设备中,因此整个共享缓存架构中缓存块A共有3个拷贝。为了保证整个架构中的各个缓存块A的拷贝之间的一致性,例如如果节点0的缓存设备对缓存块A进行了写操作,则节点1和节点N需要得知其所存储的缓存块A已经无效,共享缓存架构中采用了目录高速缓存来保证各个缓存块的拷贝之间的一致性。目录高速缓 存中记录了表项,任一表项记录了一个多个缓存块的标签,以及这些缓存块的共享者编号,即这些缓存块的拷贝所在的节点的编号。各节点读写其缓存设备中的缓存块时,需先访问目录高速缓存中记录的表项,获取该缓存块的拷贝所在的节点的编号,以保证待读写的缓存块的各个拷贝的一致性。目录高速缓存除了用于存储表项之外,一般还具有一定的针对表项的处理能力,例如根据目录访问请求搜索出相应的表项、目录高速缓存中表项的存储容量达到上限时对表项进行处理、表项被修改时通知相应缓存设备等。
实施例一
本实施例一提供一种用于目录,该目录适用于前述共享缓存架构,如图2所示目录100包括:
块表项102,包括表项标签1022,共享者编号1024;
区域表项104,包括表项标签1042,共享者编号1044;
超区域表项106,包括表项标签1062,共享者编号1064;
表项标签1022指示一缓存块,表项标签1042指示2n个缓存块,表项标签1062指示2n+m个缓存块,n和m均为正整数。即超区域表项106管理的缓存块的数量可以为区域表项104的2m倍,实际中超区域表项106可以包括多种超区域表项,例如,包括表项标签指示2n+1个缓存块的超区域表项,表项标签指示2n+2个缓存块的超区域表项,表项标签指示2n+3个缓存块的超区域表项等,假设管理缓存块数量最多的超区域表项管理2n+L个缓存块。
共享者编号1024、共享者编号1044、共享者编号1064可以包括一个或多个缓存设备的编号,每一个缓存设备的编号指示架构中的一个节点或一个缓存设备,以块表项102管理图1中缓存块A为例,则表项标签1022为缓存块A的标签,共享者编号1042包括节点0的编号、节点1的编号和节点N的编号。
可选的,如图3,块表项102还包括管理范围标志位1026,区域表项104还包括管理范围标志位1046,超区域表项106还包括管理范围标志位1066;管理范围标识位1026用于指示表项标签1022指示的缓存块的数量,管理范围标识位1046用于指示表项标签1042指示的缓存块的数量,管理范围标识位1066用于指示表项标签1062指示的缓存块的数量。
实际中,由于各个表项管理的缓存块的标签必然是连续的2n个,各个表项的表项标签可能不直接指示缓存块,而是指示起始地址,再配合管理范围标志位来确定该表项管理的缓存块的范围。例如,某一表项的表项标签为0011,管理范围标志位为0010,则该管理范围标志位指示该表项管理4个缓存块,即标签为0011,0100,0101,0110的4个缓存块。或者表项标签可以仅表示缓存块的标签的高位,例如表项标签为00,管理范围标志位为0010,则该管理范围标志位指示该表项管理4个缓存块,这4个缓存块的标签的高位均为00,则这4个缓存块的标签为0000,0001,0010,0011。由于实际设计中,每一个表项的表项标签位的宽度应该为一致的,每一个表项的管理范围标志位的宽度也应该为一致的,取前例中管理缓存块数量最多的超区域表项管理2n+L个缓存块的情况,则管理范围标志位最少需要有log2(L+2)+1比特。
上述提供一种目录,该目录提供了多种不同的缓存块的管理范围的表项,管理范围大的表项管理较多的缓存块节约了目录的存储空间,管理范围小的表项提升了目录的精确程度,多种管理范围的表项的结合使用提升了目录的使用效率,节约了基于目录的一致性协议的开销。
设备实施例一
本设备实施例一提供一种用于存储实施例一中的提供的任一种目录的存储介质,该存储介质可以为RAM、ROM、EEPROM、磁盘存储介质、固态硬盘(英文全称:Solid State Drives)或其他存储介质。
上述提供一种存储目录的存储介质,该存储介质中存储的目录提供了多种不同的缓存块的管理范围的表项,管理范围大的表项管理较多的缓存块节约了目录的存储空间,管理范围小的表项提升了目录的精确程度,多种管理范围的表项的结合使用提升了目录的使用效率,节约了基于目录的一致性的开销,使得运用该存储介质的多核处理器芯片的工作效率提升。
方法实施例一
本方法实施例一提供一种目录中表项合并的方法,具体的该方法可以运用于实施例一中任一可选方案的目录,即目录中每一个表项包括表项标签和共享者编号,表项标签用于指示缓存块,该方法的流程示意图如图4所示,包括:
步骤202,确定待合并的N个表项,其中,所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围,所述合并范围指示2a个缓存块,所述N和所述a均为正整数。
步骤204,合并所述N个表项为第一表项,其中,所述第一表项的表项标签指示所述2a个缓存块,所述第一表项的共享者编号包括所述N个表项中每个表项的共享者编号。
若所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围且所述N个表项的共享者编号相同,则步骤204中合并出的所述第一表项的共享者编号为所述N个表项的共享者编号。
举例说明,目录中包括4个表项,每一个表项的表项标签指示一个或多个缓存块,且这4个表项指示的全部缓存块的标签,均在0000至0011范围内,0000至1111即此时的合并范围,a等于2。如这4个表项,分别管理的缓存块的标签为,0000,0001,0010和0011,且这4个表项的共享者编号相同,例如这4个表项的共享者编号为1,指示编号为1的缓存设备,则合并这4个表项为第一表项,第一表项的表项标签为0000至0011和第一表项的共享者编号指示缓存设备1。下文中0000-1,指示某一表项的表项标签为0000,指示标签为0000的缓存块,共享者编号指示缓存设备1,0000至1111-1、2,指示某一表项的表项标签指示0000到1111共16个缓存块,共享者编号指示缓存设备1和缓存设备2,本文中表项的意义依此类推。
需要说明的是,a的最小取值为目录中区域表项的管理范围,如果目录中的区域表项设置为管理8个缓存块的标签,那么此处a最小可以为3,则第一表项为区域表项,a也可以取大于3的整数,此时第一表项为超区域表项。同时,表项的共享者编号还可以指示两个或两个以上缓存设备,例如,a为4,4个表项分别为,0000-1、2,0001-1、2,0010-1、2,1000至1111-1、2,则合并这4个表项为第一表项,第一表项为0000至1111-1、2。
若所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围且所述N个表项的共享者编号不全相同,则步骤204中合并出的所述第一表项的共享者编号包括所述N个表项中每个表项的共享者编号。
举例说明,目录中包括4个表项,每一个表项的表项标签指示一个或多个缓 存块,且a取2,这4个表项指示的全部缓存块的标签,均在0000至0011范围内,4个表项为0000-1,0001-2,0010-1,0011-3,合并为0000至0011-1、2、3。
需要说明的是,a的最小取值为目录中区域表项的管理范围,如果目录中的区域表项设置为管理8个缓存块的标签,那么此处a最小可以为3,则第一表项为区域表项,a也可以取大于3的整数,此时第一表项为超区域表项。例如,a为4,4个表项分别为,0000-1、2,0001-1、3,0010-1,1000至1111-2、4,则合并这4个表项为第一表项,第一表项为0000至1111-1、2、3、4。
还需要说明的是,步骤202的发起一般在目录中表项的存储空间达到上限,但需要生成新的表项时,则执行步骤202和步骤204将目录中的表项合并,以节省表项所占的存储空间,为新的表项空出存储空间,避免了为了生成新的表项,必须删除现有表项对系统的负担。步骤202的发起也可以不在目录中表项的存储空间达到上限时,即可以定时发起或按照别的预设规则发起对目录中表项的合并。
可选的,如果目录中每一个表项还包括管理范围标志位,则合并N个表项为第一表项时,还需要生成第一表项的管理范围标志位,该管理范围标志位用于指示第一表项的表项标签指示2a个缓存块。例如a为4,4个表项分别为,0000-1-1、2,0001-1-1、2,0010-1-1、2,1000-8-1、2,其中0000-1-1、2指示该表项的表项标签为0000,管理范围标志位为1,因此该表项仅管理标签为0000的缓存块,共享者编号指示缓存设备1和缓存设备2,1000-8-1、2指示该表项管理范围标志位为8,因此该表项管理标签为1000至1111的8个缓存块,共享者编号指示缓存设备1和缓存设备2,本文中带管理范围标志位的表项的意义依此类推,则0000-1、2,0001-1、2,0010-1、2,1000至1111-1、2这4个表项合并而成的第一表项可以为0000-16-1、2。
可选的,步骤204中合并N个表项为第一表项之前还包括:确定是否满足合并条件,在满足合并条件时,合并所述N个表项;所述合并条件包括以下条件任意之一:目录中不包括第二表项,第二表项的表项标签指示前述连续2a个缓存块,则合并N个表项为第一表项;目录中包括第二表项,并且所述N个表项中任一表项的表项标签指示2个以上缓存块的标签;目录包括第二表项且所述N个表项中 的任一表项的表项标签均指示一个缓存块并所述N大于预设的阈值。
具体的,如果目录中存在于第一表项的表项标签相同的第二表项,此时需要判断是否用第一表项替换第二表项。如前例中,0000-1、2,0001-1、2,0010-1、2,1000至1111-1、2合并为0000至1111-1、2为例,如果目录中已经存在第二表项的表项标签为0000至1111,则此时第一表项与第二表项发生冲突,目录中无法存在两个表项标签相同的表项,此时需要判断是否采用第一表项替换第二表项。而当处于以下两种情况之一时,采取合并第一表项,情况1:若N个表项中任一表项为区域表项或者超区域表项,即N个表项中任一表项的表项标签指示2个以上缓存块的标签,则采用第一表项替换第二表项,情况2:若N个表项中任一表项均为块表项,则若N大于预设的阈值就采用第一表项替换第二表项。若前述两个情况均不符合,则保留第二表项,N个表项并不合成为第一表项,放弃本次合并。一般而言,若待合并的表项中有区域表项或者超区域表项的存在,或者待合并的表项中块表项的数量较高,则本次合并对于目录的效率提升较为明显,因此在该条件下采用第一表项替换掉第二表项。
可选的,步骤204中合并得到所述第一表项之后,还包括:删除所述第二表项。如前文所述,如果存在第二表项与第一表项冲突,并且确定了合并出第一表项,则必须删除第二表项,以免目录中存在表项标签相同的两个表项。
可选的,步骤204中删除所述第二表项之前还包括:确定所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第一缓存块,执行无效所述冗余共享者编号所对应的缓存设备所缓存的所述第一缓存块,或者生成第三表项,所述第三表项的表项标签指示所述第一缓存块,所述第三表项的共享者编号为所述冗余共享者编号。
具体的,如前例中,0000-1、2,0001-1、2,0010-1、2,1000至1111-1、2合并为0000至1111-1、2为例,如果目录中已经存在第二表项为0000至1111-1、2、3,确定0000至1111-1、2后,需要删除第二表项,而删除第二表项前还需 要获取冗余共享者编号,此时冗余共享者编号即缓存设备3的编号,获取第一缓存块,该第一缓存块的标签即缓存设备3中存储的标签为0000至1111区间内的缓存块,如第一缓存块的标签为0011,则生成第三表项0011-3,即采用一个新的块表项来维护第一缓存块。获取冗余共享者编号后,除了为第一缓存块生成新的表项进行管理之外,还可以将冗余共享者编号对应的缓存设备中存储的标签为0000至1111区间内的缓存块无效掉。由于需要删除第二表项,而如果第二表项的共享者编号中有第一表项的共享者编号中保护不到的,此时需要为第一表项的共享者编号中保护不到的共享者编号中的0000至1111区间内的缓存块生成新的块表项保护,或者需要将这些缓存块无效,因为目录中可能并没有表项能够管理这些缓存块。
可选的,步骤204中合并得到所述第一表项之后,还包括:确定所述第二表项是否包括两个或两个以上的共享者编号,在所述第二表项包括两个或两个以上的共享者编号时,合并所述第二表项和所述第一表项为第四表项,所述第四表项的表项标签指示所述2a个缓存块,所述第四表项的共享者编号包括所述第二表项的共享者编号和所述第一表项的共享者编号。
如前例中,0000-1、2,0001-1、3,0010-1,1000至1111-2、4,合并为0000至1111-1、2、3、4为例,如果目录中已经不存在表项的表项标签为0000至1111,则第二表项不存在,直接合并出表项0000至1111-1、2、3、4并存储于目录中即可;如果确定目录中已经存在第二表项的表项标签为0000至1111-2、3、4、5,第二表项包括两个或两个以上共享者编号,则合并第一表项和第二表项为第四表项,则第四表项为0000至1111-1、2、3、4、5。
可选的,步骤204中合并得到所述第一表项之后,所述目录包含所述第二表项并进行合并得到所述第一表项之后,若所述第二表项仅包括一个共享者编号,删除所述第二表项。
如前例中,如果确定目录中已经存在第二表项的表项标签为0000至1111-3,即第二表项的仅包括一个共享者编号,则删除第二表项。
可选的,步骤204中删除所述第二表项之前还包括:确定所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的 动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第一缓存块,执行无效所述冗余共享者编号所对应的缓存设备所缓存的所述第一缓存块,或者生成第三表项,所述第三表项的表项标签指示所述第一缓存块,所述第三表项的共享者编号为所述冗余共享者编号。
具体的,如前例中,0000-1、2,0001-1、2,0010-1、2,1000至1111-1、2合并为0000至1111-1、2为例,如果目录中已经存在第二表项为0000至1111-1、2、3,确定0000至1111-1、2后,需要删除第二表项,而删除第二表项前还需要获取冗余共享者编号,此时冗余共享者编号即缓存设备3的编号,获取第一缓存块,该第一缓存块的标签即缓存设备3中存储的标签为0000至1111区间内的缓存块,如第一缓存块的标签为0011,则生成第三表项0011-3,即采用一个新的块表项来维护第一缓存块。获取冗余共享者编号后,除了为第一缓存块生成新的表项进行管理之外,还可以将冗余共享者编号对应的缓存设备中存储的标签为0000至1111区间内的缓存块无效掉。由于需要删除第二表项,而如果第二表项的共享者编号中有第一表项的共享者编号中保护不到的,此时需要为第一表项的共享者编号中保护不到的共享者编号中的0000至1111区间内的缓存块生成新的块表项保护,或者需要将这些缓存块无效,因为目录中可能并没有表项能够管理这些缓存块。
上述提供一种目录中表项合并的方法,通过该方法可以有效合并目录中缓存块的标签在一定范围内的表项,节约目录的存储空间,尽量避免了目录存储空间达到上限后必须对一部分表项管理的缓存块进行无效的额外支出,并且该合并方法妥善考虑了目录合并后对目录使用效率的影响。
设备实施例二
本设备实施例二提供一种用于缓存一致性的目录中表项合并的设备400,设备400具体用于实施例一中任一可选方案的目录中表项的合并,设备400的组成结构示意图如图5所示,包括:
确认模块402,用于确定待合并的N个表项,其中,所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围,所述合并范围指示2a个缓存块,所述N和所述a均为正整数。
处理模块404,用于合并所述N个表项为第一表项,其中,所述第一表项的表项标签指示所述2a个缓存块,所述第一表项的共享者编号为所述N个表项的共享者编号。
确认模块402具体执行方法实施例一的步骤202,以及步骤202的各个可选方案。
处理模块404具体执行方法实施例一的步骤204,以及步骤204的各个可选方案。
确认模块402与处理模块404建立通信连接,确认模块402、处理模块404均与设备实施例一中的用于存储实施例一中的提供的目录的存储介质建立通信连接。
举例说明设备400的工作场景:当设备实施例一中的存储介质中存储的目录中表项的存储空间达到上限,但需要生成新的表项时,确认模块402首先访问存储介质中存储的目录,确认目录中,包括符合前述条件的N个表项,处理模块404将目录中的N个表项合并为第一表项。设备400的工作的发起也可以不在目录中表项的存储空间达到上限时,即可以定时发起或按照别的预设规则发起对目录中表项的合并。
可选的,设备400可以为ASIC或FPGA。一般而言,为了与缓存设备的高速操作频率相配合,对目录的操作也是由硬件设备实现的。
上述提供一种用于目录中表项合并的设备,设备该可以有效合并目录中共缓存块的标签在一定范围内的表项,节约目录的存储空间,尽量避免了目录存储空间达到上限后必须对一部分表项管理的缓存块进行无效的额外支出,并且该合并方法妥善考虑了目录合并后对目录使用效率的影响。
方法实施例二
本方法实施例二提供一种访问实施例一中任一可选方案的目录的方法,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用 于指示缓存块,所述多个表项中包含第一表项,所述第一表项的表项标签指示2a个缓存块,所述a为正整数,该方法的流程示意图如图6所示,包括:
步骤602,接收目录访问请求,所述目录访问请求携带待访问的缓存块的标签。
步骤604,根据所述待访问的缓存块的标签,查询所述目录,获取一组所述待访问的缓存块对应的表项,所述一组表项包括所述目录中表项标签指示所述待访问的缓存块的所有表项。
步骤606,从所述一组表项中确定查询表项,所述查询表项为所述一组表项中表项标签指示的缓存块最少的表项。
例如,目录访问请求中携带待访问的缓存块的标签为0011,根据该标签,查询目录,获取全部表项标签指示标签为0011的缓存块的表项,例如表项标签的范围为0000至0011的表项,表项标签的范围为0000至1000的表项等,这些表项即构成了一组表项。获取一组表项后,从一组表项中确定查询表项。
可选的,所述目录中每个表项还包括管理范围标识位,管理范围标识位用于指示所述表项标签指示的缓存块的数量;步骤1006中从所述一组表项中确定查询表项包括:根据所述一组表项中各个表项的管理范围标志位,确定查询表项。
例如目录访问请求中携带待访问的缓存块的标签为0011,获取的一组表项包括表项标签的范围为0011的表项,表项标签的范围为0000至0011的表项,表项标签的范围为0000至1000的表项等。根据各个表项的管理范围标志位,确定一组表项中表项标签指示的缓存块最少的表项,例如前例表项标签的范围为0011的表项的管理范围标志位指示该表项仅管理一个缓存块,表项标签的范围为0000至0011的表项的管理范围标志位指示该表项管理4个缓存块,表项标签的范围为0000至1000的表项的管理范围标志位指示该表项管理16个缓存块的,则表项标签的范围为0011的表项为查询表项。
可选的,步骤1002中接收的目录访问请求还包括访问者编号,即发出该目 录访问请求的节点的编号,或该节点的缓存设备的编号。步骤602后还包括,步骤608:若所述查询表项的共享者编号与所述访问者编号不相同;生成第一新表项,所述第一新表项的表项标签指示所述待访问的缓存块,所述第一新表项的共享者编号为所述访问者编号。例如,查询表项0011-1-3,访问者编号为2,则生成0011-1-2用于管理2号缓存设备对于缓存块0011的操作。如果该查询表项的共享者编号与访问者编号相同,则直接根据第一表项对缓存块0011进行操作即可。
可选的,目录访问请求包括前述访问者编号,还包括访问类型,访问类型用于指示目录访问请求为读请求或写请求。步骤1008还包括:若访问请求类型指示目录访问请求为读请求,查询表项的共享者编号不包括所述访问者编号;将访问者编号添加至查询表项。例如,查询表项0011-1-3、4,访问者编号为2,本次目录访问请求为读请求,则将查询表项修改为0011-1-2、3、4即可,由于是读请求,不会对缓存块0011进行修改,因此也无须将缓存块0011的其他拷贝无效。如果该查询表项的共享者编号包括访问者编号,则该此目录访问请求无须对目录进行修改,直接根据第一表项对缓存块0011进行操作即可。
可选的,若访问请求类型指示目录访问请求为写请求,步骤608还包括生成第二新表项,所述第二新表项的表项标签指示所述待访问的缓存块,所述第二新表项的共享者编号为所述访问者编号,通知所述查询表项的除所述访问者编号之外的共享者编号对应的缓存设备无效所述待访问的缓存块。若本次目录访问请求为写请求,则需要为缓存块0011在访问者编号对应的缓存设备中的拷贝生成一个块表项单独管理,例如访问者编号为2,第一表项0011-1-2、3、4,生成第二新表项0011-1-2,且无效编号为3、4的缓存设备中的缓存块0011,以达成缓存块0011在整个缓存系统中的一致性。
上述提供一种访问目录的方法,通过该方法可以访问多粒度、管理范围不同的表项,并且从中选取管理范围最小的表项作为访问对象,并根据实际访问的类型、访问表项的内容采取相应的操作,灵活高效地访问该目录,提升目录的使用效率。
设备实施例三
本设备实施例三提供一种用于访问目录的设备800,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块,所述多个表项中包含第一表项,所述第一表项的表项标签指示2a个缓存块,所述a为正整数,设备800的组成结构示意图如图7所示,包括:
接收模块802,用于接收目录访问请求,目录访问请求携带待访问的缓存块的标签。
处理模块804,用于根据所述待访问的缓存块的标签,查询所述目录,获取一组所述待访问的缓存块对应的表项,所述一组表项包括所述目录中表项标签指示所述待访问的缓存块的所有表项;从所述一组表项中确定查询表项,所述查询表项为所述一组表项中表项标签指示的缓存块最少的表项。
接收模块802具体执行方法实施例二的步骤602,以及步骤602的各个可选方案。
处理模块804具体执行方法实施例二的步骤604、步骤606和步骤608,以及步骤604、步骤606和步骤608的各个可选方案。
接收模块802与处理模块804建立通信连接,接收模块802、处理模块804均与设备实施例一中的用于存储实施例一中的提供的目录的存储介质建立通信连接。
举例说明设备800的工作场景:当某一缓存设备中的缓存块需要被访问时,首先发送目录访问请求至设备800的接收模块802,处理模块804从接收模块802获得目录访问请求之后获得查询表项,根据查询表项记录的内容即可完成缓存块的访问操作。
可选的,设备800可以为ASIC或FPGA。一般而言,为了与缓存设备的高速操作频率相配合,对目录的访问一般也是由硬件设备实现的。
上述提供一种用于访问目录的设备,通过该设备可以访问多粒度、管理范围不同的表项,并且从中选取管理范围最小的表项作为访问对象,并根据实际访问的类型、访问表项的内容采取相应的操作,灵活高效地访问该目录,提升目录的使用效率。
设备实施例四
本设备实施例四提供一种目录高速缓存设备1000,包括如设备实施例二中任一所述的可选的设备400,如设备实施例三中任一所述的可选的设备800,如设备实施例一种所述的存储介质1004,设备400、设备800、存储介质1004之间通过总线1002建立通信连接,其组成结构示意图如图8所示。
举例说明设备1000的工作场景:当某一缓存设备中的缓存块需要被访问时,设备800负责根据接收到的目录访问请求对存储了目录的存储介质1004进行访问,以获得对应的表项,而如果访问过程中需要在目录中增加新的表项,而现有的存储介质1004中目录的存储空间已经达到上限,则设备400对目录中的表项进行合并,以节约目录的存储空间,为新的表项空出存储空间。
上述提供一种目录高速缓存设备,通过该设备可以访问多粒度、管理范围不同的表项,并且从中选取管理范围最小的表项作为访问对象,并根据实际访问的类型、访问表项的内容采取相应的操作,灵活高效地访问该目录,还可以对目录中的表项进行合并,提升目录的存储效率,避免对目录中表项进行删除时对缓存系统造成的损耗,提升目录的使用效率。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。需要说明的是,设备实施例二中的设备即执行方法实施例一中所述的方法的设备,因此二者可以互相借鉴;设备实施例三中的设备即执行方法实施例二中所述的方法的设备,因此二者可以互相借鉴。
最后应说明的是:以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的范围。

Claims (30)

  1. 一种目录中表项合并的方法,其特征在于,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块;
    所述方法包括:
    确定待合并的N个表项,其中,所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围,所述合并范围指示2a个缓存块,所述N和所述a均为正整数;
    合并所述N个表项为第一表项,其中,所述第一表项的表项标签指示所述2a个缓存块,所述第一表项的共享者编号包括所述N个表项中每个表项的共享者编号。
  2. 如权利要求1所述的方法,其特征在于,所述合并所述N个表项为第一表项之前还包括:
    确定是否满足合并条件,在满足合并条件时,合并所述N个表项;
    所述合并条件包括以下条件任意之一:
    所述目录中不包括第二表项,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,
    所述目录包括第二表项且所述N个表项中任一表项的表项标签指示2个以上缓存块的标签,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,
    所述目录包括第二表项且所述N个表项中的任一表项的表项标签均指示一个缓存块并所述N大于预设的阈值,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块。
  3. 如权利要求2所述的方法,其特征在于,在所述目录包含所述第二表项并进行合并得到所述第一表项之后,还包括:
    删除所述第二表项。
  4. 如权利要求3所述的方法,其特征在于,所述删除所述第二表项之前还包括:
    确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;
    若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第一缓存块;
    执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第一缓存块,或者生成第三表项,所述第三表项的表项标签指示所述第一缓存块,所述第三表项的共享者编号为所述冗余共享者编号。
  5. 如权利要求2所述的方法,其特征在于,在所述目录包含所述第二表项并进行合并得到所述第一表项之后,所述方法还包括:
    确定所述第二表项是否包括两个或两个以上的共享者编号,在所述第二表项包括两个或两个以上的共享者编号时,合并所述第二表项和所述第一表项为第四表项,所述第四表项的表项标签指示所述2a个缓存块,所述第四表项的共享者编号包括所述第二表项的共享者编号和所述第一表项的共享者编号。
  6. 如权利要求5所述的方法,其特征在于,其特征在于,还包括:
    若所述第二表项仅包括一个共享者编号,删除所述第二表项。
  7. 如权利要求6所述的方法,其特征在于,其特征在于,所述删除所述第二表项之前还包括:
    确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;
    若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第二缓存块;
    执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第二缓存块,或者生成第五表项,所述第五表项的表项标签指示所述第二缓存块,所述第五表项的共享者编号为所述冗余共享者编号。
  8. 一种访问目录的方法,其特征在于,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块,所述多个表项中包含第一表项,所述第一表项的表项标签指示2a个缓存块,所述a为正整数;
    所述方法包括:
    接收目录访问请求,所述目录访问请求携带待访问的缓存块的标签;
    根据所述待访问的缓存块的标签,查询所述目录,获取一组所述待访问的缓存块对应的表项,所述一组表项包括所述目录中表项标签指示所述待访问的缓存块的所有表项;
    从所述一组表项中确定查询表项,所述查询表项为所述一组表项中表项标签指示的缓存块最少的表项。
  9. 如权利要求8所述的方法,其特征在于,所述目录中每个表项还包括管理范围标识位,所述管理范围标识位用于指示所述表项标签指示的缓存块的数量;
    所述从所述一组表项中确定查询表项包括:根据所述一组表项中各个表项的管理范围标志位,确定所述查询表项。
  10. 如权利要求8或9所述的方法,其特征在于,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备;
    所述方法还包括:
    若所述查询表项的共享者编号与所述访问者编号不相同,生成第一新表项,所述第一新表项的表项标签指示所述待访问的缓存块,所述第一新表项的共享者编号为所述访问者编号。
  11. 如权利要求8或9所述的方法,其特征在于,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备,所述目录访问请求还包括访问类型,所述访问类型用于指示所述目录访问请求为读请求或写请求;
    所述方法还包括:
    若所述访问请求类型指示所述目录访问请求为读请求,所述查询表项的共 享者编号不包括所述访问者编号,将所述访问者编号添加至所述查询表项的共享者编号中。
  12. 如权利要求11所述的方法,其特征在于,所述方法还包括:若所述访问请求类型指示所述目录访问请求为写请求;
    生成第二新表项,所述第二新表项的表项标签指示所述待访问的缓存块,所述第二新表项的共享者编号为所述访问者编号,并通知所述查询表项中除所述访问者编号之外的其它共享者编号对应的缓存设备无效所述待访问的缓存块。
  13. 一种用于目录中表项合并的设备,其特征在于,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块;所述设备包括:
    确认模块,用于确定待合并的N个表项,其中,所述N个表项中的每个表项的表项标签所指示的缓存块属于合并范围,所述合并范围指示2a个缓存块,所述N和所述a均为正整数;
    处理模块,用于合并所述N个表项为第一表项,其中,所述第一表项的表项标签指示所述2a个缓存块,所述第一表项的共享者编号为所述N个表项的共享者编号。
  14. 如权利要求13所述的设备,其特征在于,所述处理模块合并所述N个表项为所述第一表项之前还用于:
    确定是否满足合并条件,在满足合并条件时,合并所述N个表项;
    所述合并条件包括以下条件任意之一:
    所述目录中不包括第二表项,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,
    所述目录包括第二表项且所述N个表项中任一表项的表项标签指示2个以上缓存块的标签,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块;或者,
    所述目录包括第二表项且所述N个表项中的任一表项的表项标签均指示一个缓存块并所述N大于预设的阈值,其中,所述第二表项的表项标签与所述第一表项的表项标签相同且都指示所述2a个缓存块。
  15. 如权利要求14所述的设备,其特征在于,在所述目录包含所述第二表项并进行合并得到所述第一表项之后,所述处理模块还用于:删除所述第二表项。
  16. 如权利要求15所述的设备,其特征在于,所述处理模块删除所述第二表项之前还用于:
    确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;
    若所述第二表项的共享者编号与所述第一表项的共享者编号相同,执行所述删除所述第二表项的动作;
    若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第一缓存块;
    执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第一缓存块,或者生成第三表项,所述第三表项的表项标签指示所述第一缓存块,所述第三表项的共享者编号为所述冗余共享者编号。
  17. 如权利要求14所述的设备,其特征在于,所述处理模块还用于:在所述目录包含所述第二表项并进行合并得到所述第一表项之后,确定所述第二表项是否包括两个或两个以上的共享者编号,在所述第二表项包括两个或两个以上的共享者编号时,合并所述第二表项和所述第一表项为第四表项,所述第四表项的表项标签指示所述2a个缓存块,所述第四表项的共享者编号包括所述第二表项的共享者编号和所述第一表项的共享者编号。
  18. 如权利要求17所述的设备,其特征在于,所述处理模块还用于若所述第二表项仅包括一个共享者编号,删除所述第二表项。
  19. 如权利要求18所述的设备,其特征在于,所述处理模块还用于删除所述第二表项之前确定所述第二表项的共享者编号是否与所述第一表项的共享者编号相同;
    若所述第二表项的共享者编号与所述第一表项的共享者编号不同,在执行 所述删除所述第二表项的动作之前,获取所述第二表项中的冗余共享者编号,所述冗余共享者编号为不同于所述第一表项的共享者编号的第二表项中的其它共享者编号,在所述冗余共享者编号所对应的缓存设备中查询所述2a个缓存块,确定所述冗余共享者编号所对应的缓存设备中缓存的第二缓存块;
    执行以下两种动作之一:无效所述冗余共享者编号所对应的缓存设备所缓存的所述第二缓存块,或者生成第五表项,所述第五表项的表项标签指示所述第二缓存块,所述第五表项的共享者编号为所述冗余共享者编号。
  20. 如权利要求13至19任一所述的设备,其特征在于,所述设备为专用集成电路ASIC或现成可编程门阵列FPGA。
  21. 一种用于访问目录的设备,其特征在于,所述目录包括多个表项,每个表项包括表项标签和共享者编号,所述表项标签用于指示缓存块,所述多个表项中包含第一表项,所述第一表项的表项标签指示2a个缓存块,所述a为正整数;所述设备包括:
    接收模块,用于接收目录访问请求,所述目录访问请求携带待访问的缓存块的标签;
    处理模块,用于根据所述待访问的缓存块的标签,查询所述目录,获取一组所述待访问的缓存块对应的表项,所述一组表项包括所述目录中表项标签指示所述待访问的缓存块的所有表项;从所述一组表项中确定查询表项,所述查询表项为所述一组表项中表项标签指示的缓存块最少的表项。
  22. 如权利要求21的设备,其特征在于,所述目录中每个表项还包括管理范围标识位,所述管理范围标识位用于指示所述表项标签指示的缓存块的数量;
    所述处理模块从所述一组表项中确定查询表项包括:根据所述一组表项中各个表项的管理范围标志位,确定所述查询表项。
  23. 如权利要求21或22所述的设备,其特征在于,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备;
    所述处理模块,还用于若所述查询表项的共享者编号与所述访问者编号不相同,生成第一新表项,所述第一新表项的表项标签指示所述待访问的缓存块,所述第一新表项的共享者编号为所述访问者编号。
  24. 如权利要求21或22所述的设备,其特征在于,所述目录访问请求还包括访问者编号,所述访问者编号指示发出所述目录访问请求的缓存设备,所述目录访问请求还包括访问类型,所述访问类型用于指示所述目录访问请求为读请求或写请求;
    所述处理模块,若所述访问请求类型指示所述目录访问请求为读请求,所述查询表项的共享者编号不包括所述访问者编号,将所述访问者编号添加至所述查询表项的共享者编号中。
  25. 如权利要求24所述的设备,其特征在于,所述处理模块,还用于若所述访问请求类型指示所述目录访问请求为写请求;
    生成第二新表项,所述第二新表项的表项标签指示所述待访问的缓存块,所述第二新表项的共享者编号为所述访问者编号,并通知所述查询表项中除所述访问者编号之外的其它共享者编号对应的缓存设备无效所述待访问的缓存块。
  26. 如权利要求21至25任一所述的设备,其特征在于,所述设备为专用集成电路ASIC或现成可编程门阵列FPGA。
  27. 一种目录,其特征在于,包括:
    块表项,所述块表项包括第一表项标签,第一共享者编号;
    区域表项,所述区域表项包括第二表项标签,第二共享者编号;
    超区域表项,所述超区域表项包括第三表项标签,第三共享者编号;
    所述第一表项标签指示一缓存块,所述第二表项标签指示2n个缓存块,所述第三表项标签指示2n+m个缓存块,所述n和所述m均为正整数。
  28. 如权利要求27所述的目录,其特征在于,所述块表项还包括第一管理范围标志位,所述区域表项还包括第二管理范围标志位,所述超区域表项还包括第三管理范围标志位;
    所述第一管理范围标识位用于指示所述第一表项标签指示的缓存块的数量,所述第二管理范围标识位用于指示所述第二表项标签指示的缓存块的数量,所述第三管理范围标识位用于指示所述第三表项标签指示的缓存块的数量。
  29. 一种存储介质,其特征在于,用于存储如权利要求27或28所述的目录。
  30. 一种目录高速缓存,其特征在于,包括如权利要求29所述的存储介质、如权利要求13至20任一所述的用于目录中表项合并的设备、如权利要求21至26任一所述的用于访问目录的设备,总线;
    所述存储介质、所述用于目录中表项合并的设备、所述用于访问目录的设备之间通过所述总线建立通信连接。
PCT/CN2015/082672 2015-06-29 2015-06-29 目录中表项合并的方法以及设备 WO2017000124A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/CN2015/082672 WO2017000124A1 (zh) 2015-06-29 2015-06-29 目录中表项合并的方法以及设备
CN201580079604.2A CN107533512B (zh) 2015-06-29 2015-06-29 目录中表项合并的方法以及设备
US15/839,665 US20180101475A1 (en) 2015-06-29 2017-12-12 Method and device for combining entries in directory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/082672 WO2017000124A1 (zh) 2015-06-29 2015-06-29 目录中表项合并的方法以及设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/839,665 Continuation US20180101475A1 (en) 2015-06-29 2017-12-12 Method and device for combining entries in directory

Publications (1)

Publication Number Publication Date
WO2017000124A1 true WO2017000124A1 (zh) 2017-01-05

Family

ID=57607342

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/082672 WO2017000124A1 (zh) 2015-06-29 2015-06-29 目录中表项合并的方法以及设备

Country Status (3)

Country Link
US (1) US20180101475A1 (zh)
CN (1) CN107533512B (zh)
WO (1) WO2017000124A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710582B (zh) * 2018-12-13 2021-10-26 创新科技术有限公司 一种共享目录管理方法和装置
CN109683744B (zh) * 2018-12-24 2022-05-13 杭州达现科技有限公司 一种基于显示界面的目录整合方法和装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1695320A (zh) * 2002-10-11 2005-11-09 松下电器产业株式会社 环路干扰消除器、中继系统和环路干扰消除方法
US7991963B2 (en) * 2007-12-31 2011-08-02 Intel Corporation In-memory, in-page directory cache coherency scheme
CN103257932A (zh) * 2012-01-17 2013-08-21 国际商业机器公司 用于管理计算机可读高速缓存系统中的数据的方法和系统

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7366869B2 (en) * 2005-03-17 2008-04-29 Qualcomm Incorporated Method and system for optimizing translation lookaside buffer entries
US7814279B2 (en) * 2006-03-23 2010-10-12 International Business Machines Corporation Low-cost cache coherency for accelerators
CN101859281A (zh) * 2009-04-13 2010-10-13 廖鑫 基于集中式目录的嵌入式多核缓存一致性方法
JP2011003072A (ja) * 2009-06-19 2011-01-06 Toshiba Corp マルチコアプロセッサシステム
US8635411B2 (en) * 2011-07-18 2014-01-21 Arm Limited Data processing apparatus and method for managing coherency of cached data
US9411733B2 (en) * 2011-09-09 2016-08-09 University Of Rochester Sharing pattern-based directory coherence for multicore scalability (“SPACE”)
CN103186621B (zh) * 2011-12-30 2016-07-06 北大方正集团有限公司 一种目录生成方法和装置
US8904100B2 (en) * 2012-06-11 2014-12-02 International Business Machines Corporation Process identifier-based cache data transfer
CN103544269B (zh) * 2013-10-17 2017-02-01 华为技术有限公司 目录的存储方法、查询方法及节点控制器
CN104133785B (zh) * 2014-07-30 2017-03-08 浪潮集团有限公司 采用混合目录的双控存储服务器的缓存一致性实现方法
US20160070714A1 (en) * 2014-09-10 2016-03-10 Netapp, Inc. Low-overhead restartable merge operation with efficient crash recovery


Also Published As

Publication number Publication date
CN107533512A (zh) 2018-01-02
CN107533512B (zh) 2020-07-28
US20180101475A1 (en) 2018-04-12

Similar Documents

Publication Publication Date Title
TWI283348B (en) Method, system, and apparatus for a hierarchical cache line replacement
JP2020534589A5 (zh)
WO2016082793A1 (zh) 高速缓存cache存储器系统及访问缓存行cache line的方法
US20150067001A1 (en) Cache management in a computerized system
US20150058570A1 (en) Method of constructing share-f state in local domain of multi-level cache coherency domain system
WO2017113213A1 (zh) 访问请求处理方法、装置及计算机系统
US10997078B2 (en) Method, apparatus, and non-transitory readable medium for accessing non-volatile memory
WO2017041570A1 (zh) 向缓存写入数据的方法及装置
CN112256604B (zh) 直接存储器访问系统和方法
WO2018161272A1 (zh) 一种缓存替换方法,装置和系统
CN107341114B (zh) 一种目录管理的方法、节点控制器和系统
US9003130B2 (en) Multi-core processing device with invalidation cache tags and methods
TW201843593A (zh) 多處理器系統、資料管理方法及非暫時性電腦可讀媒體
WO2017113211A1 (zh) 访问请求处理方法、装置及计算机系统
US20190042470A1 (en) Method of dirty cache line eviction
US20170364442A1 (en) Method for accessing data visitor directory in multi-core system and device
WO2017000124A1 (zh) 目录中表项合并的方法以及设备
US20070101044A1 (en) Virtually indexed cache system
JP2010244327A (ja) キャッシュシステム
WO2018036486A1 (zh) 页表缓存的访问方法、页表缓存、处理器芯片和存储单元
US20220156195A1 (en) Snoop filter device
EP2916231B1 (en) Directory maintenance method and apparatus
US10503642B2 (en) Cache coherence directory architecture with decoupled tag array and data array
CN107423232B (zh) Ftl快速访问方法与装置
CN108459970B (zh) 一种查询缓存信息的方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15896655

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15896655

Country of ref document: EP

Kind code of ref document: A1