US20020083269A1 - Cache system - Google Patents
- Publication number
- US20020083269A1 (application US09/014,315)
- Authority
- US
- United States
- Prior art keywords
- cache
- item
- main memory
- partition
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/12—Replacement control
- G06F12/121—Replacement control using replacement algorithms
- G06F12/126—Replacement control using replacement algorithms with special data handling, e.g. priority of data or instructions, handling errors or pinning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0842—Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0844—Multiple simultaneous or quasi-simultaneous cache accessing
- G06F12/0846—Cache with multiple tag or data arrays being simultaneously accessible
- G06F12/0848—Partitioned cache, e.g. separate instruction and operand caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0864—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using pseudo-associative means, e.g. set-associative or hashing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0888—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using selective caching, e.g. bypass
Definitions
- the present invention relates to a cache system for operating between a processor and a main memory of a computer.
- cache memories are used in computer systems to decrease the access latency to certain data and code and to decrease the memory bandwidth used for that data and code.
- a cache memory can delay, aggregate and reorder memory accesses.
- a cache memory operates between a processor and a main memory of a computer. Data and/or instructions which are required by the process running on the processor can be held in the cache while that process runs. An access to the cache is normally much quicker than an access to main memory. If the processor does not locate a required data item or instruction in the cache memory, it directly accesses main memory to retrieve it, and the requested data item or instruction is loaded into the cache. There are various known systems for using and refilling cache memories.
- a cache system for operating between a processor and a main memory of a computer, the cache system comprising:
- a cache memory having a set of cache partitions, each cache partition comprising a plurality of addressable storage locations for holding items fetched from said main memory for use by the processor,
- a cache refill mechanism arranged to fetch an item from the main memory and to load said item into the cache memory at one of said addressable storage locations;
- the cache refill mechanism is operable to allocate to each said item fetched from the main memory one or more of said cache partitions in dependence on the address of said item in the main memory.
- each address in main memory comprises a page number and a line in page number, the page numbers being held in a look-up table in association with their respective partition indicators.
- the processor issues addresses comprising a virtual page number and line in page number.
- the system can comprise a translation look aside buffer for translating the virtual page number to a real page number for accessing the main memory, the translation look aside buffer also holding respective partition indicators in association with the real page numbers for identifying the cache partition into which the addressed item is to be loaded.
- the line in page number of the items addressed can be used to identify the address storage location within the cache partition into which the item is to be located. That is, each cache partition is direct-mapped. It will be apparent that it is not necessary to use all of the end bits of the item's address as the line in page number, but merely a set of appropriate bits. These will normally be near the least significant end of the address.
- One or more cache partitions may be allocated to a page in main memory.
- the system can include a cache access circuit which accesses items from the cache memory according to the address in main memory of said items and regardless of the cache partition in which the item is held in the cache memory. That is, the partition indicator is only used on refill and not on look-up. Thus, a cached item can be retrieved from its partition even if subsequent to its caching that partition is now allocated to a different set of addresses.
- a method of operating a cache memory arranged between a processor and a main memory of a computer wherein, when the processor requests an item from main memory using an address in main memory for said item and that item is not held in the cache memory, said item is fetched from the main memory and loaded into one of a plurality of addressable storage locations in the cache memory, the addressable storage locations being arranged as a set of cache partitions and wherein each address is associated with a multi-bit partition indicator identifying into which cache partition the item may be loaded so that one or more of said cache partitions is allocated to said item in dependence on the address of said item in main memory.
- the main memory can hold a plurality of processes, each process including one or more sequence of instructions held at addresses in the main memory within a common page number.
- Cache partitions can be allocated by associating each cache partition with page numbers of a particular process in the main memory.
- the number of addressable storage locations in each cache partition can be alterable. Also, the association of cache partitions to page numbers can be alterable while a process using these page numbers is being run by the processor.
- the following described embodiment illustrates a cache system which gives protection of the contents of the cache against unexpected eviction by reading from or writing to cache lines from other pages of data which are placed in other partitions. It also provides a system in which the contents of the cache may be predicted.
- FIG. 1 is a block diagram of a computer incorporating a cache system
- FIG. 2 is a sketch illustrating a four way set associative cache
- FIG. 3 is an example of an entry in a translation look aside buffer
- FIG. 4 is a block diagram of the refill engine
- FIG. 5 is a diagram illustrating the operation of a multi-tasking processor
- FIG. 6 is a diagram illustrating the alteration in caching behaviour for the system of FIG. 5;
- FIG. 7 is a schematic block diagram of a CPU
- FIG. 8 is an example of an entry in a TLB in a second embodiment.
- FIG. 1 is a block diagram of a computer incorporating a cache system.
- the computer comprises a CPU 2 which is connected to an address bus 4 for accessing items from a main memory 6 and to a data bus 8 for returning items to the CPU 2 .
- although the data bus 8 is referred to herein as a data bus, it will be appreciated that this is for the return of items from the main memory 6 , whether or not they constitute actual data or instructions for execution by the CPU.
- the system described herein is suitable for use on both instruction and data caches. As is known, there may be separate data and instruction caches, or the data and instruction cache may be combined.
- the addressing scheme is a so-called virtual addressing scheme.
- the address is split into a line in page address 4 a and a virtual page address 4 b .
- the virtual page address 4 b is supplied to a translation look-aside buffer (TLB) 10 .
- the line in page address 4 a is supplied to a look-up circuit 12 .
- the translation look-aside buffer 10 supplies a real page address 14 converted from the virtual page address 4 b to the look-up circuit 12 .
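- As a rough illustration of the address split described above, the sketch below separates a virtual address into a virtual page number and a line in page number, then translates the page through a dictionary standing in for the TLB. The 12-bit line in page width, the `TLB` contents and the function names are assumptions made for illustration; the source does not specify a page size.

```python
# Hypothetical parameter: the source does not give a page size.
LINE_IN_PAGE_BITS = 12  # assumed width of the line in page address

# Toy stand-in for the translation look-aside buffer:
# virtual page number -> real page number. Contents are made up.
TLB = {0x00042: 0x00913, 0x00043: 0x00077}


def split_address(virtual_address):
    """Split an address into (virtual page, line in page)."""
    line_in_page = virtual_address & ((1 << LINE_IN_PAGE_BITS) - 1)
    virtual_page = virtual_address >> LINE_IN_PAGE_BITS
    return virtual_page, line_in_page


def translate(virtual_address):
    """Return the real address: translated page plus unchanged line."""
    virtual_page, line_in_page = split_address(virtual_address)
    real_page = TLB[virtual_page]  # a KeyError models a TLB miss here
    return (real_page << LINE_IN_PAGE_BITS) | line_in_page
```

- Only the page part of the address is translated; the line in page bits pass through unchanged, which is why they can be sent to the look-up circuit in parallel with the TLB access.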
- the look-up circuit 12 is connected via address and data buses 16 , 18 to a cache access circuit 20 . Again, the data bus 18 can be for data items or instructions from the main memory 6 .
- the cache access circuit 20 is connected to a cache memory 22 via an address bus 24 , a data bus 26 and a control bus 28 which transfers replacement information for the cache memory 22 .
- a refill engine 30 is connected to the cache access circuit 20 via a refill bus 32 which transfers replacement information, data items (or instructions) and addresses between the refill engine and the cache access circuit.
- the refill engine 30 is itself connected to
- the refill engine 30 receives from the translation look-aside buffer 10 a full real address 34 , comprising the real page address and line in page address of an item in the main memory 6 .
- the refill engine 30 also receives a partition indicator from the translation look-aside buffer 10 on a four bit bus 36 . The function of the partition indicator will be described hereinafter.
- the refill engine 30 receives a miss signal on line 38 which is generated in the look-up circuit 12 in a manner which will be described more clearly hereinafter.
- the cache memory 22 described herein is a direct mapped cache. That is, it has a plurality of addressable storage locations, each location constituting one row of the cache. Each row contains an item from main memory and the address in main memory of that item. Each row is addressable by a row address which is constituted by a number of bits representing the least significant bits of the address in main memory of the data items stored at that row. For example, for a cache memory having eight rows, each row address would be three bits long to uniquely identify those rows. For example, the second row in the cache has a row address 001 and thus could hold any data items from main memory having an address in the main memory which ends in the bits 001. Clearly, in the main memory, there would be many such addresses and thus potentially many data items to be held at that row in the cache memory. Of course, the cache memory can hold only one data item at that row at any one time.
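- The row addressing in the eight-row example can be sketched as follows. The names `NUM_ROWS`, `ROW_BITS` and `row_address` are illustrative, and the sketch assumes the row address is taken from the very lowest address bits.

```python
NUM_ROWS = 8                           # eight-row cache from the example
ROW_BITS = NUM_ROWS.bit_length() - 1   # 3 bits uniquely identify a row


def row_address(main_memory_address):
    """Row of a direct mapped cache: the least significant address bits."""
    return main_memory_address & (NUM_ROWS - 1)


# Every address ending in the bits 001 competes for the second row.
competing = [a for a in range(32) if row_address(a) == 0b001]
```

- Within the first 32 addresses, `competing` contains addresses 1, 9, 17 and 25, all of which map to row 001 and can therefore displace one another.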
- the CPU 2 requests an item from main memory 6 using the address in main memory and transmits that address on address bus 4 .
- the virtual page number is supplied to the translation look-aside buffer 10 which translates it into a real page number 14 according to a predetermined virtual to real page translation algorithm.
- the real page number 14 is supplied to the look-up circuit 12 together with the line in page number 4 a of the original address transmitted by the CPU 2 .
- the line in page address is used by the cache access circuit 20 to address the cache memory 22 .
- the line in page address includes a set of least significant bits (not necessarily including the end bits) of the main address in memory which are equivalent to the row address in the cache memory 22 .
- the contents of the cache memory 22 at the row address identified by the line in page address, being a data item (or instruction) and the address in main memory of the data item (or instruction), are supplied to the look-up circuit 12 .
- the real page number of the address which has been retrieved from the cache memory is compared with the real page number which has been supplied from the translation look-aside buffer 10 . If these addresses match, the look-up circuit indicates a hit which causes the data item which was held at that row of the cache memory to be returned to the CPU along data bus 8 . If however the real page number of the address which was held at the addressed row in the cache memory 22 does not match the real page number supplied from the translation look-aside buffer 10 , then a miss signal is generated on line 38 to the refill engine 30 .
- It is the task of the refill engine 30 to retrieve the correct item from the main memory 6 , using the real address which is supplied from the translation look-aside buffer 10 on bus 34 .
- the data item, once fetched from main memory 6 is supplied to the cache access circuit 20 via the refill bus 32 and is loaded into the cache memory 22 together with the address in main memory.
- the data item itself is also returned to the CPU along data bus 8 so that the CPU can continue to execute.
- in a direct mapped cache memory as outlined above, it will be apparent that the data item and its address recalled from the main memory 6 will be loaded into the storage location from which the data item was originally accessed for checking.
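- The look-up and refill sequence for a direct mapped cache can be sketched as below. This is a simplified model, not the circuit itself: it compares the whole stored address rather than only the real page number, and `main_memory`, `rows` and `access` are hypothetical names.

```python
NUM_ROWS = 8
main_memory = {addr: f"item@{addr}" for addr in range(64)}  # toy memory
rows = [None] * NUM_ROWS  # each row holds (address, item) or None


def access(address):
    """Return (item, hit). On a miss, refill the row that was checked."""
    row = address % NUM_ROWS
    entry = rows[row]
    if entry is not None and entry[0] == address:
        return entry[1], True            # stored address matches: hit
    item = main_memory[address]          # miss: fetch from main memory
    rows[row] = (address, item)          # overwrite whatever was there
    return item, False
```

- Accessing address 1, then address 9, then address 1 again misses all three times, because both addresses share row 001. This is the unexpected eviction that partitioning is intended to control.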
- An example of a 4-way set associative cache is illustrated in FIG. 2.
- the cache memory is divided into four banks B 1 ,B 2 ,B 3 ,B 4 .
- the banks can be commonly addressed row-wise by a common row address, as illustrated schematically for one row in FIG. 2.
- that row contains four cache entries, one for each bank.
- the cache entry for bank B 1 is output on bus 26 a , the cache entry for bank B 2 is output on bus 26 b , and so on for banks B 3 and B 4 . This allows four cache entries for one row address (or line in page address).
- the refill engine 30 retrieves the requested item from the main memory 6 and loads it into the correct row in one of the banks, in accordance with a refill algorithm which is based on, for example, how long a particular item has been held in the cache, or other program parameters of the system. Such replacement algorithms are known and are not described further herein.
- an n-way set associative cache (where n is the number of banks and is equal to four in FIG. 2), while being an improvement on a single direct mapped system, is still inflexible and, more importantly, does not allow the behaviour of the cache to be properly predictable.
- the system described herein provides a cache partitioning mechanism which allows the optimisation of the computer's use of the cache memory by a more flexible cache refill system.
- FIG. 7 is a schematic block diagram of a CPU 2 using the computer of FIG. 1.
- the CPU 2 comprises an execution circuit 15 which is connected to a fetch circuit 17 which is responsible for addressing memory via the memory bus 4 and retrieving data and instructions via the data bus 8 .
- a set of general purpose registers 7 is connected to the execution circuit 15 for holding data and instructions for use in executing a process.
- a set of special registers are provided, denoted by reference numerals 9 , 11 and 13 .
- register 11 holds the instruction pointer which identifies the line of code which is currently being executed.
- special register 9 holds a thread status word which defines the status of a process being executed by the CPU 2 .
- the execution circuit 15 is capable of executing one process or sequence of instructions at any one time. However, it is equally capable of interrupting that process and starting to execute another process before the first process has finished executing. There are many reasons why a process currently being executed by the execution circuit 15 may be interrupted. One is that a higher priority interrupt process is to be executed immediately. Another is that the process being executed is currently awaiting data with a long latency, so that it is more efficient for the execution circuit to commence executing a subsequent process while the first process is waiting for that data. When the data has been received, the first process can be rescheduled for execution. The execution of concurrent processes is known per se and is managed by a process handler 19 .
- a thread has the following state:
- control registers accessible by the thread
- Some of the above state is specified by a small set of values which is referred to herein as the thread status word and which is held in the thread status word register 9 .
- the thread status word specifically holds information about:
- the format of the thread status word is defined in Table I.

TABLE I

| Name | Bits | Size | Description |
|---|---|---|---|
| TSW.FPFLAG | 0-7 | 8 | Floating point exception flags. |
| TSW.FPTRAP | 8-15 | 8 | Floating point exception traps. |
| TSW.FPMODE | 16-19 | 4 | Floating point modes. |
| | 20-31 | | Reserved. |
| TSW.USER | 32 | 1 | Kernel mode (0)/user mode (1) |
| TSW.SINGLE | 33 | 1 | Single step mode. |
| TSW.TLB | 34 | 1 | First level TLB miss handler indicator. |
| TSW.WATCH | 35 | 1 | Watchpoints enabled. |
| TSW.ENABLE | 36 | 1 | Trap enable. |
| | 37-47 | 11 | Reserved. |
| TSW.GROUP | 48-55 | 8 | Group number. |
| | 56-63 | | Reserved. |
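- To make the layout of Table I concrete, the sketch below reads and writes fields of a 64-bit thread status word held as an integer. It assumes bit 0 is the least significant bit, which the table does not state; `TSW_FIELDS`, `tsw_field` and `tsw_with_field` are illustrative names.

```python
# Field layout taken from Table I: name -> (lowest bit, size in bits).
TSW_FIELDS = {
    "FPFLAG": (0, 8),
    "FPTRAP": (8, 8),
    "FPMODE": (16, 4),
    "USER": (32, 1),
    "SINGLE": (33, 1),
    "TLB": (34, 1),
    "WATCH": (35, 1),
    "ENABLE": (36, 1),
    "GROUP": (48, 8),
}


def tsw_field(tsw, name):
    """Extract one named field from a thread status word."""
    low_bit, size = TSW_FIELDS[name]
    return (tsw >> low_bit) & ((1 << size) - 1)


def tsw_with_field(tsw, name, value):
    """Return a copy of the thread status word with one field replaced."""
    low_bit, size = TSW_FIELDS[name]
    mask = ((1 << size) - 1) << low_bit
    return (tsw & ~mask) | ((value << low_bit) & mask)
```

- For example, `tsw_with_field(0, "GROUP", 0x5A)` places the group number in bits 48 to 55 and leaves every other field at zero.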
- each TLB entry has associated with the virtual page number, a real page number and an information sequence.
- An example entry is shown in FIG. 3, where VP represents the virtual page number, RP represents the real page number and INFO represents the information sequence.
- the information sequence contains various information about the address in memory in a manner which is known and which will not be described further herein. However, according to the presently described system the information sequence additionally contains a partition indicator PI, which in the described embodiment is four bits long. Thus, bits 0 to 3 of the information sequence INFO constitute the partition indicator.
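- A TLB entry of this form might be modelled as below. This is a sketch under the assumption that bit 0 of INFO is its least significant bit; the class and method names are hypothetical, and the remaining INFO bits are carried along without interpretation.

```python
from dataclasses import dataclass

PI_MASK = 0b1111  # bits 0 to 3 of INFO hold the partition indicator


@dataclass
class TLBEntry:
    virtual_page: int   # VP
    real_page: int      # RP
    info: int           # INFO; only bits 0 to 3 are interpreted here

    @property
    def partition_indicator(self):
        """Return the four-bit partition indicator PI."""
        return self.info & PI_MASK

    def set_partition_indicator(self, pi):
        """Replace PI while leaving the other INFO bits untouched."""
        self.info = (self.info & ~PI_MASK) | (pi & PI_MASK)
```

- Keeping the indicator inside the existing entry means a single TLB access yields both the real page number and the partition indicator.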
- FIG. 8 An alternative arrangement for the TLB entry is illustrated in FIG. 8. As can be seen from Table I, the thread status word includes an 8 bit group number. This is used as described in the following to generate the partition indicator for allocating cache partitions.
- each TLB entry has associated with the virtual page number, a real page number and an information sequence.
- the information sequence contains various information about the address in memory in a manner which is known and which will not be described further herein.
- the information sequence additionally contains a partition code which generates a partition indicator PI dependent on the group number and the virtual page number. This is illustrated diagrammatically in FIG. 8, where VP represents the virtual page number, RP represents the real page number, GN represents the group number and INFO represents the information sequence.
- PI is four bits long.
- the partition indicator gives information regarding the partition into which the data item may be placed when it is first loaded into the cache memory 22 .
- each partition can constitute one bank of the cache.
- each bit refers to one of the banks.
- the value of 1 in bit j of the partition indicator means that the data in that page may not be placed in partition j.
- the value of 0 in bit j means that the data in that page may be placed in partition j.
- Data may be placed in more than one partition by having a 0 in more than one bit of the partition indicator.
- a partition indicator which is all zeros allows the data to be placed in any partition of the cache.
- a partition indicator which is all ones does not allow any data items to be loaded into the cache memory. This could be used for example for “freezing” the contents of the cache, for example for diagnostic purposes.
- the partition indicator indicates that replacement of data items which have that real page number in main memory may not use banks B 1 or B 3 but may use banks B 2 or B 4 .
- the partition information is not used on cache look-up, but only upon cache replacement or refill.
- the cache access can locate data items held anywhere in the cache memory, whereas a replacement will only replace data into the allowed partitions for that page address.
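- The bit convention and the split between look-up and refill can be sketched as follows. The sketch assumes bit 0 of the indicator corresponds to bank B 1 and that banks are numbered from zero in code; `allowed_banks` and `lookup` are illustrative names.

```python
NUM_BANKS = 4  # banks B1..B4; bit j of the indicator refers to bank j + 1


def allowed_banks(partition_indicator):
    """Banks (0-based) a refill may use: those whose bit is 0."""
    return [j for j in range(NUM_BANKS)
            if not (partition_indicator >> j) & 1]


def lookup(banks, row, address):
    """Look-up ignores the partition indicator and checks every bank."""
    for j, bank in enumerate(banks):
        entry = bank[row]
        if entry is not None and entry[0] == address:
            return j, entry[1]
    return None
```

- Under these assumptions an indicator of `0b0101` excludes banks B 1 and B 3 and permits B 2 and B 4 , while `lookup` still finds an item held in any bank, including one loaded before the partitions were reassigned.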
- FIG. 4 illustrates in more detail the content of the refill engine 30 .
- the refill bus 32 is shown in FIG. 4 as three separate buses, a data bus 32 a , an address bus 32 b and a bus 32 c carrying replacement information.
- the data and address buses 32 a and 32 b are supplied to a memory access circuit 50 which accesses the main memory via the memory bus 54 .
- the replacement information is fed to a decision circuit 52 which also receives the real address 34 , the partition indicator PI on bus 36 and the miss signal 38 .
- the decision circuit 52 determines the proper partition of the cache into which data accessed from the main memory is to be located.
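- One way the decision could be made is sketched below. The source leaves the replacement algorithm open, so the policy here (prefer an empty location, otherwise evict the entry held longest) is an assumption, as are the name `choose_bank` and the use of load times as the replacement information.

```python
def choose_bank(partition_indicator, load_times, num_banks=4):
    """Pick the bank for a refill, or None if no bank is permitted.

    load_times[j] is when the candidate row in bank j was last filled
    (None if empty); this stands in for the replacement information.
    """
    permitted = [j for j in range(num_banks)
                 if not (partition_indicator >> j) & 1]
    if not permitted:
        return None                       # all ones: item is not cached
    for j in permitted:
        if load_times[j] is None:
            return j                      # prefer an empty location
    return min(permitted, key=lambda j: load_times[j])  # else the oldest
```

- The partition indicator only narrows the set of candidate banks; any conventional replacement policy can then be applied within that set.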
- the partition indicator PI can be set in the TLB like any other TLB entry.
- the partition indicators are set by kernel mode software running on the CPU 2 and it is the responsibility of that kernel mode software to ensure that pages which should not be placed in a particular cache partition do not have their partition indicator bits set for that partition.
- a user may alter partitions by requesting that the cache partitions be altered. In that event, the CPU 2 would change to kernel mode to implement the request, change the TLB entries accordingly and then return to the user mode to allow the user to continue.
- a user can alter the partitioning behaviour of the cache, thus providing much greater flexibility than has hitherto been possible.
- the cache partitioning mechanism described herein is particularly useful for a multi-tasking CPU.
- a multi-tasking processor is capable of running more than one process “simultaneously”.
- the processor executes part of a process and, when that process is halted for some reason, perhaps in need of data or a stimulus to proceed, the processor immediately begins executing another process.
- the processor is always operating even when individual processes may be held up waiting for data or another stimulus to proceed.
- FIG. 5 illustrates diagrammatically such a situation. On the left hand side of FIG. 5 is illustrated the sequence which a processor may undertake to run different processes P 1 ,P 2 ,P 3 ,P 4 . On the right hand side of FIG.
- the processor executes a first sequence of process P 1 , a first sequence of process P 2 , a second sequence of process P 1 , a second sequence of process P 2 and then a first sequence of process P 3 .
- the process P 1 has been fully run by the processor.
- FIG. 6 shows the partitioning of the cache while the processor is running process P 1 , and the change in the partitioning when the processor switches to running P 3 etc.
- FIG. 6 also shows the TLB cache partition indicators for each case.
- FIG. 5 shows the cache partitioned while the processor is running processes P 1 and P 2 .
- the process P 1 may use banks B 1 and B 2 of the cache, but may not use banks B 3 and B 4 .
- the process P 2 may use banks B 3 and B 4 , but not banks B 1 and B 2 . This can be seen in the TLB entries below.
- page 0 has a cache partition indicator allowing it to access banks B 1 and B 2 , but not B 3 and B 4 .
- Pages 1 and 2 have cache partition indicators allowing them to access banks B 3 and B 4 but not B 1 and B 2 .
- Page 3 has a cache partition indicator which prevents it from accessing the cache.
- any attempt by the processor to load data items from the process P 3 into the cache would be prohibited.
- this however is not a disadvantage because, as can be seen, the processor is not intending to execute any part of the process P 3 until it has finished executing process P 1 . If it did for some reason have to execute P 3 , the only downside would be that it would have to make its accesses from direct memory and would not be allowed use of the cache.
- the processor can request kernel mode to allow it to alter the cache partition indicators in the TLB.
- kernel processes do not have access to the cache. Instead they modify the TLB entries for the partition indicators to modify the behaviour of the cache. The change is illustrated on the right hand side of FIG. 6.
- the cache partition indicators prevent the process P 1 from using the cache at all, but allocate banks B 1 and B 2 to the processes P 3 and P 4 , by altering the cache partition indicator for page 3 so that it can access these banks of the cache.
- thus, when the processor is expecting to execute the process P 3 , it now has a cache facility.
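- The repartitioning described for FIG. 6 can be sketched as a change to a small table of partition indicators. The bit assigned to each bank, the mapping of pages 0 to 3 onto processes P 1 , P 2 and P 3 , and the names `deny_all_except` and `switch_to_p3` are assumptions made for illustration.

```python
B1, B2, B3, B4 = 0b0001, 0b0010, 0b0100, 0b1000  # assumed bit per bank
NO_CACHE = 0b1111  # all ones: page may not be loaded into the cache


def deny_all_except(*banks):
    """Indicator with 0 bits for the listed banks and 1 bits elsewhere."""
    allowed = 0
    for bank in banks:
        allowed |= bank
    return NO_CACHE & ~allowed


# While P1 and P2 run: page 0 (P1), pages 1 and 2 (P2), page 3 (P3).
tlb_pi = {
    0: deny_all_except(B1, B2),
    1: deny_all_except(B3, B4),
    2: deny_all_except(B3, B4),
    3: NO_CACHE,
}


def switch_to_p3(pi_table):
    """Kernel mode update once P1 has finished: hand B1 and B2 to page 3."""
    updated = dict(pi_table)
    updated[0] = NO_CACHE
    updated[3] = deny_all_except(B1, B2)
    return updated


after = switch_to_p3(tlb_pi)
```

- Only the indicators change. Items of P 1 already held in banks B 1 and B 2 stay readable until they are displaced by refills for page 3, because look-up does not consult the indicator.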
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
A cache system and method of operating are described in which a cache is connected between a processor and a main memory of a computer. The cache system includes a cache memory having a set of cache partitions. Each cache partition has a plurality of addressable storage locations for holding items fetched from said main memory for use by the processor. The cache system also includes a cache refill mechanism arranged to fetch an item from the main memory and to load said item into the cache memory at one of said addressable storage locations in a cache partition which depends on the address of said item in the main memory. This is achieved by a cache partition access table holding, in association with addresses of items to be cached, respective multi-bit partition indicators identifying one or more cache partitions into which the item is to be loaded.
Description
- The present invention relates to a cache system for operating between a processor and a main memory of a computer.
- As is well known in the art, cache memories are used in computer systems to decrease the access latency to certain data and code and to decrease the memory bandwidth used for that data and code.
- A cache memory can delay, aggregate and reorder memory accesses. A cache memory operates between a processor and a main memory of a computer. Data and/or instructions which are required by the process running on the processor can be held in the cache while that process runs. An access to the cache is normally much quicker than an access to main memory. If the processor does not locate a required data item or instruction in the cache memory, it directly accesses main memory to retrieve it, and the requested data item or instruction is loaded into the cache. There are various known systems for using and refilling cache memories.
- In order to rely on a cache in a real time system, the behaviour of the cache needs to be predictable. That is, there needs to be a reasonable degree of certainty that particular data items or instructions which are expected to be found in the cache will in fact be found there. Most existing refill mechanisms will normally always attempt to place in the cache a requested data item or instructions. In order to do this, they must delete other data items or instructions from the cache. This can result in items being deleted which were expected to be there for later use. This is particularly the case for a multi-tasking processor, or for a processor which has to handle interrupt processes or other unpredictable processes. It is an object of the present invention to provide a cache system which obviates or reduces this disadvantage and provides greater predictability of caching behaviour.
- According to one aspect of the present invention there is provided a cache system for operating between a processor and a main memory of a computer, the cache system comprising:
- a cache memory having a set of cache partitions, each cache partition comprising a plurality of addressable storage locations for holding items fetched from said main memory for use by the processor,
- a cache refill mechanism arranged to fetch an item from the main memory and to load said item into the cache memory at one of said addressable storage locations;
- a cache partition access table holding in association with addresses of items to be cached respective multi-bit partition indicators identifying into which cache partition the item may be loaded,
- wherein the cache refill mechanism is operable to allocate to each said item fetched from the main memory one or more of said cache partitions in dependence on the address of said item in the main memory.
- It is thus quite possible for an item to have access to more than one partition of the cache, or indeed for an item not to be allowed access to the cache at all.
- In the described embodiment, each address in main memory comprises a page number and a line in page number, the page numbers being held in a look-up table in association with their respective partition indicators.
- In a virtual addressing system, the processor issues addresses comprising a virtual page number and line in page number. In that event, the system can comprise a translation look aside buffer for translating the virtual page number to a real page number for accessing the main memory, the translation look aside buffer also holding respective partition indicators in association with the real page numbers for identifying the cache partition into which the addressed item is to be loaded.
- The line in page number of the item addressed can be used to identify the addressable storage location within the cache partition into which the item is to be located. That is, each cache partition is direct-mapped. It will be apparent that it is not necessary to use all of the end bits of the item's address as the line in page number, but merely a set of appropriate bits. These will normally be near the least significant end of the address.
- One or more cache partitions may be allocated to a page in main memory.
- The system can include a cache access circuit which accesses items from the cache memory according to the address in main memory of said items and regardless of the cache partition in which the item is held in the cache memory. That is, the partition indicator is only used on refill and not on look-up. Thus, a cached item can be retrieved from its partition even if subsequent to its caching that partition is now allocated to a different set of addresses.
- According to another aspect of the invention there is provided a method of operating a cache memory arranged between a processor and a main memory of a computer, wherein, when the processor requests an item from main memory using an address in main memory for said item and that item is not held in the cache memory, said item is fetched from the main memory and loaded into one of a plurality of addressable storage locations in the cache memory, the addressable storage locations being arranged as a set of cache partitions and wherein each address is associated with a multi-bit partition indicator identifying into which cache partition the item may be loaded so that one or more of said cache partitions is allocated to said item in dependence on the address of said item in main memory.
- The main memory can hold a plurality of processes, each process including one or more sequences of instructions held at addresses in the main memory within a common page number. Cache partitions can be allocated by associating each cache partition with page numbers of a particular process in the main memory.
- The number of addressable storage locations in each cache partition can be alterable. Also, the association of cache partitions to page numbers can be alterable while a process using these page numbers is being run by the processor.
- The following described embodiment illustrates a cache system which gives protection of the contents of the cache against unexpected eviction by reading from or writing to cache lines from other pages of data which are placed in other partitions. It also provides a system in which the contents of the cache may be predicted.
- For a better understanding of the present invention and to show how the same may be carried into effect, reference will now be made by way of example to the accompanying drawings.
- FIG. 1 is a block diagram of a computer incorporating a cache system;
- FIG. 2 is a sketch illustrating a four way set associative cache;
- FIG. 3 is an example of an entry in a translation look aside buffer;
- FIG. 4 is a block diagram of the refill engine;
- FIG. 5 is a diagram illustrating the operation of a multi-tasking processor;
- FIG. 6 is a diagram illustrating the alteration in caching behaviour for the system of FIG. 5;
- FIG. 7 is a schematic block diagram of a CPU; and
- FIG. 8 is an example of an entry in a TLB in a second embodiment.
- FIG. 1 is a block diagram of a computer incorporating a cache system. The computer comprises a CPU 2 which is connected to an address bus 4 for accessing items from a main memory 6 and to a data bus 8 for returning items to the CPU 2. Although the data bus 8 is referred to herein as a data bus, it will be appreciated that this is for the return of items from the main memory 6, whether or not they constitute actual data or instructions for execution by the CPU. The system described herein is suitable for use on both instruction and data caches. As is known, there may be separate data and instruction caches, or the data and instruction cache may be combined. In the computer described herein, the addressing scheme is a so-called virtual addressing scheme. The address is split into a line in page address 4a and a virtual page address 4b. The virtual page address 4b is supplied to a translation look-aside buffer (TLB) 10. The line in page address 4a is supplied to a look-up circuit 12. The translation look-aside buffer 10 supplies a real page address 14 converted from the virtual page address 4b to the look-up circuit 12. The look-up circuit 12 is connected via address and data buses to a cache access circuit 20. Again, the data bus 18 can be for data items or instructions from the main memory 6. The cache access circuit 20 is connected to a cache memory 22 via an address bus 24, a data bus 26 and a control bus 28 which transfers replacement information for the cache memory 22. A refill engine 30 is connected to the cache access circuit 20 via a refill bus 32 which transfers replacement information, data items (or instructions) and addresses between the refill engine and the cache access circuit. The refill engine 30 is itself connected to the main memory 6.
- The refill engine 30 receives from the translation look-aside buffer 10 a full real address 34, comprising the real page address and line in page address of an item in the main memory 6. The refill engine 30 also receives a partition indicator from the translation look-aside buffer 10 on a four bit bus 36. The function of the partition indicator will be described hereinafter.
- Finally, the refill engine 30 receives a miss signal on line 38 which is generated in the look-up circuit 12 in a manner which will be described more clearly hereinafter.
- The cache memory 22 described herein is a direct mapped cache. That is, it has a plurality of addressable storage locations, each location constituting one row of the cache. Each row contains an item from main memory and the address in main memory of that item. Each row is addressable by a row address which is constituted by a number of bits representing the least significant bits of the address in main memory of the data items stored at that row. For example, for a cache memory having eight rows, each row address would be three bits long to uniquely identify those rows. For example, the second row in the cache has a row address 001 and thus could hold any data items from main memory having an address in the main memory which ends in the bits 001. Clearly, in the main memory, there would be many such addresses and thus potentially many data items to be held at that row in the cache memory. Of course, the cache memory can hold only one data item at that row at any one time.
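- By way of illustration only, the row addressing of the eight-row direct mapped cache described above can be sketched as follows. The eight rows and the three-bit row address come from the text; the names `NUM_ROWS` and `row_address` are hypothetical and are not part of the described embodiment.

```python
# Illustrative sketch only: an eight-row direct mapped cache, as in the text.
# NUM_ROWS and row_address are assumed names, not taken from the embodiment.
NUM_ROWS = 8                           # eight rows
ROW_BITS = NUM_ROWS.bit_length() - 1   # so the row address is three bits long


def row_address(main_memory_address: int) -> int:
    """Return the cache row selected by the least significant address bits."""
    return main_memory_address & (NUM_ROWS - 1)


# Every address ending in the bits 001 competes for the second row (row 001).
competing = [addr for addr in range(32) if row_address(addr) == 0b001]
```

In this sketch the addresses 1, 9, 17 and 25 all map to row 001, so only one of them can be held in the cache at any one time.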
- Operation of the computer system illustrated in FIG. 1 will now be described but as though the partition indicator was not present. The CPU 2 requests an item from main memory 6 using the address in main memory and transmits that address on address bus 4. The virtual page number is supplied to the translation look-aside buffer 10 which translates it into a real page number 14 according to a predetermined virtual to real page translation algorithm. The real page number 14 is supplied to the look-up circuit 12 together with the line in page number 4a of the original address transmitted by the CPU 2. The line in page address is used by the cache access circuit 20 to address the cache memory 22. The line in page address includes a set of least significant bits (not necessarily including the end bits) of the main address in memory which are equivalent to the row address in the cache memory 22. The contents of the cache memory 22 at the row address identified by the line in page address, being a data item (or instruction) and the address in main memory of the data item (or instruction), are supplied to the look-up circuit 12. There, the real page number of the address which has been retrieved from the cache memory is compared with the real page number which has been supplied from the translation look-aside buffer 10. If these addresses match, the look-up circuit indicates a hit which causes the data item which was held at that row of the cache memory to be returned to the CPU along data bus 8. If however the real page number of the address which was held at the addressed row in the cache memory 22 does not match the real page number supplied from the translation look-aside buffer 10, then a miss signal is generated on line 38 to the refill engine 30. It is the task of the refill engine 30 to retrieve the correct item from the main memory 6, using the real address which is supplied from the translation look-aside buffer 10 on bus 34. The data item, once fetched from main memory 6, is supplied to the cache access circuit 20 via the refill bus 32 and is loaded into the cache memory 22 together with the address in main memory. The data item itself is also returned to the CPU along data bus 8 so that the CPU can continue to execute. In a direct mapped cache memory as outlined above, it will be apparent that the data item and its address recalled from the main memory 6 will be loaded into the storage location from which the data item was originally accessed for checking. That is, it will be over-written into the only location which can accept it, having a row address matching the set of least significant bits in the line in page address in main memory. Of course, the page number of the data item originally stored in the cache memory and the data item which is now to be loaded into it are different. This “one to one mapping” limits the usefulness of the cache.
- To provide a cache system with greater flexibility, an n-way set associative cache memory has been developed. An example of a 4-way set associative cache is illustrated in FIG. 2. The cache memory is divided into four banks B1, B2, B3, B4. The banks can be commonly addressed row-wise by a common row address, as illustrated schematically for one row in FIG. 2. However, that row contains four cache entries, one for each bank. The cache entry for bank B1 is output on bus 26a, the cache entry for bank B2 is output on bus 26b, and so on for banks B3 and B4. Thus, this allows four cache entries for one row address (or line in page address). Each time a row is addressed, four cache entries are output and the real page numbers of their addresses are compared with the real page number supplied from the translation look-aside buffer 10 to determine which entry is the correct one. If there is a cache miss upon an attempted access to the cache, the refill engine 30 retrieves the requested item from the main memory 6 and loads it into the correct row in one of the banks, in accordance with a refill algorithm which is based on, for example, how long a particular item has been held in the cache, or other program parameters of the system. Such replacement algorithms are known and are not described further herein.
- Nevertheless, the n-way set associative cache (where n is the number of banks and is equal to four in FIG. 2), while being an improvement on a single direct mapped system, is still inflexible and, more importantly, does not allow the behaviour of the cache to be properly predictable.
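- The conventional 4-way set associative behaviour described above can be sketched as follows. This is a minimal software model under stated assumptions, not the circuit of FIG. 2: the row count of eight, the class and method names, and the "replace the entry held longest" policy are all hypothetical choices made only for illustration (the text notes that the actual replacement algorithm is not described).

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_BANKS = 4   # banks B1 to B4 of FIG. 2
NUM_ROWS = 8    # assumed row count, for illustration only


@dataclass
class Entry:
    real_page: int   # real page number of the item's main memory address
    item: object     # the cached data item or instruction
    loaded_at: int   # refill time stamp used by the assumed replacement policy


class SetAssociativeCache:
    def __init__(self) -> None:
        self.banks: List[List[Optional[Entry]]] = [
            [None] * NUM_ROWS for _ in range(NUM_BANKS)
        ]
        self.clock = 0

    def lookup(self, real_page: int, line_in_page: int):
        """Compare the real page number against all four entries of the row."""
        row = line_in_page % NUM_ROWS
        for bank in self.banks:
            entry = bank[row]
            if entry is not None and entry.real_page == real_page:
                return entry.item   # hit
        return None                 # miss

    def refill(self, real_page: int, line_in_page: int, item) -> int:
        """Load an item after a miss; any bank may be chosen as the victim."""
        row = line_in_page % NUM_ROWS
        self.clock += 1

        def age(bank_index: int) -> int:
            entry = self.banks[bank_index][row]
            return 0 if entry is None else entry.loaded_at  # empty banks first

        victim = min(range(NUM_BANKS), key=age)
        self.banks[victim][row] = Entry(real_page, item, self.clock)
        return victim
```

In this model a fifth page that maps to the same row evicts whichever entry has been held longest, regardless of which process owns it, which is the unpredictability the partitioning mechanism is intended to address.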
- The system described herein provides a cache partitioning mechanism which allows the optimisation of the computer's use of the cache memory by a more flexible cache refill system.
- FIG. 7 is a schematic block diagram of a CPU 2 for use in the computer of FIG. 1. The CPU 2 comprises an execution circuit 15 which is connected to a fetch circuit 17 which is responsible for addressing memory via the memory bus 4 and retrieving data and instructions via the data bus 8. A set of general purpose registers 7 is connected to the execution circuit 15 for holding data and instructions for use in executing a process. In addition, a set of special registers is provided, denoted by reference numerals 9, 11 and 13. There may be any number of special purpose registers and by way of example register 11 holds the instruction pointer which identifies the line of code which is currently being executed. In addition, special register 9 holds a thread status word which defines the status of a process being executed by the CPU 2. The execution circuit 15 is capable of executing one process or sequence of instructions at any one time. However, it is equally capable of interrupting that process and starting to execute another process before the first process has finished executing. There are many reasons why a process currently being executed by the execution circuit 15 may be interrupted. One is that a higher priority interrupt process is to be executed immediately. Another is that the process being executed is currently awaiting data with a long latency, so that it is more efficient for the execution circuit to commence executing a subsequent process while the first process is waiting for that data. When the data has been received, the first process can be rescheduled for execution. The execution of concurrent processes is known per se and is managed by a process handler 19.
- Each process is executed under a so-called “thread” of control. A thread has the following state:
- an instruction pointer which indicates where in the process the thread has advanced to,
- a jump pointer which indicates where the process will branch to next,
- a set of general purpose registers 7 which contain immediately accessible values,
- the mapping of virtual addresses to physical addresses,
- the contents of memory accessible through the virtual addresses,
- control registers accessible by the thread, and
- optionally other values such as a floating point rounding mode, whether the thread has kernel privileges, etc.
- Some of the above state is specified by a small set of values which is referred to herein as a thread status word and which is held in the thread status word register 9. The thread status word specifically holds information about:
- whether the thread is in kernel mode or not,
- which virtual address space the thread can access,
- the floating point flags, trap enables and modes,
- debug information, and
- trap optimisation information.
- The format of the thread status word is defined in Table I.
TABLE I
Name | Bits | Size | Description |
---|---|---|---|
TSW.FPFLAG | 0-7 | 8 | Floating point exception flags. |
TSW.FPTRAP | 8-15 | 8 | Floating point exception traps. |
TSW.FPMODE | 16-19 | 4 | Floating point modes. |
 | 20-31 | | Reserved. |
TSW.USER | 32 | 1 | Kernel mode (0)/user mode (1) |
TSW.SINGLE | 33 | 1 | Single step mode. |
TSW.TLB | 34 | 1 | First level TLB miss handler indicator. |
TSW.WATCH | 35 | 1 | Watchpoints enabled. |
TSW.ENABLE | 36 | 1 | Trap enable. |
 | 37-47 | 11 | Reserved. |
TSW.GROUP | 48-55 | 8 | Group number. |
 | 56-63 | | Reserved. |
- In the translation look-aside buffer 10 in the system described herein, each TLB entry has, associated with the virtual page number, a real page number and an information sequence. An example entry is shown in FIG. 3, where VP represents the virtual page number, RP represents the real page number and INFO represents the information sequence. The information sequence contains various information about the address in memory in a manner which is known and which will not be described further herein. However, according to the presently described system the information sequence additionally contains a partition indicator PI, which in the described embodiment is four bits long. Thus, bits 0 to 3 of the information sequence INFO constitute the partition indicator.
- An alternative arrangement for the TLB entry is illustrated in FIG. 8. As can be seen from Table I, the thread status word includes an 8 bit group number. This is used as described in the following to generate the partition indicator for allocating cache partitions.
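- As a hedged illustration of the layout in Table I, the fields of a 64-bit thread status word, including the 8 bit group number, could be read out with a shift and a mask. The bit positions and sizes are taken from Table I; the assumption that bit 0 is the least significant bit, and the names `TSW_FIELDS` and `tsw_field`, are hypothetical and are not stated in the source.

```python
# Field name -> (lowest bit position, size in bits), copied from Table I.
# Assumption: bit 0 is the least significant bit of the 64-bit word.
TSW_FIELDS = {
    "FPFLAG": (0, 8),
    "FPTRAP": (8, 8),
    "FPMODE": (16, 4),
    "USER": (32, 1),
    "SINGLE": (33, 1),
    "TLB": (34, 1),
    "WATCH": (35, 1),
    "ENABLE": (36, 1),
    "GROUP": (48, 8),
}


def tsw_field(tsw: int, name: str) -> int:
    """Extract one named field from a thread status word by shift and mask."""
    low_bit, size = TSW_FIELDS[name]
    return (tsw >> low_bit) & ((1 << size) - 1)


# Example word: group number 0x2A, user mode, two floating point flags set.
example_tsw = (0x2A << 48) | (1 << 32) | 0b00000101
group_number = tsw_field(example_tsw, "GROUP")
```

The group number extracted in this way is the value which the second embodiment combines with the virtual page number to obtain a partition indicator.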
- In the translation look-aside buffer 10, each TLB entry has, associated with the virtual page number, a real page number and an information sequence. The information sequence contains various information about the address in memory in a manner which is known and which will not be described further herein. However, in this embodiment the information sequence additionally contains a partition code which generates a partition indicator PI dependent on the group number and the virtual page number. This is illustrated diagrammatically in FIG. 8, where VP represents the virtual page number, RP represents the real page number, GN represents the group number and INFO represents the information sequence. In the described embodiment PI is four bits long.
- The partition indicator gives information regarding the partition into which the data item may be placed when it is first loaded into the cache memory 22. For the cache structure illustrated in FIG. 2, each partition can constitute one bank of the cache. In the partition indicator, each bit refers to one of the banks. The value of 1 in bit j of the partition indicator means that the data in that page may not be placed in partition j. The value of 0 in bit j means that the data in that page may be placed in partition j. Data may be placed in more than one partition by having a 0 in more than one bit of the partition indicator. A partition indicator which is all zeros allows the data to be placed in any partition of the cache. A partition indicator which is all ones does not allow any data items to be loaded into the cache memory. This could be used for example for “freezing” the contents of the cache, for example for diagnostic purposes.
- In the example given in FIG. 3, the partition indicator indicates that replacement of data items which have that real page number in main memory may not use banks B1 or B3 but may use banks B2 or B4.
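- The decoding of a four bit partition indicator can be sketched as follows, using the convention stated above that a 1 in bit j prohibits partition j and a 0 permits it. The mapping of bit 0 to bank B1, bit 1 to bank B2 and so on is an assumption made for illustration, as are the names `allowed_partitions` and `bank_names`; the actual bit pattern of FIG. 3 is not reproduced in the text.

```python
from typing import List

NUM_PARTITIONS = 4  # one partition per bank B1 to B4


def allowed_partitions(partition_indicator: int) -> List[int]:
    """Return the partition indices j whose bit j is 0 (placement permitted)."""
    return [
        j for j in range(NUM_PARTITIONS)
        if not (partition_indicator >> j) & 1
    ]


def bank_names(partition_indicator: int) -> List[str]:
    """Name the permitted banks, assuming bit 0 corresponds to bank B1."""
    return ["B%d" % (j + 1) for j in allowed_partitions(partition_indicator)]


# Under the assumed bit order, 0b0101 prohibits B1 and B3 and permits B2 and B4.
example_banks = bank_names(0b0101)
```

With this bit order an indicator of 0b0000 permits all four banks and 0b1111 permits none, matching the "all zeros" and "all ones" cases described above.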
- It is quite possible to allocate more than one bank to a page. In that case, if the line in page address has more bits than the row address for the cache, the partitions would behave as a k-way set associative cache, where k partitions are allocated to a page. Thus, in the described example the real page number of FIG. 3 can use banks B2 and B4. However, it may not use banks B1 and B3.
- The partition information is not used on cache look-up, but only upon cache replacement or refill. Thus, the cache access can locate data items held anywhere in the cache memory, whereas a replacement will only replace data into the allowed partitions for that page address.
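- The distinction drawn above, between look-up which ignores the partition indicator and refill which honours it, can be sketched in software as follows. This is only a model under stated assumptions: the row count, the bit order (bit 0 for bank B1), the round-robin choice among permitted banks and all names are hypothetical, and the decision circuit of the embodiment is not claimed to work this way.

```python
from typing import List, Optional, Tuple

NUM_BANKS = 4   # banks B1 to B4, one partition per bank
NUM_ROWS = 8    # assumed row count, for illustration only


class PartitionedCache:
    def __init__(self) -> None:
        # banks[b][row] holds (real_page, item), or None when empty
        self.banks: List[List[Optional[Tuple[int, object]]]] = [
            [None] * NUM_ROWS for _ in range(NUM_BANKS)
        ]
        self.next_victim = 0  # assumed round-robin pointer

    def lookup(self, real_page: int, line_in_page: int):
        """Search every bank; the partition indicator plays no part here."""
        row = line_in_page % NUM_ROWS
        for bank in self.banks:
            entry = bank[row]
            if entry is not None and entry[0] == real_page:
                return entry[1]
        return None

    def refill(self, real_page: int, line_in_page: int, item,
               partition_indicator: int) -> Optional[int]:
        """Place the item only in a bank whose indicator bit is 0."""
        allowed = [b for b in range(NUM_BANKS)
                   if not (partition_indicator >> b) & 1]
        if not allowed:
            return None  # all ones: the item is not loaded into the cache
        row = line_in_page % NUM_ROWS
        victim = None
        for b in allowed:                 # prefer an empty permitted bank
            if self.banks[b][row] is None:
                victim = b
                break
        if victim is None:                # otherwise rotate among permitted banks
            victim = allowed[self.next_victim % len(allowed)]
            self.next_victim += 1
        self.banks[victim][row] = (real_page, item)
        return victim
```

For example, if page 0 is refilled with indicator 0b1100 (banks B1 and B2 permitted) and other pages are then refilled with indicator 0b0011 (banks B3 and B4 permitted), the item from page 0 remains in the cache however many of the other pages collide on the same row.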
- FIG. 4 illustrates in more detail the content of the refill engine 30. The refill bus 32 is shown in FIG. 4 as three separate buses, a data bus 32a, an address bus 32b and a bus 32c carrying replacement information. The address and data buses 32b and 32a are supplied to a memory access circuit 50 which accesses the main memory via the memory bus 54. The replacement information is fed to a decision circuit 52 which also receives the real address 34, the partition indicator PI on bus 36 and the miss signal 38. The decision circuit 52 determines the proper partition of the cache into which data accessed from the main memory is to be located.
- The partition indicator PI can be set in the TLB like any other TLB entry. In the described example, the partition indicators are set by kernel mode software running on the CPU 2 and it is the responsibility of that kernel mode software to ensure that pages which should not be placed in a particular cache partition do not have their partition indicator bits set for that partition. However, a user may alter partitions by requesting that the cache partitions be altered. In that event, the CPU 2 would change to kernel mode to implement the request, change the TLB entries accordingly and then return to the user mode to allow the user to continue. Thus, a user can alter the partitioning behaviour of the cache, thus providing much greater flexibility than has hitherto been possible.
- The cache partitioning mechanism described herein is particularly useful for a multi-tasking CPU. A multi-tasking processor is capable of running more than one process “simultaneously”. In practice, the processor executes part of a process and, when that process is halted for some reason, perhaps in need of data or a stimulus to proceed, the processor immediately begins executing another process. Thus, the processor is always operating even when individual processes may be held up waiting for data or another stimulus to proceed. FIG. 5 illustrates diagrammatically such a situation. On the left hand side of FIG. 5 is illustrated the sequence which a processor may undertake to run different processes P1, P2, P3, P4. On the right hand side of FIG. 5 is an illustration of where these processes may expect their data to be held in memory. Thus, the data for the process P1 are held on page 0. The data for process P2 are held on pages […]. Processes P3 and P4 share page 3. In the example, the processor executes a first sequence of process P1, a first sequence of process P2, a second sequence of process P1, a second sequence of process P2 and then a first sequence of process P3. When the second sequence of the process P1 has been executed, the process P1 has been fully run by the processor. It will readily be apparent that in a conventional cache system, once the processor has started executing the first sequence of the process P2, and is thus requesting accesses from page 1, the data items and instructions in these lines will replace in the cache the previously stored data items and instructions from page 0. However, these may soon again be required when the second sequence of the process P1 is executed.
- The cache partitioning mechanism described herein avoids the timing delays and uncertainties which can result from this. FIG. 6 shows the partitioning of the cache while the processor is running process P1, and the change in the partitioning when the processor switches to running P3 etc. FIG. 6 also shows the TLB cache partition indicators for each case. Thus, on the left hand side FIG. 6 shows the cache partitioned while the processor is running processes P1 and P2. The process P1 may use banks B1 and B2 of the cache, but may not use banks B3 and B4. Conversely, the process P2 may use banks B3 and B4, but not banks B1 and B2. This can be seen in the TLB entries below. That is, page 0 has a cache partition indicator allowing it to access banks B1 and B2, but not B3 and B4. Pages […]. Page 3 has a cache partition indicator which prevents it from accessing the cache. Thus, any attempt by the processor to load data items from the process P3 into the cache would be prohibited. For the described process sequence, this however is not a disadvantage because, as can be seen, the processor is not intending to execute any part of the process P3 until it has finished executing process P1. If it did for some reason have to execute P3, the only downside would be that it would have to make its accesses directly from main memory and would not be allowed use of the cache.
- When the process P1 has finished executing, the processor can request kernel mode to allow it to alter the cache partition indicators in the TLB. In the described embodiment, kernel processes do not have access to the cache. Instead they modify the TLB entries for the partition indicators to modify the behaviour of the cache. The change is illustrated on the right hand side of FIG. 6. Thus, now the cache partition indicators prevent the process P1 from using the cache at all, but allocate banks B1 and B2 to the processes P3 and P4, by altering the cache partition indicator for page 3 so that it can access these banks of the cache. Thus, when the processor is expecting to execute the process P3, it now has a cache facility.
- It will be appreciated that the present invention is not restricted to the specifically described embodiment above. Some particular possible variations are mentioned below, but this is not a comprehensive list of the variations which are possible. […] would be quite possible to combine their functions into a single cache access circuit which performs both look-up and refill.
Claims (23)
1. A cache system for operating between a processor and a main memory of a computer, the cache system comprising:
a cache memory having a set of cache partitions, each cache partition comprising a plurality of addressable storage locations for holding items fetched from said main memory for use by the processor,
a cache refill mechanism arranged to fetch an item from the main memory and to load said item into the cache memory at one of said addressable storage locations;
a cache partition access table holding in association with addresses of items to be cached respective multi-bit partition indicators identifying into which cache partition the item may be loaded,
wherein the cache refill mechanism is operable to allocate to each said item fetched from the main memory one or more of said cache partitions in dependence on the address of said item in the main memory.
2. A cache system according to claim 1 , wherein each address in main memory comprises a page number and a line in page number, and wherein the page numbers are held in the look-up table in association with their respective partition indicators.
3. A cache system according to claim 1 wherein the processor issues addresses comprising a virtual page number and a line in page number and wherein the system comprises a translation look-aside buffer for translating the virtual page number to a real page number for accessing the main memory, the translation look-aside buffer holding respective partition indicators in association with the real page numbers for identifying the cache partition into which the addressed item is to be loaded.
4. A cache system according to claim 2 , wherein the line in page number of the item's address is used to identify the address storage location within the cache partition into which the item is to be located.
5. A cache system according to claim 3 , wherein the line in page number of the item's address is used to identify the address storage location within the cache partition into which the item is to be located.
6. A cache system according to claim 2 , wherein one or more cache partitions is allocated to a page in main memory.
7. A cache system according to claim 3 , wherein one or more cache partitions is allocated to a page in main memory.
8. A cache system according to claim 4 , wherein one or more cache partitions is allocated to a page in main memory.
9. A cache system according to claim 1 , comprising a cache access circuit which accesses items from the cache memory according to the addresses in main memory of said items and regardless of the cache partition in which the item is held in the cache memory.
10. A cache system according to claim 2 , comprising a cache access circuit which accesses items from the cache memory according to the addresses in main memory of said items and regardless of the cache partition in which the item is held in the cache memory.
11. A cache system according to claim 3 , comprising a cache access circuit which accesses items from the cache memory according to the addresses in main memory of said items and regardless of the cache partition in which the item is held in the cache memory.
12. A cache system according to claim 4 , comprising a cache access circuit which accesses items from the cache memory according to the addresses in main memory of said items and regardless of the cache partition in which the item is held in the cache memory.
13. A cache system according to claim 5 , comprising a cache access circuit which accesses items from the cache memory according to the addresses in main memory of said items and regardless of the cache partition in which the item is held in the cache memory.
14. A method of operating a cache memory arranged between a processor and a main memory of a computer, wherein, when the processor requests an item from main memory using an address in main memory for said item and that item is not held in the cache memory, said item is fetched from the main memory and loaded into one of a plurality of addressable storage locations in the cache memory, the addressable storage locations being arranged as a set of cache partitions and wherein each address is associated with a multi-bit partition indicator identifying into which cache partition the item may be loaded so that one or more of said cache partitions is allocated to said item in dependence on the address of said item in main memory.
15. A method according to claim 14, wherein each address in main memory comprises a page number and a line in page number and wherein a plurality of processes are held in the main memory, each process including one or more sequences of instructions held at addresses in main memory with a common page number.
16. A method according to claim 15, wherein one of said cache partitions is allocated to a process by associating said one cache partition with page numbers of that process in the main memory.
17. A method according to claim 14, wherein the number of addressable storage locations in each cache partition is alterable.
18. A method according to claim 15, wherein the number of addressable storage locations in each cache partition is alterable.
19. A method according to claim 16, wherein the number of addressable storage locations in each cache partition is alterable.
20. A method according to claim 14, wherein the association of cache partitions to page numbers is alterable while a process using these page numbers is being run by the processor.
21. A method according to claim 15, wherein the association of cache partitions to page numbers is alterable while a process using these page numbers is being run by the processor.
22. A method according to claim 16, wherein the association of cache partitions to page numbers is alterable while a process using these page numbers is being run by the processor.
23. A method according to claim 17, wherein the association of cache partitions to page numbers is alterable while a process using these page numbers is being run by the processor.
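The method of claims 14 to 23 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class `PartitionedCache`, its method names, the 12-bit page offset, the FIFO replacement within a partition, and the treatment of a zero indicator as "do not cache" are all hypothetical choices made for illustration. What it does take from the claims is that lookup uses only the main-memory address, that a miss loads the item into a partition permitted by the multi-bit indicator associated with the page number, and that both partition sizes and page-to-partition associations can be altered at run time.

```python
PAGE_BITS = 12  # assumption: 4 KiB pages; the claims do not fix a page size


class PartitionedCache:
    """Toy model of a cache whose storage locations are grouped into partitions."""

    def __init__(self, partition_sizes, main_memory):
        self.sizes = list(partition_sizes)          # locations per partition
        self.partitions = [dict() for _ in self.sizes]  # address -> item
        self.page_indicator = {}                    # page number -> partition bit mask
        self.memory = main_memory
        self.misses = 0

    def set_indicator(self, page_number, mask):
        """Associate a page with one or more partitions (alterable while running)."""
        self.page_indicator[page_number] = mask

    def resize(self, index, new_size):
        """Alter the number of storage locations in one partition."""
        self.sizes[index] = new_size
        part = self.partitions[index]
        while len(part) > new_size:
            part.pop(next(iter(part)))  # drop oldest entries that no longer fit

    def read(self, address):
        # Lookup depends only on the address, not on which partition holds the item.
        for part in self.partitions:
            if address in part:
                return part[address]
        # Miss: fetch from main memory, then load into a permitted partition.
        self.misses += 1
        item = self.memory[address]
        mask = self.page_indicator.get(address >> PAGE_BITS, 0)
        allowed = [i for i in range(len(self.partitions))
                   if (mask >> i) & 1 and self.sizes[i] > 0]
        if allowed:  # assumption: a zero mask means the item bypasses the cache
            # Prefer a permitted partition with a free location, else the first one.
            target = next((i for i in allowed
                           if len(self.partitions[i]) < self.sizes[i]), allowed[0])
            part = self.partitions[target]
            while len(part) >= self.sizes[target]:
                part.pop(next(iter(part)))  # FIFO eviction (an assumption)
            part[address] = item
        return item


# Usage: page 0 may only fill partition 0, page 1 may only fill partition 1.
memory = {addr: addr * 2 for addr in range(2 << PAGE_BITS)}
cache = PartitionedCache([2, 2], memory)
cache.set_indicator(0, 0b01)
cache.set_indicator(1, 0b10)
cache.read(1 << PAGE_BITS)       # one item from page 1
for addr in range(8):            # heavy traffic on page 0
    cache.read(addr)
print((1 << PAGE_BITS) in cache.partitions[1])  # the page 1 item is still resident
```

In this model the traffic on page 0 can only evict items in partition 0, so the item from page 1 survives. Calling `set_indicator(0, 0b11)` afterwards would let page 0 spill into partition 1 as well, which mirrors the run-time alterable association of claims 20 to 23.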
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9701960.8 | 1997-01-30 | | |
GBGB9701960.8A GB9701960D0 (en) | 1997-01-30 | 1997-01-30 | A cache system |
GB9701960 | 1997-01-30 | | |
GB9725437.9 | 1997-12-01 | | |
GBGB9725437.9A GB9725437D0 (en) | 1997-12-01 | 1997-12-01 | A Cache system for concurrent processes |
GB9725437 | 1997-12-01 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
US20020083269A1 (en) | 2002-06-27 |
US6453385B1 (en) | 2002-09-17 |
Family
ID=26310901
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/014,315 Expired - Lifetime US6453385B1 (en) | 1997-01-30 | 1998-01-27 | Cache system |
US09/014,194 Expired - Lifetime US6295580B1 (en) | 1997-01-30 | 1998-01-27 | Cache system for concurrent processes |
US09/924,289 Expired - Lifetime US6629208B2 (en) | 1997-01-30 | 2001-08-08 | Cache system for concurrent processes |
Family Applications After (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/014,194 Expired - Lifetime US6295580B1 (en) | 1997-01-30 | 1998-01-27 | Cache system for concurrent processes |
US09/924,289 Expired - Lifetime US6629208B2 (en) | 1997-01-30 | 2001-08-08 | Cache system for concurrent processes |
Country Status (4)
Country | Link |
---|---|
US (3) | US6453385B1 (en) |
EP (2) | EP0856797B1 (en) |
JP (2) | JPH10232839A (en) |
DE (2) | DE69814703D1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020007442A1 (en) * | 1997-03-05 | 2002-01-17 | Glenn Farrall | Cache coherency mechanism |
US20130132639A1 (en) * | 2011-11-23 | 2013-05-23 | Smart Modular Technologies, Inc. | Non-volatile memory packaging system with caching and method of operation thereof |
Families Citing this family (86)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0856797B1 (en) * | 1997-01-30 | 2003-05-21 | STMicroelectronics Limited | A cache system for concurrent processes |
GB9701960D0 (en) * | 1997-01-30 | 1997-03-19 | Sgs Thomson Microelectronics | A cache system |
JPH10340197A (en) * | 1997-06-09 | 1998-12-22 | Nec Corp | Cashing control method and microcomputer |
US6260114B1 (en) * | 1997-12-30 | 2001-07-10 | Mcmz Technology Innovations, Llc | Computer cache memory windowing |
US6349363B2 (en) * | 1998-12-08 | 2002-02-19 | Intel Corporation | Multi-section cache with different attributes for each section |
GB9901933D0 (en) * | 1999-01-28 | 1999-03-17 | Univ Bristol | Cache memory |
US6453321B1 (en) * | 1999-02-11 | 2002-09-17 | Ibm Corporation | Structured cache for persistent objects |
US6493800B1 (en) * | 1999-03-31 | 2002-12-10 | International Business Machines Corporation | Method and system for dynamically partitioning a shared cache |
US6535905B1 (en) | 1999-04-29 | 2003-03-18 | Intel Corporation | Method and apparatus for thread switching within a multithreaded processor |
US6625693B2 (en) * | 1999-05-04 | 2003-09-23 | Intel Corporation | Fast exception processing |
US6341347B1 (en) * | 1999-05-11 | 2002-01-22 | Sun Microsystems, Inc. | Thread switch logic in a multiple-thread processor |
US6542921B1 (en) | 1999-07-08 | 2003-04-01 | Intel Corporation | Method and apparatus for controlling the processing priority between multiple threads in a multithreaded processor |
JP4497689B2 (en) * | 1999-10-01 | 2010-07-07 | キヤノン株式会社 | Printing device, exchange unit, and memory unit |
US7689462B1 (en) | 1999-10-28 | 2010-03-30 | Ebay Inc. | Computer system and method for providing an on-line mall |
US6457102B1 (en) * | 1999-11-05 | 2002-09-24 | Emc Corporation | Cache using multiple LRU's |
AU7728300A (en) * | 1999-11-22 | 2001-06-04 | Ericsson Inc. | Buffer memories, methods and systems for buffering having seperate buffer memories for each of a plurality of tasks |
US6496925B1 (en) | 1999-12-09 | 2002-12-17 | Intel Corporation | Method and apparatus for processing an event occurrence within a multithreaded processor |
US6889319B1 (en) | 1999-12-09 | 2005-05-03 | Intel Corporation | Method and apparatus for entering and exiting multiple threads within a multithreaded processor |
US7051329B1 (en) | 1999-12-28 | 2006-05-23 | Intel Corporation | Method and apparatus for managing resources in a multithreaded processor |
US6662297B1 (en) * | 1999-12-30 | 2003-12-09 | Intel Corporation | Allocation of processor bandwidth by inserting interrupt servicing instructions to intervene main program in instruction queue mechanism |
US7856633B1 (en) | 2000-03-24 | 2010-12-21 | Intel Corporation | LRU cache replacement for a partitioned set associative cache |
US6587937B1 (en) * | 2000-03-31 | 2003-07-01 | Rockwell Collins, Inc. | Multiple virtual machine system with efficient cache memory design |
WO2002008911A1 (en) * | 2000-07-24 | 2002-01-31 | Hitachi,Ltd | Data processing system |
JP2002189603A (en) * | 2000-12-19 | 2002-07-05 | Fujitsu Ltd | Computer and controlling method thereof |
US6961773B2 (en) * | 2001-01-19 | 2005-11-01 | Esoft, Inc. | System and method for managing application service providers |
US6604175B2 (en) * | 2001-03-01 | 2003-08-05 | Sony Corporation | Data cache and method of storing data by assigning each independently cached area in the cache to store data associated with one item type |
US7555561B2 (en) * | 2001-03-19 | 2009-06-30 | The Aerospace Corporation | Cooperative adaptive web caching routing and forwarding web content data broadcasting method |
JP2002342163A (en) * | 2001-05-15 | 2002-11-29 | Fujitsu Ltd | Method for controlling cache for multithread processor |
US20030041213A1 (en) * | 2001-08-24 | 2003-02-27 | Yakov Tokar | Method and apparatus for using a cache memory |
JP2003069639A (en) * | 2001-08-27 | 2003-03-07 | Nec Corp | xDSL STORAGE DEVICE, MULTICAST DELIVERY SYSTEM, AND DATA DELIVERY METHOD |
US6768358B2 (en) * | 2001-08-29 | 2004-07-27 | Analog Devices, Inc. | Phase locked loop fast power up methods and apparatus |
US6848026B2 (en) * | 2001-11-09 | 2005-01-25 | International Business Machines Corporation | Caching memory contents into cache partitions based on memory locations |
US6857046B1 (en) * | 2002-03-28 | 2005-02-15 | Cisco Technology, Inc. | Caching for context switching applications |
US6857937B2 (en) * | 2002-05-30 | 2005-02-22 | Komag, Inc. | Lapping a head while powered up to eliminate expansion of the head due to heating |
US8024735B2 (en) | 2002-06-14 | 2011-09-20 | Intel Corporation | Method and apparatus for ensuring fairness and forward progress when executing multiple threads of execution |
JP3900025B2 (en) * | 2002-06-24 | 2007-04-04 | 日本電気株式会社 | Hit determination control method for shared cache memory and hit determination control method for shared cache memory |
US7062606B2 (en) * | 2002-11-01 | 2006-06-13 | Infineon Technologies Ag | Multi-threaded embedded processor using deterministic instruction memory to guarantee execution of pre-selected threads during blocking events |
JP4664586B2 (en) * | 2002-11-11 | 2011-04-06 | パナソニック株式会社 | Cache control device, cache control method, and computer system |
US7088950B2 (en) * | 2002-11-26 | 2006-08-08 | Nokia Corporation | Method and apparatus for controlling integrated receiver operation in a communications terminal |
US7103748B2 (en) * | 2002-12-12 | 2006-09-05 | International Business Machines Corporation | Memory management for real-time applications |
KR100985239B1 (en) * | 2003-02-24 | 2010-10-04 | 엔엑스피 비 브이 | Reducing cache trashing of certain pieces |
KR20050116811A (en) * | 2003-03-06 | 2005-12-13 | 코닌클리즈케 필립스 일렉트로닉스 엔.브이. | Data processing system with cache optimised for processing dataflow applications |
CA2435148A1 (en) | 2003-07-15 | 2005-01-15 | Robert J. Blainey | System and method for lock caching for compound atomic operations on shared memory |
CN1879092B (en) * | 2003-11-12 | 2010-05-12 | 松下电器产业株式会社 | Cache memory and control method thereof |
US7590830B2 (en) * | 2004-05-28 | 2009-09-15 | Sun Microsystems, Inc. | Method and structure for concurrent branch prediction in a processor |
US9124653B2 (en) * | 2004-09-03 | 2015-09-01 | Symantec Corporation | Method and apparatus for allowing sharing of streamable applications |
US7257678B2 (en) * | 2004-10-01 | 2007-08-14 | Advanced Micro Devices, Inc. | Dynamic reconfiguration of cache memory |
US7472224B1 (en) | 2004-10-01 | 2008-12-30 | Advanced Micro Devices, Inc. | Reconfigurable processing node including first and second processor cores |
WO2006082554A2 (en) * | 2005-02-02 | 2006-08-10 | Koninklijke Philips Electronics N.V. | Data processing system comprising a cache unit |
US8489846B1 (en) | 2005-06-24 | 2013-07-16 | Rockwell Collins, Inc. | Partition processing system and method for reducing computing problems |
US20070136177A1 (en) * | 2005-12-09 | 2007-06-14 | Ebay Inc. | Registry for on-line auction system |
US8275942B2 (en) * | 2005-12-22 | 2012-09-25 | Intel Corporation | Performance prioritization in multi-threaded processors |
US20070162475A1 (en) * | 2005-12-30 | 2007-07-12 | Intel Corporation | Method and apparatus for hardware-based dynamic escape detection in managed run-time environments |
WO2007099483A2 (en) * | 2006-03-02 | 2007-09-07 | Nxp B.V. | Method and apparatus for dynamic resizing of cache partitions based on the execution phase of tasks |
US8898652B2 (en) * | 2006-03-23 | 2014-11-25 | Microsoft Corporation | Cache metadata for accelerating software transactional memory |
US20080010413A1 (en) * | 2006-07-07 | 2008-01-10 | Krishnan Kunjunny Kailas | Method and apparatus for application-specific dynamic cache placement |
US20080024489A1 (en) * | 2006-07-28 | 2008-01-31 | Robert Allen Shearer | Cache Utilization Optimized Ray Traversal Algorithm with Minimized Memory Bandwidth Requirements |
US7565492B2 (en) | 2006-08-31 | 2009-07-21 | Intel Corporation | Method and apparatus for preventing software side channel attacks |
US7610448B2 (en) * | 2006-12-27 | 2009-10-27 | Intel Corporation | Obscuring memory access patterns |
US8230154B2 (en) * | 2007-01-19 | 2012-07-24 | Spansion Llc | Fully associative banking for memory |
US7681020B2 (en) * | 2007-04-18 | 2010-03-16 | International Business Machines Corporation | Context switching and synchronization |
JP5245349B2 (en) | 2007-10-17 | 2013-07-24 | 日本電気株式会社 | Registration destination way fixing method, processor, and information processing apparatus |
US8095736B2 (en) * | 2008-02-25 | 2012-01-10 | Telefonaktiebolaget Lm Ericsson (Publ) | Methods and systems for dynamic cache partitioning for distributed applications operating on multiprocessor architectures |
US8250305B2 (en) * | 2008-03-19 | 2012-08-21 | International Business Machines Corporation | Method, system and computer program product for data buffers partitioned from a cache array |
JP5239890B2 (en) * | 2009-01-21 | 2013-07-17 | トヨタ自動車株式会社 | Control device |
US8108650B2 (en) | 2009-05-29 | 2012-01-31 | Apple Inc. | Translation lookaside buffer (TLB) with reserved areas for specific sources |
US8250332B2 (en) * | 2009-06-11 | 2012-08-21 | Qualcomm Incorporated | Partitioned replacement for cache memory |
US8543769B2 (en) * | 2009-07-27 | 2013-09-24 | International Business Machines Corporation | Fine grained cache allocation |
US8745618B2 (en) * | 2009-08-25 | 2014-06-03 | International Business Machines Corporation | Cache partitioning with a partition table to effect allocation of ways and rows of the cache to virtual machine in virtualized environments |
MX345332B (en) * | 2010-01-29 | 2017-01-25 | Nestec Sa | Extruded animal litters having an increased absorption rate. |
US9104583B2 (en) | 2010-06-24 | 2015-08-11 | International Business Machines Corporation | On demand allocation of cache buffer slots |
FR2962567B1 (en) * | 2010-07-12 | 2013-04-26 | Bull Sas | METHOD FOR OPTIMIZING MEMORY ACCESS, WHEN RE-EXECUTING AN APPLICATION, IN A MICROPROCESSOR COMPRISING SEVERAL LOGICAL HEARTS AND COMPUTER PROGRAM USING SUCH A METHOD |
US8868843B2 (en) | 2011-11-30 | 2014-10-21 | Advanced Micro Devices, Inc. | Hardware filter for tracking block presence in large caches |
US9824013B2 (en) | 2012-05-08 | 2017-11-21 | Qualcomm Incorporated | Per thread cacheline allocation mechanism in shared partitioned caches in multi-threaded processors |
US9336147B2 (en) | 2012-06-12 | 2016-05-10 | Microsoft Technology Licensing, Llc | Cache and memory allocation for virtual machines |
US9432806B2 (en) | 2012-12-04 | 2016-08-30 | Ebay Inc. | Dynamic geofence based on members within |
US9098417B2 (en) * | 2012-12-13 | 2015-08-04 | Advanced Micro Devices, Inc. | Partitioning caches for sub-entities in computing devices |
JP6088951B2 (en) | 2013-09-20 | 2017-03-01 | 株式会社東芝 | Cache memory system and processor system |
JP5992592B1 (en) | 2015-09-16 | 2016-09-14 | 株式会社東芝 | Cache memory system |
US20170083441A1 (en) * | 2015-09-23 | 2017-03-23 | Qualcomm Incorporated | Region-based cache management |
US10197999B2 (en) | 2015-10-16 | 2019-02-05 | Lemmings, Llc | Robotic golf caddy |
US9852084B1 (en) | 2016-02-05 | 2017-12-26 | Apple Inc. | Access permissions modification |
US20190034337A1 (en) * | 2017-12-28 | 2019-01-31 | Intel Corporation | Multi-level system memory configurations to operate higher priority users out of a faster memory level |
CN111984197B (en) * | 2020-08-24 | 2023-12-15 | 许昌学院 | Computer cache allocation method |
US11860780B2 (en) | 2022-01-28 | 2024-01-02 | Pure Storage, Inc. | Storage cache management |
WO2024006371A1 (en) | 2022-06-28 | 2024-01-04 | Apple Inc. | Pc-based computer permissions |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS58147879A (en) * | 1982-02-26 | 1983-09-02 | Toshiba Corp | Control system of cache memory |
US4905141A (en) * | 1988-10-25 | 1990-02-27 | International Business Machines Corporation | Partitioned cache memory with partition look-aside table (PLAT) for early partition assignment identification |
US5875464A (en) * | 1991-12-10 | 1999-02-23 | International Business Machines Corporation | Computer system with private and shared partitions in cache |
US5487162A (en) * | 1992-02-25 | 1996-01-23 | Matsushita Electric Industrial Co., Ltd. | Cache lock information feeding system using an address translator |
WO1995012165A1 (en) * | 1993-10-22 | 1995-05-04 | Gestalt Technologies, Incorporated | Distributed management in a partitioned memory system |
US5537635A (en) * | 1994-04-04 | 1996-07-16 | International Business Machines Corporation | Method and system for assignment of reclaim vectors in a partitioned cache with a virtual minimum partition size |
US5584014A (en) * | 1994-12-20 | 1996-12-10 | Sun Microsystems, Inc. | Apparatus and method to preserve data in a set associative memory device |
US5796944A (en) * | 1995-07-12 | 1998-08-18 | 3Com Corporation | Apparatus and method for processing data frames in an internetworking device |
JP3348367B2 (en) * | 1995-12-06 | 2002-11-20 | 富士通株式会社 | Multiple access method and multiple access cache memory device |
US5809522A (en) * | 1995-12-18 | 1998-09-15 | Advanced Micro Devices, Inc. | Microprocessor system with process identification tag entries to reduce cache flushing after a context switch |
GB2311880A (en) * | 1996-04-03 | 1997-10-08 | Advanced Risc Mach Ltd | Partitioned cache memory |
EP0856797B1 (en) * | 1997-01-30 | 2003-05-21 | STMicroelectronics Limited | A cache system for concurrent processes |
1998
- 1998-01-26: EP application EP98300515A, patent EP0856797B1 (en), not active (Expired - Lifetime)
- 1998-01-26: DE application DE69814703T, patent DE69814703D1 (en), not active (Expired - Lifetime)
- 1998-01-26: EP application EP98300518A, patent EP0856798B1 (en), not active (Expired - Lifetime)
- 1998-01-26: DE application DE69826539T, patent DE69826539D1 (en), not active (Expired - Lifetime)
- 1998-01-27: US application US09/014,315, patent US6453385B1 (en), not active (Expired - Lifetime)
- 1998-01-27: US application US09/014,194, patent US6295580B1 (en), not active (Expired - Lifetime)
- 1998-01-30: JP application JP10019729A, patent JPH10232839A (en), active (Pending)
- 1998-01-30: JP application JP10019476A, patent JPH10232834A (en), active (Pending)
2001
- 2001-08-08: US application US09/924,289, patent US6629208B2 (en), not active (Expired - Lifetime)
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020007442A1 (en) * | 1997-03-05 | 2002-01-17 | Glenn Farrall | Cache coherency mechanism |
US6546467B2 (en) * | 1997-03-05 | 2003-04-08 | Sgs-Thomson Microelectronics Limited | Cache coherency mechanism using an operation to be executed on the contents of a location in a cache specifying an address in main memory |
US20130132639A1 (en) * | 2011-11-23 | 2013-05-23 | Smart Modular Technologies, Inc. | Non-volatile memory packaging system with caching and method of operation thereof |
US9424188B2 (en) * | 2011-11-23 | 2016-08-23 | Smart Modular Technologies, Inc. | Non-volatile memory packaging system with caching and method of operation thereof |
Also Published As
Publication number | Publication date |
---|---|
US6295580B1 (en) | 2001-09-25 |
EP0856798A1 (en) | 1998-08-05 |
EP0856798B1 (en) | 2004-09-29 |
US6453385B1 (en) | 2002-09-17 |
DE69826539D1 (en) | 2004-11-04 |
US6629208B2 (en) | 2003-09-30 |
EP0856797B1 (en) | 2003-05-21 |
JPH10232839A (en) | 1998-09-02 |
DE69814703D1 (en) | 2003-06-26 |
US20020002657A1 (en) | 2002-01-03 |
JPH10232834A (en) | 1998-09-02 |
EP0856797A1 (en) | 1998-08-05 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US6453385B1 (en) | Cache system | |
US7437514B2 (en) | Cache system | |
US5410669A (en) | Data processor having a cache memory capable of being used as a linear ram bank | |
US5974508A (en) | Cache memory system and method for automatically locking cache entries to prevent selected memory items from being replaced | |
US6092172A (en) | Data processor and data processing system having two translation lookaside buffers | |
US6772316B2 (en) | Method and apparatus for updating and invalidating store data | |
US8806177B2 (en) | Prefetch engine based translation prefetching | |
US6625714B1 (en) | Parallel distributed function translation lookaside buffer | |
US6546467B2 (en) | Cache coherency mechanism using an operation to be executed on the contents of a location in a cache specifying an address in main memory | |
US20070094450A1 (en) | Multi-level cache architecture having a selective victim cache | |
US5715427A (en) | Semi-associative cache with MRU/LRU replacement | |
JPH08272682A (en) | Tag separated at inside of load/store unit provided with load buffer and method for access to data array as well as apparatus provided with said array | |
US6751700B2 (en) | Date processor and storage system including a set associative cache with memory aliasing | |
US5603008A (en) | Computer system having cache memories with independently validated keys in the TLB | |
US6311253B1 (en) | Methods for caching cache tags | |
JPH08314802A (en) | Cache system,cache memory address unit and method for operation of cache memory | |
US6766435B1 (en) | Processor with a general register set that includes address translation registers | |
EP0611462B1 (en) | Memory unit including a multiple write cache | |
KR0173854B1 (en) | Cache memory control method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: SGS-THOMSON MICROELECTRONICS LIMITED, UNITED KINGD; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: STURGES, ANDREW CRAIG; MAY, DAVID; FARRALL, GLENN; AND OTHERS; REEL/FRAME: 009137/0370; SIGNING DATES FROM 19980218 TO 19980223 |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| FPAY | Fee payment | Year of fee payment: 4 |
| FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| FPAY | Fee payment | Year of fee payment: 8 |
| FPAY | Fee payment | Year of fee payment: 12 |