US20180157593A1 - Value cache in a computing system - Google Patents

Value cache in a computing system

Publication number
US20180157593A1
Authority
US
United States
Prior art keywords
value
cache
specified
primary
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US15/372,135
Other versions
US9990301B1 (en
Inventor
Shobhit O. Kanaujia
Kalyan Saladi
Narsing Vijayrao
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meta Platforms Inc
Original Assignee
Facebook Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Facebook Inc
Priority to US15/372,135, patent US9990301B1
Assigned to FACEBOOK, INC. Assignment of assignors interest (see document for details). Assignors: VIJAYRAO, NARSING; KANAUJIA, SHOBHIT O; SALADI, KALYAN
Application granted
Publication of US9990301B1
Publication of US20180157593A1
Assigned to META PLATFORMS, INC. Change of name (see document for details). Assignors: FACEBOOK, INC.
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0891 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using clearing, invalidating or resetting means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60 Details of cache memory

Definitions

  • Caching is a process in which data is stored in a central processing unit (CPU) cache of a computer to reduce the average cost, e.g., time or other computing resources, to access data from the main memory.
  • The cache is a smaller, faster memory that stores copies of the data from frequently used main memory locations.
  • Cache management is a process that controls the movement of data between the cache and the main memory. The greater the number of cache hits, the lower the average cost of accessing the data. Cache hits can be maximized by storing all or most of the data from the memory in the cache.
  • However, cache memory is very expensive, and the cost of the cache increases exponentially with its size. A cache that can store all or most of the data from the memory may not be practical. Accordingly, cache management processes use various caching policies to selectively store the data that is requested more often, so as to maximize cache hits.
  • Prior cache management systems cache data items that are more frequently accessed. At least some of these data items can be duplicates. That is, the cache can store multiple copies of a data item. For example, if a cache management process determines that memory locations, e.g., “address 1” and “address 2,” are more frequently accessed and have to be cached, the values of “address 1” and “address 2” are cached regardless of whether those addresses store the same or different values. Storing duplicate values reduces the amount of cache storage available for storing distinct values, which can reduce the number of cache hits. Further, it can also increase cache pollution and memory bandwidth consumption.
  • FIG. 1 is a block diagram of an environment in which the disclosed embodiments may be implemented.
  • FIG. 2 is a block diagram of hierarchy of a primary cache and value cache of the computer system of FIG. 1 , consistent with various embodiments.
  • FIG. 3 is a block diagram illustrating an example movement of data between the primary cache and the value cache, consistent with various embodiments.
  • FIG. 4 is a block diagram illustrating an example of executing a data access request in the value cache of FIG. 1 , consistent with various embodiments.
  • FIG. 5 is a flow diagram of a process for promoting a candidate value to the value cache, consistent with various embodiments.
  • FIG. 6 is a flow diagram of a process for promoting a candidate value to the value cache, consistent with various embodiments.
  • FIG. 7 is a flow diagram of a process for processing a data access request in the computer system of FIG. 1 , consistent with various embodiments.
  • FIG. 8 is a flow diagram of a process for processing a data update request in the computer system of FIG. 1 , consistent with various embodiments.
  • FIG. 9 is a block diagram of a processing system that can implement operations, consistent with various embodiments.
  • Embodiments are directed to a cache management system for storing data in a cache of a computer system in a compact form.
  • The cache management system compresses multiple cache blocks of the primary cache storing the same value into a single cache block storing that value.
  • The cache management system identifies multiple instances or occurrences of a candidate value stored in the primary cache and stores those multiple instances of the candidate value as a single value, thereby making more space available in the primary cache for storing distinct values.
  • The cache management system reduces cache pollution and memory bandwidth consumption.
  • Such compression of data in the cache can minimize the amount of silicon real estate needed for storing a given amount of data.
  • The cache management system facilitates storing more values, e.g., distinct values, in a primary cache of a given size.
  • The cache management system includes a value cache that stores values that occur multiple times in the primary cache.
  • The cache management system identifies a candidate value in the primary cache that satisfies a specified criterion, e.g., the number of occurrences of the candidate value exceeding a specified threshold, for being promoted to the value cache, and then stores the candidate value as a single instance in a value buffer of the value cache. Further, the cache management system also stores multiple pointers to the candidate value in the value cache, in which each of the pointers corresponds to an address in a memory of the computer system storing an instance of the candidate value.
  • The cache management system can store as many pointers as the number of occurrences of the candidate value in the primary cache.
  • the pointers can be stored in a pointer array of the value cache.
  • the value cache has the same hierarchy as the primary cache, e.g., if the primary cache has L1, L2, L3 hierarchy, then the value cache also has the same hierarchy.
  • the cache management system can allocate at least a portion of the primary cache to form the value cache.
  • the value cache and the primary cache can be exclusive to each other, e.g., a value is either stored in the value cache or the primary cache.
  • the cache management system can execute a read operation at both the value cache and the primary cache, e.g., in parallel. If there is a hit for the specified memory address in the value cache, the specified value is returned from the value cache. If there is a miss in the value cache and a hit in the primary cache, the specified value is returned from the primary cache. If there is a miss in both the value cache and the primary cache, then the specified value is obtained from the memory. In some embodiments, the specified value may be written to the primary cache after being obtained from the memory depending on caching policies used, e.g., a cache policy for storing most recently used value.
  • To write a specified value to a specified memory address, the cache management system can determine if there is a hit for the specified memory address in the value cache (which means the write operation is updating the current value stored at the specified memory address), and if there is a hit, the cache management system can determine if the specified value is different from the current value stored in the cache. If the specified value is different from the current value, the specified memory address is evicted from the value cache and the specified value is written to the memory at the specified memory address. Once the specified value is written to the memory, the specified value may also be written to the primary cache based on the caching policies used, e.g., a cache policy for storing the most recently used value. If the specified value is the same as the current value, the specified memory address is added to the value cache if it is not already present.
  • If there is a miss in the value cache, the cache management system determines if there is a hit in the primary cache. If there is a hit in the primary cache, the current value in the primary cache is updated with the specified value, which can be further pushed to the memory based on cache eviction policies. If there is a hit in neither the primary cache nor the value cache, the specified value is written to the specified memory address in the memory. After writing the specified value to the memory, the specified value can also be written to the primary cache based on the caching policies implemented in the computer system.
  • The cache management system can determine if there are multiple instances of the specified value in the primary cache and, if there are multiple instances, determine if the specified value satisfies the criterion for being promoted to the value cache. If the specified value satisfies the criterion, the specified value is promoted to the value cache and all the instances of the specified value in the primary cache are evicted, thereby making storage space available for storing a greater number of distinct values.
  • FIG. 1 is a block diagram of an environment 100 in which the disclosed embodiments may be implemented.
  • the environment 100 includes a computer system 150 in which the disclosed embodiments can be implemented.
  • the computer system 150 can be any type of computing device in which a cache can be implemented, e.g., a server computing device, a desktop, a laptop, a tablet PC, a smartphone, and a wearable computing device.
  • the computer system 150 includes a processor 105 that executes various operations of the computer system 150 , e.g., read and/or write operations.
  • the computer system 150 includes a memory 125 , which can store data.
  • the data can be received from another entity, e.g., another computer system (not illustrated), retrieved from a secondary storage system of the computer system 150 , e.g., a disk drive (not illustrated) and/or generated by the computer system 150 .
  • the memory 125 can be a random access memory (RAM) or a variation thereof.
  • the computer system 150 includes (a) a primary cache 120 that caches data, e.g., from the memory 125 , based on various caching policies, and (b) a value cache 115 that stores a candidate value from the primary cache 120 in a compact and/or compressed form.
  • The read and/or write latencies of the primary cache 120 are typically lower than those of the memory 125.
  • The primary cache 120 can be made up of multiple caches that are organized into various hierarchies, an example of which is illustrated in FIG. 2 (described below in detail).
  • the hierarchy of the value cache 115 is the same as the hierarchy of the primary cache 120 , e.g., as illustrated in FIG. 2 .
  • the computer system 150 includes a cache management component 110 that among various other operations manages movement of data between the primary cache 120 and the value cache 115 .
  • the cache management component 110 identifies a candidate value in the primary cache 120 that satisfies the criterion for being promoted to the value cache 115 and promotes it to the value cache 115 .
  • the criterion can be based on a number of instances or copies of the candidate value being stored in the primary cache 120 . For example, the criterion can be that the number of instances of the candidate value stored in the primary cache 120 exceeds a specified threshold.
  • After identifying the candidate value that satisfies the criterion, the cache management component 110 promotes the candidate value to the value cache 115, e.g., stores those multiple instances of the candidate value in the primary cache 120 as a single instance in the value cache 115. Further, the cache management component 110 also evicts those multiple instances of the candidate value from the primary cache 120 after the candidate value is promoted to the value cache 115.
  • the multiple instances of the candidate value in the primary cache 120 correspond to the candidate value stored at multiple addresses of the memory 125 .
  • a pointer that is associated with each of these multiple addresses and that points to a location in the value cache 115 at which the single instance is stored is added to the value cache 115 .
  • a memory address of the candidate value is retrieved from and/or derived from the information in the data access request. If the memory address is present in the value cache 115 , the memory address resolves to a specified pointer corresponding to the memory address.
  • the candidate value can be retrieved from the value cache 115 based on the specified pointer, which points to a location in the value cache 115 at which the single instance of the candidate value is stored.
  • the cache management component 110 can also demote or evict a specified value from the value cache 115 to the primary cache 120 if the specified value ceases to satisfy the criterion for being promoted to or stored at the value cache 115 .
  • the cache management component 110 can execute a read request for a candidate value at both the primary cache 120 and the value cache 115 , e.g., simultaneously, in parallel, or in a serial fashion.
  • the cache management component 110 is made up of multiple components, which perform discrete operations. For example, a first component (not illustrated) in the cache management component 110 can identify the multiple instances of the candidate value in the primary cache 120 . A second component (not illustrated) can promote the candidate value to the value cache 115 , and a third component (not illustrated) can serve a read request for the candidate value by obtaining the candidate value from the value cache 115 .
  • The value cache 115 is not hardware separate from the primary cache 120.
  • the cache management component 110 can allocate a portion of the primary cache 120 for forming the value cache 115 .
  • FIG. 2 is a block diagram of hierarchy 200 of a primary cache and value cache of the computer system of FIG. 1 , consistent with various embodiments.
  • the primary cache 120 can be made of multiple caches.
  • The primary cache 120 can include a first level cache “L1” 205, a second level cache “L2” 210, and a third level cache 215 as the last level cache “LLC.”
  • all these caches can be on-chip caches.
  • The “lower-level” caches, e.g., L1 cache 205, have a smaller number of blocks, a smaller block size, and fewer blocks in a set, but have very short access times compared to “higher-level” caches (e.g., L2 210 and above).
  • Cache-entry replacement policy is determined by a caching policy selected. That is, data is propagated from the memory 125 to these different levels based on caching policies.
  • the value cache 115 typically has the same hierarchy as the primary cache 120 .
  • The value cache 115 has a first level cache “VC1” 255 that has properties similar to those of L1 205, a second level cache “VC2” 260 that has properties similar to those of L2 210, and a third level cache “VLLC” 265 that has properties similar to those of LLC 215.
  • FIG. 3 is a block diagram of an example 300 illustrating movement of data from the primary cache to the value cache, consistent with various embodiments.
  • the primary cache 120 stores multiple instances of a first value “v1,” a second value “v2” and a third value “v3.”
  • The criterion for promoting a value from the primary cache 120 to the value cache 115 is that the number of instances of the value exceeds three. Accordingly, the cache management component 110 determines that the first value, “v1,” which occurs four times in the primary cache 120, satisfies the criterion for being promoted to the value cache 115.
  • the criterion can be set by a user.
  • the cache management component 110 promotes the first value to the value cache 115 , e.g., by storing a single instance of the first value in the value cache 115 .
  • the value cache 115 includes a value buffer 310 in which values can be stored.
  • the value buffer 310 is an array data structure.
  • the first value is stored in the value buffer 310 .
  • A number of pointers, each associated with one of the memory addresses storing the first value, e.g., “A,” “B,” “C,” and “D,” and each pointing to the location in the value buffer 310 that stores the first value, are also stored in the value cache 115, e.g., in a pointer array 305.
  • The cache management component 110 evicts the multiple instances of the first value from the primary cache 120, thereby making storage space available for storing a greater number of distinct values in the primary cache 120.
  • The amount of storage space consumed in storing a given set of data is reduced, as storing a pointer consumes less storage space than storing a copy of the value.
  • A greater number of distinct values can be stored in a primary cache of a given size, as at least some of the duplicate values are promoted to the value cache 115. That is, the storage space available in the primary cache 120 for storing distinct values is maximized.
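The storage argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the disclosure: the function name `bytes_saved` and the example sizes (a 64-byte value and a 2-byte pointer) are assumptions chosen for illustration, and the sketch ignores any tag or bookkeeping overhead.

```python
def bytes_saved(copies: int, value_bytes: int, pointer_bytes: int) -> int:
    """Bytes saved by replacing `copies` duplicates of a value with a
    single stored value plus one pointer per original copy.

    All sizes are hypothetical; tag/metadata overhead is ignored.
    """
    before = copies * value_bytes                  # every copy stored in full
    after = value_bytes + copies * pointer_bytes   # one value + n pointers
    return before - after


# Four copies of a 64-byte value, 2-byte pointers (assumed sizes):
# before = 4 * 64 = 256, after = 64 + 4 * 2 = 72
print(bytes_saved(4, 64, 2))
```

Under these assumed sizes the saving grows with the number of copies, while a value that occurs only once would cost slightly more to store in this form, which is consistent with promoting only values whose occurrence count exceeds a threshold.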
  • the cache management component 110 retrieves or otherwise derives the memory address from the data access request, resolves the memory address to a specified pointer in the pointer array 305 and obtains the value from the value buffer using the specified pointer.
  • the value cache 115 and the primary cache 120 are exclusive to each other, e.g., a candidate value can be stored in either the value cache 115 or the primary cache 120 .
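The relationship between the value buffer and the pointer array in the FIG. 3 example can be sketched as a small data structure. This is an illustrative model only: the class name `ValueCache`, its method names, and the use of a Python dictionary for the pointer array are assumptions, not the patent's implementation.

```python
class ValueCache:
    """Toy model: one stored copy per promoted value, one pointer per address."""

    def __init__(self):
        self.value_buffer = []   # single instances of promoted values
        self.pointer_array = {}  # memory address -> slot in value_buffer

    def promote(self, value, addresses):
        """Store `value` once and point every address in `addresses` at it."""
        if value in self.value_buffer:
            slot = self.value_buffer.index(value)
        else:
            self.value_buffer.append(value)
            slot = len(self.value_buffer) - 1
        for address in addresses:
            self.pointer_array[address] = slot

    def lookup(self, address):
        """Resolve an address to its value, or None on a miss."""
        slot = self.pointer_array.get(address)
        return None if slot is None else self.value_buffer[slot]


vc = ValueCache()
vc.promote("v1", ["A", "B", "C", "D"])  # v1 occurs four times, as in FIG. 3
print(vc.value_buffer)      # one stored copy
print(vc.lookup("C"))       # resolved through the pointer for address C
print(vc.lookup("E"))       # address not in the value cache: miss
```

All four addresses resolve to the same slot of the value buffer, so the value itself is stored only once.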
  • FIG. 4 is a block diagram illustrating an example 400 of executing a data access request in the value cache of FIG. 1 , consistent with various embodiments.
  • When a data access request is received, the processor 105 forwards the request to the cache management component 110.
  • the cache management component 110 retrieves or otherwise derives the memory address, e.g., input address 405 , at which the data is to be accessed from the data access request.
  • the processor 105 may itself retrieve or otherwise derive the input address 405 from the data access request and forward it to the cache management component 110 .
  • the input address 405 which is of a particular bit length, can be divided into three portions, a tag 406 , an index 407 and an offset 408 all of which together represent the specified memory address.
  • Each of the three portions can be of one or more bits.
  • the tag 406 is a unique identifier for a group of data. Because different regions of the memory 125 may be mapped into a cache line of the primary cache 120 , the tag 406 is used to differentiate between them.
  • the cache line can be a basic unit for cache storage and can store multiple bytes or words of data.
  • the index 407 can indicate at which cache line the data has been stored.
  • the offset 408 can indicate the offset of the data stored in the cache line indicated by the index.
  • a tag array 410 which can be an array data structure, has as many rows as the number of cache lines of the primary cache 120 , and each row typically stores the tags of the data stored in the corresponding cache line.
  • The cache management component 110 indexes into the tag array 410 to a specified row corresponding to the index 407 and obtains the tag stored in the specified row. The cache management component 110 then compares the tag 406 in the input address with the tag retrieved from the specified row. If there is a hit, e.g., the tag in the specified row matches the tag 406, the cache management component 110 retrieves a specified pointer from the pointer array 305 stored at a location corresponding to the matched tag. For example, if the matched tag is stored at row 2, then the specified pointer is retrieved from row 2 of the pointer array 305 at the offset 408. Finally, the data stored in the value buffer 310 at the location indicated by the specified pointer is retrieved and returned to the requesting entity.
  • The cache management component 110 performs a lookup in the primary cache 120 for the input address 405. If there is a hit in the primary cache 120, e.g., the input address 405 is available in the primary cache 120, the value is retrieved and returned to the requesting entity. However, if there is a miss in the primary cache 120, the request is serviced by accessing the memory 125.
  • the tag array 410 can be shared between the value cache 115 and the primary cache 120 . That is, both the value cache 115 and the primary cache 120 can use the tag array to serve the data access requests, e.g., to lookup the data in the corresponding cache. Further, while the data is maintained in a value buffer 310 and accessed using the pointer array 305 in the value cache 115 , in the primary cache 120 the data may be stored in a data array and accessed using the tag array 410 .
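The FIG. 4 lookup (split the input address into tag, index and offset, compare tags, then follow the pointer into the value buffer) can be sketched as follows. This is a simplified illustration under stated assumptions: the bit widths `OFFSET_BITS` and `INDEX_BITS`, the function names, and the single tag per row are hypothetical choices, since the description only says each portion can be one or more bits.

```python
OFFSET_BITS = 2  # hypothetical width of the offset portion
INDEX_BITS = 3   # hypothetical width of the index portion


def split_address(address):
    """Split an integer address into (tag, index, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def value_cache_lookup(address, tag_array, pointer_array, value_buffer):
    """Return the cached value for `address`, or None on a miss."""
    tag, index, offset = split_address(address)
    if tag_array[index] != tag:          # tag mismatch: miss in the value cache
        return None
    pointer = pointer_array[index][offset]
    if pointer is None:                  # no pointer stored for this offset
        return None
    return value_buffer[pointer]         # follow the pointer to the single copy


rows, slots = 1 << INDEX_BITS, 1 << OFFSET_BITS
tag_array = [None] * rows
pointer_array = [[None] * slots for _ in range(rows)]
value_buffer = ["v1"]

address = (11 << 5) | (2 << 2) | 1       # tag 11, index (row) 2, offset 1
tag_array[2] = 11
pointer_array[2][1] = 0                  # points at value_buffer[0]

print(split_address(address))
print(value_cache_lookup(address, tag_array, pointer_array, value_buffer))
```

On a miss in this sketch the caller would fall back to the primary cache and then to memory, as the surrounding text describes.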
  • FIG. 5 is a flow diagram of a process 500 for promoting a candidate value to the value cache, consistent with various embodiments.
  • the process 500 may be implemented in the environment 100 of FIG. 1 .
  • the process 500 begins at block 505 , and at block 510 , the cache management component 110 identifies multiple instances of the candidate value in the primary cache 120 .
  • The process of identifying can be executed based on a trigger, e.g., whenever a value is updated or written in the primary cache 120, when the storage consumed in the cache exceeds a specified threshold, or upon expiry of a specified time interval.
  • The cache management component 110 determines if the candidate value satisfies the criterion for promoting the candidate value to the value cache 115.
  • The criterion can take many forms, e.g., the number of instances, occurrences, or copies of a value exceeding a specified threshold.
  • the cache management component 110 promotes the candidate value to the value cache 115 (additional details of which are described at least with reference to FIG. 6 below).
  • the cache management component 110 evicts or deletes the multiple copies of the candidate value from the primary cache 120 .
  • The storage space of the primary cache 120 is maximized for caching a greater number of distinct values for a given cache size, which can improve the cache hit ratio and reduce cache pollution and memory bandwidth consumption.
  • The above process can minimize the need to increase the physical size of the primary cache 120 to store a greater number of distinct values.
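The identify, check and evict steps of process 500 can be sketched in a few lines. This is a hedged illustration, not the claimed method: the primary cache is modelled as a plain dictionary from address to value, and the helper names `find_candidates` and `evict` are hypothetical.

```python
from collections import defaultdict


def find_candidates(primary, threshold):
    """Return {value: [addresses]} for values occurring more than `threshold` times."""
    addresses_by_value = defaultdict(list)
    for address, value in primary.items():
        addresses_by_value[value].append(address)
    return {
        value: addresses
        for value, addresses in addresses_by_value.items()
        if len(addresses) > threshold
    }


def evict(primary, addresses):
    """Remove the given addresses from the primary cache after promotion."""
    for address in addresses:
        primary.pop(address, None)


# Hypothetical contents: value 45 is cached at four addresses, value 7 at two.
primary = {0x10: 45, 0x14: 45, 0x18: 45, 0x1C: 45, 0x20: 7, 0x24: 7}
candidates = find_candidates(primary, threshold=3)
for value, addresses in candidates.items():
    # Promotion to the value cache would happen here, before eviction.
    evict(primary, addresses)

print(candidates)
print(primary)
```

After the loop, only the entries that did not meet the criterion remain in the primary cache, which frees space for other distinct values.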
  • FIG. 6 is a flow diagram of a process 600 for promoting a candidate value to the value cache, consistent with various embodiments.
  • the process 600 may be implemented in the environment 100 of FIG. 1 , and can be executed as part of block 520 of process 500 of FIG. 5 .
  • the process 600 begins at block 605 , and at block 610 , the cache management component 110 stores a single instance or copy of the candidate value that has been promoted from the primary cache 120 in the value buffer 310 of the value cache 115 .
  • the cache management component 110 stores multiple pointers to the candidate value stored in the value buffer 310 in the pointer array 305 of the value cache 115 .
  • the cache management component 110 stores as many pointers as the number of copies of the candidate value stored in the primary cache 120 . Further, when a data access request is received for a specified memory address, the cache management component 110 can use the specified memory address to identify or determine a pointer in the pointer array 305 that corresponds to the specified memory address, e.g., as described at least with reference to FIG. 4 , and access the candidate value in the value buffer 310 pointed to by the pointer.
  • FIG. 7 is a flow diagram of a process 700 for processing a data access request in the computer system of FIG. 1 , consistent with various embodiments.
  • the process 700 can be implemented in the environment 100 of FIG. 1 .
  • the process 700 starts at block 705 , and at block 710 , the cache management component 110 receives a data access request.
  • the data access request can be received by the processor 105 from a client computer system, and the processor 105 can forward the request to the cache management component 110 .
  • the cache management component 110 executes the data access request at the value cache 115 and the primary cache 120 , e.g., simultaneously, in parallel or in serial. In some embodiments, the cache management component 110 retrieves or otherwise determines, e.g., from the information in the data access request, an input memory address at which the data is to be accessed.
  • the cache management component 110 determines whether there is a hit in the value cache 115 for the input memory address. The process of determining whether there is a hit in the value cache 115 is described at least with reference to FIGS. 3 and 4 . If there is a hit in the value cache 115 , at block 725 , the cache management component 110 proceeds to access the data in the value cache 115 , e.g., in the value buffer 310 , and the process 700 returns.
  • the cache management component 110 determines whether there is a hit in the primary cache 120 for the input memory address. If there is a hit in the primary cache 120 , the cache management component 110 proceeds to access the data in the primary cache 120 at block 725 . On the other hand, if there is a miss in the primary cache 120 , at block 735 , the cache management component 110 proceeds to access the data at the memory 125 .
  • If the data access request is a read request, the cache management component 110 returns the data accessed at block 725 to the requesting entity. If the data access request is a write/update request, the cache management component 110 performs the write/update and can acknowledge the requesting entity upon completion.
  • The cache management component 110 can further store the accessed data in the primary cache 120, e.g., based on a caching policy implemented by the computer system 150. For example, if the caching policy is a most-recently-used policy, then the value that is accessed in the memory 125 can be written to the primary cache 120.
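The lookup order of process 700 for a read (value cache, then primary cache, then memory) can be sketched as below. This is a simplified serial model under stated assumptions: each cache is a dictionary from address to value, the function name `read` and the `cache_on_miss` flag are hypothetical, and the parallel lookup mentioned in the text is not modelled.

```python
def read(address, value_cache, primary, memory, cache_on_miss=True):
    """Return (value, source) for a read of `address`.

    `value_cache`, `primary` and `memory` are dicts mapping address -> value.
    `cache_on_miss` stands in for a caching policy that stores recently
    used values in the primary cache.
    """
    if address in value_cache:               # hit in the value cache
        return value_cache[address], "value_cache"
    if address in primary:                   # miss in value cache, hit in primary
        return primary[address], "primary_cache"
    value = memory[address]                  # miss in both: go to memory
    if cache_on_miss:
        primary[address] = value             # optionally fill the primary cache
    return value, "memory"


value_cache = {"A": "v1"}
primary = {"E": "v2"}
memory = {"A": "v1", "E": "v2", "X": "v9"}

print(read("A", value_cache, primary, memory))
print(read("E", value_cache, primary, memory))
print(read("X", value_cache, primary, memory))
print(read("X", value_cache, primary, memory))  # now served from the primary cache
```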
  • FIG. 8 is a flow diagram of a process 800 for processing a data update request in the computer system of FIG. 1 , consistent with various embodiments.
  • the process 800 can be implemented in the environment 100 of FIG. 1 .
  • the process 800 starts at block 805 , and at block 810 , the cache management component 110 receives a data update request.
  • the cache management component 110 retrieves or otherwise determines, e.g., from the information in the data update request, an input memory address at which the data is to be updated or written.
  • the processor 105 can forward the input memory address to the cache management component 110 .
  • the cache management component 110 performs a lookup for the input memory address at the value cache 115 and the primary cache 120 , e.g., simultaneously, in parallel or in serial.
  • the cache management component 110 determines whether there is a hit in the value cache 115 for the input memory address. The process of determining whether there is a hit in the value cache 115 is described at least with reference to FIGS. 3 and 4 . If there is a hit in the value cache 115 , at block 825 , the cache management component 110 proceeds to update the data in the value cache 115 , e.g., in the value buffer 310 . After updating the current value to an updated value, e.g., specified by the data update request, at determination block 830 , the cache management component 110 determines if the updated value is to be demoted to the primary cache 120 .
  • The updated value is demoted to the primary cache 120 if the criterion for being stored at or promoted to the value cache 115 is no longer satisfied. For example, if the number of copies of the updated value falls below the specified threshold for being stored at the value cache 115, then the updated value is evicted from the value cache 115.
  • For example, consider that the value cache 115 is storing five instances of a candidate value, e.g., “45,” of the primary cache 120 in a compressed format, e.g., as a single instance of “45” and five pointers to that single instance, each pointer corresponding to an instance that was stored in the primary cache 120.
  • After one of these instances is updated, e.g., from “45” to “50,” the cache management component 110 determines if the updated value “50” satisfies the criterion to be stored in the value cache 115, e.g., if the number of instances of the value “50” exceeds the specified threshold for being stored at the value cache 115. If the updated value satisfies the criterion for being stored at the value cache 115, then the updated value is not demoted to the primary cache 120, and the process 800 returns.
  • If the updated value does not satisfy the criterion, the cache management component 110 evicts the updated value to the primary cache 120.
  • The pointer corresponding to the input memory address at which the updated value is stored is also deleted from the value cache 115.
  • the cache management component 110 determines whether there is a hit in the primary cache 120 for the input memory address. If there is a hit in the primary cache 120 , at block 845 , the cache management component 110 updates the data in the primary cache 120 . In some embodiments, after updating the primary cache 120 , at determination block 850 , the cache management component 110 determines if the updated value is to be promoted to the value cache 115 .
  • a criterion for promoting the candidate value to the value cache 115 is that a number of instances of the candidate value exceeds a specified threshold, e.g., “4.” Further, consider that the primary cache 120 stores four instances of a value, e.g., “45.” Upon updating an instance of a value in the primary cache 120, e.g., from “50” to “45,” the number of instances of the value “45” increases to “5” and therefore the updated value “45” satisfies the criterion for being promoted to the value cache 115.
  • the cache management component 110 promotes the updated value to the value cache 115 if the updated value satisfies the criterion, and at block 860, evicts the instances of the updated value from the primary cache. If the updated value does not satisfy the criterion for being promoted to the value cache 115, the process 800 returns from determination block 850.
  • the cache management component 110 proceeds to update the data at memory 125.
  • FIG. 9 is a block diagram of a computer system as may be used to implement features of the disclosed embodiments.
  • the computing system 900 may be used to implement any of the entities, components, modules, systems, or services depicted in the examples of the foregoing figures (and any other entities described in this specification).
  • the computing system 900 may include one or more central processing units (“processors”) 905, memory 910, input/output devices 925 (e.g., keyboard and pointing devices, display devices), storage devices 920 (e.g., disk drives), and network adapters 930 (e.g., network interfaces) that are connected to an interconnect 915.
  • memory 910 volatile and non-volatile memory
  • the interconnect 915 is illustrated as an abstraction that represents any one or more separate physical buses, point-to-point connections, or both, connected by appropriate bridges, adapters, or controllers.
  • the interconnect 915 may include, for example, a system bus, a Peripheral Component Interconnect (PCI) bus or PCI-Express bus, a HyperTransport or industry standard architecture (ISA) bus, a small computer system interface (SCSI) bus, a universal serial bus (USB), IIC (I2C) bus, or an Institute of Electrical and Electronics Engineers (IEEE) standard 1394 bus, also called “Firewire”.
  • the memory 910 and storage devices 920 are computer-readable storage media that may store instructions that implement at least portions of the described embodiments.
  • the data structures and message structures may be stored or transmitted via a data transmission medium, such as a signal on a communications link.
  • Various communications links may be used, such as the Internet, a local area network, a wide area network, or a point-to-point dial-up connection.
  • computer readable media can include computer-readable storage media (e.g., “non transitory” media).
  • the instructions stored in memory 910 can be implemented as software and/or firmware to program the processor(s) 905 to carry out actions described above.
  • such software or firmware may be initially provided to the computing system 900 by downloading it from a remote system through the computing system 900 (e.g., via network adapter 930).
  • programmable circuitry e.g., one or more microprocessors
  • special-purpose hardwired circuitry may be in the form of, for example, one or more ASICs, PLDs, FPGAs, etc.
  • references in this specification to “one embodiment” or “an embodiment” mean that a specified feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure.
  • the appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
  • various features are described which may be exhibited by some embodiments and not by others.
  • various requirements are described which may be requirements for some embodiments but not for other embodiments.

Abstract

The disclosure is directed to a cache management system (“system”) for storing data in a cache of a computer system in a compact form. The system identifies multiple instances of a candidate value stored in a primary cache and stores those instances as a single value in a value cache. The system stores multiple pointers to the candidate value in the value cache, e.g., as many as the number of occurrences of the candidate value in the primary cache, each of which corresponds to an address in a memory of the computer system storing an instance of the primary cache candidate value. By storing multiple instances of the candidate value as a single instance, the system reduces cache pollution and memory bandwidth consumption and allows a larger number of distinct values to be stored in a primary cache of a given size.

Description

    BACKGROUND
  • Caching is a process in which data is stored in a central processing unit (CPU) cache of a computer to reduce the average cost, e.g., time or other computing resources, of accessing data from the main memory. The cache is a smaller, faster memory that stores copies of the data from frequently used main memory locations. Cache management is a process that controls movement of data between the cache and the main memory. The greater the number of cache hits, the lower the average cost of accessing the data. Cache hits can be maximized by storing all or most of the data from the memory in the cache. However, cache memory is very expensive and the cost of the cache increases exponentially with the size of the cache. Having a cache that can store all or most of the data from the memory may not be practical. Accordingly, cache management processes use various caching policies to selectively store the data that is requested most often, so as to maximize the cache hits.
  • Many of the prior cache management processes are inefficient because the number of cache hits is not maximized. For example, prior cache management systems cache data items that are more frequently accessed. At least some of these data items can be duplicates. That is, the cache can store multiple copies of a data item. For example, if a cache management process determines that memory locations, e.g., “address 1” and “address 2,” are more frequently accessed and have to be cached, the values of “address 1” and “address 2” are cached regardless of whether those addresses store the same or different values. By storing duplicate values, the amount of cache storage available for storing distinct values is reduced, which can reduce the number of cache hits. Further, it can also increase cache pollution and memory bandwidth consumption. These problems are further multiplied in a datacenter scenario where a number of server computing devices are installed for serving data access requests from a number of client computing devices. Having an inefficient cache management process can increase the read and/or write latency of the entire datacenter and/or the average cost, e.g., time or other computing resources, to access the data stored at the datacenter.
    BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram of an environment in which the disclosed embodiments may be implemented.
  • FIG. 2 is a block diagram of hierarchy of a primary cache and value cache of the computer system of FIG. 1, consistent with various embodiments.
  • FIG. 3 is a block diagram illustrating an example movement of data between the primary cache and the value cache, consistent with various embodiments.
  • FIG. 4 is a block diagram illustrating an example of executing a data access request in the value cache of FIG. 1, consistent with various embodiments.
  • FIG. 5 is a flow diagram of a process for promoting a candidate value to the value cache, consistent with various embodiments.
  • FIG. 6 is a flow diagram of a process for storing a promoted candidate value in the value cache, consistent with various embodiments.
  • FIG. 7 is a flow diagram of a process for processing a data access request in the computer system of FIG. 1, consistent with various embodiments.
  • FIG. 8 is a flow diagram of a process for processing a data update request in the computer system of FIG. 1, consistent with various embodiments.
  • FIG. 9 is a block diagram of a processing system that can implement operations, consistent with various embodiments.
    DETAILED DESCRIPTION
  • Embodiments are directed to a cache management system for storing data in a cache of a computer system in a compact form. The cache management system compresses multiple cache blocks of the primary cache storing the same value into a single cache block storing that value. In one such embodiment, the cache management system identifies multiple instances or occurrences of a candidate value stored in the primary cache and stores those multiple instances of the primary cache candidate value as a single value, thereby making more space available in the primary cache for storing distinct values. By storing multiple instances of the primary cache candidate value as a single instance and making storage space available for a larger number of distinct values, the cache management system reduces cache pollution and memory bandwidth consumption. In addition, such compression of data in the cache can minimize the amount of silicon real estate needed for storing a given amount of data. In other words, the cache management system facilitates storing more values, e.g., distinct values, in a primary cache of a given size.
  • The cache management system includes a value cache that stores values that occur multiple times in the primary cache. The cache management system identifies a candidate value in the primary cache that satisfies a specified criterion, e.g., the number of occurrences of the candidate value exceeding a specified threshold, for being promoted to the value cache, and then stores the candidate value as a single instance in a value buffer of the value cache. Further, the cache management system also stores multiple pointers to the candidate value in the value cache, in which each of the pointers corresponds to an address in a memory of the computer system storing an instance of the candidate value. The cache management system can store as many pointers as the number of occurrences of the candidate value in the primary cache. The pointers can be stored in a pointer array of the value cache.
  • In some embodiments, the value cache has the same hierarchy as the primary cache, e.g., if the primary cache has L1, L2, L3 hierarchy, then the value cache also has the same hierarchy. The cache management system can allocate at least a portion of the primary cache to form the value cache. The value cache and the primary cache can be exclusive to each other, e.g., a value is either stored in the value cache or the primary cache.
  • When a read request arrives at the computer system for reading a specified value at a specified memory address, the cache management system can execute a read operation at both the value cache and the primary cache, e.g., in parallel. If there is a hit for the specified memory address in the value cache, the specified value is returned from the value cache. If there is a miss in the value cache and a hit in the primary cache, the specified value is returned from the primary cache. If there is a miss in both the value cache and the primary cache, then the specified value is obtained from the memory. In some embodiments, the specified value may be written to the primary cache after being obtained from the memory depending on caching policies used, e.g., a cache policy for storing most recently used value.
  • When a write request arrives at the computer system for updating or writing a specified value to a specified memory address, the cache management system can determine if there is a hit for the specified address in the value cache (which means the write operation is updating the current value stored at the specified memory address), and if there is a hit, the cache management system can determine if the specified value is different from the current value stored in the cache. If the specified value is different from the current value, the specified memory address is evicted from the value cache and the specified value is written to the memory at the specified memory address. Once the specified value is written to the memory, the specified value may also be written to the primary cache based on the caching policies used, e.g., a cache policy for storing most recently used value. If the specified value is the same as the current value, the specified memory address is added to the value cache if not already existing.
  • Referring back to determining whether there is a hit for the specified memory address in the value cache, if there is no hit for the specified memory address in the value cache, the cache management system determines if there is a hit in the primary cache. If there is a hit in the primary cache, the current value in the primary cache is updated with the specified value, which can be further pushed to the memory based on cache eviction policies. If there is a hit in neither the primary cache nor the value cache, the specified value is written to the specified memory address in the memory. After writing the specified value to the memory, the specified value can also be written to the primary cache based on the caching policies implemented in the computer system. Referring back to updating the primary cache with the specified value, in some embodiments, the cache management system can determine if there are multiple instances of the specified value in the primary cache and, if there are multiple instances, determine if the specified value satisfies the criterion for being promoted to the value cache. If the specified value satisfies the criterion for being promoted to the value cache, the specified value is promoted to the value cache and all the instances of the specified value in the primary cache are evicted, thereby making storage space available for a larger number of distinct values.
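  • The write handling described in the two paragraphs above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the claimed implementation: the function name `write`, the dictionary-based caches, the returned status strings and the threshold of 3 are all hypothetical, and the optional policy-driven fill of the primary cache after a memory write is left out.

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # assumed: promote when a value occurs more than 3 times


def write(addr, value, value_cache, primary, memory):
    """Apply one write and report which store handled it.

    value_cache: dict address -> value (stands in for pointer array + value buffer)
    primary:     dict address -> value (primary cache contents)
    memory:      dict address -> value (backing memory)
    """
    if addr in value_cache:
        if value_cache[addr] == value:
            return "value-cache hit, unchanged"
        # The new value differs: drop this address from the value cache
        # and write the new value to memory.
        del value_cache[addr]
        memory[addr] = value
        return "evicted from value cache, written to memory"
    if addr in primary:
        primary[addr] = value
        # Check whether the updated value now occurs often enough to promote.
        count = Counter(primary.values())[value]
        if count > PROMOTION_THRESHOLD:
            for a in [a for a, v in primary.items() if v == value]:
                value_cache[a] = primary.pop(a)
            return "primary hit, promoted to value cache"
        return "primary hit"
    # Miss in both caches: write straight to memory.
    memory[addr] = value
    return "written to memory"
```

  • For instance, with a primary cache holding “45” at three addresses and “50” at a fourth, writing “45” to the fourth address brings the count to four, which exceeds the assumed threshold, so all four addresses move to the value cache.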
  • Turning now to the figures, FIG. 1 is a block diagram of an environment 100 in which the disclosed embodiments may be implemented. The environment 100 includes a computer system 150 in which the disclosed embodiments can be implemented. The computer system 150 can be any type of computing device in which a cache can be implemented, e.g., a server computing device, a desktop, a laptop, a tablet PC, a smartphone, and a wearable computing device. The computer system 150 includes a processor 105 that executes various operations of the computer system 150, e.g., read and/or write operations. The computer system 150 includes a memory 125, which can store data. The data can be received from another entity, e.g., another computer system (not illustrated), retrieved from a secondary storage system of the computer system 150, e.g., a disk drive (not illustrated) and/or generated by the computer system 150. The memory 125 can be a random access memory (RAM) or a variation thereof.
  • The computer system 150 includes (a) a primary cache 120 that caches data, e.g., from the memory 125, based on various caching policies, and (b) a value cache 115 that stores a candidate value from the primary cache 120 in a compact and/or compressed form. The read and/or write latencies of the primary cache 120 are typically lower than those of the memory 125. The primary cache 120 can be made up of multiple caches that are organized into various hierarchies, an example of which is illustrated in FIG. 2 (described below in detail). In some embodiments, the hierarchy of the value cache 115 is the same as the hierarchy of the primary cache 120, e.g., as illustrated in FIG. 2.
  • The computer system 150 includes a cache management component 110 that among various other operations manages movement of data between the primary cache 120 and the value cache 115. The cache management component 110 identifies a candidate value in the primary cache 120 that satisfies the criterion for being promoted to the value cache 115 and promotes it to the value cache 115. The criterion can be based on a number of instances or copies of the candidate value being stored in the primary cache 120. For example, the criterion can be that the number of instances of the candidate value stored in the primary cache 120 exceeds a specified threshold. After identifying the candidate value that satisfies the criterion, the cache management component 110 promotes the candidate value to the value cache 115, e.g., stores those multiple instances of the candidate value in the primary cache as a single instance in the value cache 115. Further, the cache management component 110 also evicts those multiple instances of the candidate value from the primary cache 120 after the candidate value is promoted to the value cache 115.
  • The multiple instances of the candidate value in the primary cache 120 correspond to the candidate value stored at multiple addresses of the memory 125. When the multiple instances of the candidate value are stored as a single instance of the candidate value in the value cache 115, a pointer is added to the value cache 115 for each of these multiple addresses, and each pointer points to the location in the value cache 115 at which the single instance is stored. When a data access request is received for the candidate value, a memory address of the candidate value is retrieved from and/or derived from the information in the data access request. If the memory address is present in the value cache 115, the memory address resolves to a specified pointer corresponding to the memory address. The candidate value can be retrieved from the value cache 115 based on the specified pointer, which points to a location in the value cache 115 at which the single instance of the candidate value is stored.
  • Like promoting a candidate value from the primary cache 120 to the value cache 115, the cache management component 110 can also demote or evict a specified value from the value cache 115 to the primary cache 120 if the specified value ceases to satisfy the criterion for being promoted to or stored at the value cache 115.
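  • A minimal sketch of such a demotion check, assuming the criterion is a simple instance count, might look like this. The function name `demote_if_needed`, the dictionary representation and the `threshold` parameter are illustrative assumptions and do not appear in the specification.

```python
def demote_if_needed(value, pointers, primary, threshold):
    """Move `value` back to the primary cache when it no longer qualifies.

    pointers: dict address -> value shared through the value cache
    primary:  dict address -> value held in the primary cache
    Returns True when the value was demoted.
    """
    holders = [a for a, v in pointers.items() if v == value]
    if len(holders) > threshold:
        return False  # still satisfies the (assumed) count criterion
    for addr in holders:
        # Each address gets its own copy in the primary cache again.
        primary[addr] = pointers.pop(addr)
    return True
```

  • With a threshold of four, a value shared by five addresses stays in the value cache, but once one of those addresses is removed the remaining four no longer exceed the threshold and are moved back.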
  • In some embodiments, the cache management component 110 can execute a read request for a candidate value at both the primary cache 120 and the value cache 115, e.g., simultaneously, in parallel, or in a serial fashion. In some embodiments, the cache management component 110 is made up of multiple components, which perform discrete operations. For example, a first component (not illustrated) in the cache management component 110 can identify the multiple instances of the candidate value in the primary cache 120. A second component (not illustrated) can promote the candidate value to the value cache 115, and a third component (not illustrated) can serve a read request for the candidate value by obtaining the candidate value from the value cache 115.
  • In some embodiments, the value cache 115 is not separate hardware from the primary cache 120. The cache management component 110 can allocate a portion of the primary cache 120 for forming the value cache 115.
  • FIG. 2 is a block diagram of hierarchy 200 of a primary cache and value cache of the computer system of FIG. 1, consistent with various embodiments. The primary cache 120 can be made of multiple caches. For example, the primary cache 120 can include a first level cache “L1” 205, a second level cache “L2” 210 and a third level cache 215 as the last level cache “LLC.” In some embodiments, all these caches can be on-chip caches. Typically, the “lower-level” caches (e.g., L1 cache 205) have a smaller number of blocks, smaller block size, and fewer blocks in a set, but have very short access times compared to “higher-level” caches (e.g., L2 210 and above). The higher-level caches have progressively larger numbers of blocks, larger block size, more blocks in a set, and relatively longer access times, but are still much faster than the memory 125. The cache-entry replacement policy is determined by the caching policy selected. That is, data is propagated from the memory 125 to these different levels based on caching policies.
  • The value cache 115 typically has the same hierarchy as the primary cache 120. For example, the value cache 115 has a first level cache “VC1” 255 that has properties similar to those of L1 205, a second level cache “VC2” 260 that has properties similar to those of L2 210, and a third level cache “VLLC” 265 that has properties similar to those of LLC 215.
  • FIG. 3 is a block diagram of an example 300 illustrating movement of data from the primary cache to the value cache, consistent with various embodiments. In the example 300, the primary cache 120 stores multiple instances of a first value “v1,” a second value “v2” and a third value “v3.” Consider that the criterion for promoting a value from the primary cache 120 to the value cache 115 is the number of instances of a value exceeding three. Accordingly, the cache management component 110 determines that the first value, “v1,” which occurs four times in the primary cache 120, satisfies the criterion for being promoted to the value cache 115. In some embodiments, the criterion can be set by a user. The cache management component 110 promotes the first value to the value cache 115, e.g., by storing a single instance of the first value in the value cache 115.
  • In some embodiments, the value cache 115 includes a value buffer 310 in which values can be stored. In some embodiments, the value buffer 310 is an array data structure. When the first value is promoted to the value cache 115, the first value is stored in the value buffer 310. Further, a number of pointers, each associated with one of the memory addresses storing the first value, e.g., “A,” “B,” “C,” and “D,” and each pointing to the location in the value buffer 310 that stores the first value, are also stored in the value cache 115, e.g., in a pointer array 305. After the first value and the corresponding pointers to the first value are stored in the value cache 115, the cache management component 110 evicts the multiple instances of the first value from the primary cache 120, thereby making storage space available for a larger number of distinct values in the primary cache 120.
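  • The value buffer and pointer array of the FIG. 3 example can be modeled with a small sketch such as the one below. The class name `ValueCache`, its method names, and the use of a Python list and dictionary are assumptions made for illustration; the instance counts for “v2” and “v3” are likewise invented, since the specification only says they occur multiple times.

```python
class ValueCache:
    """Toy value cache: one stored copy per value, one pointer per address."""

    def __init__(self):
        self.value_buffer = []   # single instances of promoted values
        self.pointer_array = {}  # memory address -> index into value_buffer

    def promote(self, value, addresses):
        """Store `value` once and point every address at that single copy."""
        if value in self.value_buffer:
            slot = self.value_buffer.index(value)
        else:
            self.value_buffer.append(value)
            slot = len(self.value_buffer) - 1
        for addr in addresses:
            self.pointer_array[addr] = slot

    def lookup(self, addr):
        """Return (hit, value) for a memory address."""
        slot = self.pointer_array.get(addr)
        if slot is None:
            return False, None
        return True, self.value_buffer[slot]


# "v1" is cached for addresses A, B, C and D, as in the example above.
primary_cache = {"A": "v1", "B": "v1", "C": "v1", "D": "v1",
                 "E": "v2", "F": "v2", "G": "v3"}
value_cache = ValueCache()
promoted = [a for a, v in primary_cache.items() if v == "v1"]
value_cache.promote("v1", promoted)
for addr in promoted:
    del primary_cache[addr]  # evict the duplicates from the primary cache
```

  • After this runs, the value buffer holds one copy of “v1,” the pointer array holds four entries that all refer to it, and the primary cache keeps only the entries for “v2” and “v3.”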
  • By storing a number of pointers to a single instance of a specified value rather than storing multiple instances of the same value, the amount of storage space consumed in storing a given set of data is reduced, as a pointer consumes less storage space than a copy of the value. Further, a larger number of distinct values can be stored in a primary cache of a given size, as at least some of the duplicate values are promoted to the value cache 115. That is, more of the storage space in the primary cache 120 is available for storing distinct values.
  • When a data access request is received, the cache management component 110 retrieves or otherwise derives the memory address from the data access request, resolves the memory address to a specified pointer in the pointer array 305 and obtains the value from the value buffer 310 using the specified pointer. As mentioned above, the value cache 115 and the primary cache 120 are exclusive to each other, e.g., a candidate value can be stored in either the value cache 115 or the primary cache 120, but not both.
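  • As a rough worked example of this saving, assume (these symbols and sizes are illustrative and not taken from the specification) that a value occupies $s_v$ bytes, a pointer occupies $s_p$ bytes, and the value occurs $N$ times. Tag and other bookkeeping overhead is ignored.

```latex
\begin{align*}
\text{primary cache only:} \quad & N\, s_v \\
\text{value cache:} \quad & s_v + N\, s_p \\
\text{bytes saved:} \quad & N\, s_v - (s_v + N\, s_p) = (N-1)\, s_v - N\, s_p
\end{align*}
% Example with assumed sizes: N = 4, s_v = 8, s_p = 1
% 4 * 8 = 32 bytes versus 8 + 4 * 1 = 12 bytes, a saving of 20 bytes.
```

  • Under these assumptions the saving is positive only when $s_p < \frac{N-1}{N}\, s_v$, which is one way to see why a promotion criterion based on a minimum number of instances is useful.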
  • FIG. 4 is a block diagram illustrating an example 400 of executing a data access request in the value cache of FIG. 1, consistent with various embodiments. When a data access request is received at the computer system 150, the processor 105 forwards the request to the cache management component 110. The cache management component 110 retrieves or otherwise derives the memory address, e.g., input address 405, at which the data is to be accessed from the data access request. In some embodiments, the processor 105 may itself retrieve or otherwise derive the input address 405 from the data access request and forward it to the cache management component 110. The input address 405, which is of a particular bit length, can be divided into three portions, a tag 406, an index 407 and an offset 408 all of which together represent the specified memory address. Each of the three portions can be of one or more bits. In some embodiments, the tag 406 is a unique identifier for a group of data. Because different regions of the memory 125 may be mapped into a cache line of the primary cache 120, the tag 406 is used to differentiate between them. The cache line can be a basic unit for cache storage and can store multiple bytes or words of data. The index 407 can indicate at which cache line the data has been stored. The offset 408 can indicate the offset of the data stored in the cache line indicated by the index. A tag array 410, which can be an array data structure, has as many rows as the number of cache lines of the primary cache 120, and each row typically stores the tags of the data stored in the corresponding cache line.
  • To determine whether the input address 405 exists in the value cache 115, the cache management component 110 indexes into the tag array 410 to a specified row corresponding to the index 407 and obtains the tag stored in the specified row. The cache management component 110 then compares the tag 406 in the input address with the tag retrieved from the specified row. If there is a hit, e.g., the tag in the specified row matches the tag 406, the cache management component 110 retrieves a specified pointer from the pointer array 305 stored at a location corresponding to the matched tag. For example, if the matched tag is stored at row 2, then the specified pointer is retrieved from row 2 of the pointer array 305 at the position indicated by the offset 408. Finally, the data stored in the value buffer 310 at a location indicated by the specified pointer is retrieved and returned to the requesting entity.
  • Referring back to comparing the retrieved tag with the tag 406, if there is a miss, e.g., the tag in the specified row does not match the tag 406, the cache management component 110 performs a lookup in the primary cache 120 for the input address 405. If there is a hit in the primary cache 120, e.g., the input address 405 is available in the primary cache 120, the value is retrieved and returned to the requesting entity. However, if there is a miss in the primary cache 120, the request is serviced by accessing the memory 125.
  • In some embodiments, the tag array 410 can be shared between the value cache 115 and the primary cache 120. That is, both the value cache 115 and the primary cache 120 can use the tag array to serve the data access requests, e.g., to look up the data in the corresponding cache. Further, while the data is maintained in a value buffer 310 and accessed using the pointer array 305 in the value cache 115, in the primary cache 120 the data may be stored in a data array and accessed using the tag array 410.
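  • The tag, index and offset handling described above can be illustrated with the following sketch. The field widths (`OFFSET_BITS`, `INDEX_BITS`), the one-tag-per-row layout and the function names are assumptions chosen to keep the example small; the specification does not fix any of them.

```python
OFFSET_BITS = 2  # assumed field widths; the source does not fix them
INDEX_BITS = 3


def split_address(addr):
    """Split an integer address into (tag, index, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def value_cache_lookup(addr, tag_array, pointer_array, value_buffer):
    """Return the cached value on a hit, or None on a miss."""
    tag, index, offset = split_address(addr)
    if tag_array[index] != tag:
        return None  # miss: fall back to the primary cache, then memory
    pointer = pointer_array[index][offset]
    if pointer is None:
        return None  # row matches, but no pointer is stored at this offset
    return value_buffer[pointer]


rows = 1 << INDEX_BITS
tag_array = [None] * rows
pointer_array = [[None] * (1 << OFFSET_BITS) for _ in range(rows)]
value_buffer = [45]

# Install address 0b101101001 (tag 0b1011, index 0b010, offset 0b01)
# so that it points at slot 0 of the value buffer.
addr = 0b101101001
tag, index, offset = split_address(addr)
tag_array[index] = tag
pointer_array[index][offset] = 0
```

  • Here the address resolves to row 2 of the tag array; because the stored tag matches, the pointer at offset 1 of row 2 is followed into the value buffer.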
  • FIG. 5 is a flow diagram of a process 500 for promoting a candidate value to the value cache, consistent with various embodiments. In some embodiments, the process 500 may be implemented in the environment 100 of FIG. 1. The process 500 begins at block 505, and at block 510, the cache management component 110 identifies multiple instances of the candidate value in the primary cache 120. In some embodiments, the process of identifying can be executed based on a trigger, e.g., whenever a value is updated or written in the primary cache 120, when the storage consumed in the cache exceeds a specified threshold, or upon expiry of a specified time interval.
  • At block 515, the cache management component 110 determines if the candidate value satisfies the criterion for promoting the candidate value to the value cache. Various criteria can be used, e.g., a number of instances or occurrences or copies of a value exceeding a specified threshold.
  • At block 520, if the promotion criterion is satisfied, the cache management component 110 promotes the candidate value to the value cache 115 (additional details of which are described at least with reference to FIG. 6 below).
  • At block 525, the cache management component 110 evicts or deletes the multiple copies of the candidate value from the primary cache 120. In some embodiments, by evicting multiple copies of the same value, more of the storage space of the primary cache 120 becomes available for caching distinct values for a given cache size, which can improve the cache hit ratio and reduce cache pollution and memory bandwidth consumption. In addition, the above process can minimize the need to increase the physical size of the primary cache 120 to store a larger number of distinct values.
  • FIG. 6 is a flow diagram of a process 600 for storing a promoted candidate value in the value cache, consistent with various embodiments. In some embodiments, the process 600 may be implemented in the environment 100 of FIG. 1, and can be executed as part of block 520 of process 500 of FIG. 5. The process 600 begins at block 605, and at block 610, the cache management component 110 stores a single instance or copy of the candidate value that has been promoted from the primary cache 120 in the value buffer 310 of the value cache 115. At block 615, the cache management component 110 stores multiple pointers to the candidate value stored in the value buffer 310 in the pointer array 305 of the value cache 115. In some embodiments, the cache management component 110 stores as many pointers as the number of copies of the candidate value stored in the primary cache 120. Further, when a data access request is received for a specified memory address, the cache management component 110 can use the specified memory address to identify or determine a pointer in the pointer array 305 that corresponds to the specified memory address, e.g., as described at least with reference to FIG. 4, and access the candidate value in the value buffer 310 pointed to by the pointer.
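  • Blocks 510 through 525 can be sketched as a single pass over the primary cache. The function name `promote_duplicates`, the dictionary and list representations, and the count-based criterion are assumptions made for illustration; the block numbers in the comments only indicate which step of process 500 each line loosely corresponds to.

```python
from collections import Counter


def promote_duplicates(primary, value_buffer, pointer_array, threshold):
    """Move values occurring more than `threshold` times out of `primary`.

    primary:       dict address -> value
    value_buffer:  list holding one copy of each promoted value
    pointer_array: dict address -> index into value_buffer
    Returns the list of promoted values.
    """
    counts = Counter(primary.values())      # block 510: count instances
    promoted = []
    for value, n in counts.items():
        if n <= threshold:                  # block 515: criterion check
            continue
        value_buffer.append(value)          # block 520: store one copy
        slot = len(value_buffer) - 1
        for addr in [a for a, v in primary.items() if v == value]:
            pointer_array[addr] = slot      # one pointer per address
            del primary[addr]               # block 525: evict duplicates
        promoted.append(value)
    return promoted
```

  • A real implementation would run such a pass on one of the triggers mentioned earlier, e.g., after a write to the primary cache, rather than scanning the whole cache unconditionally.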
  • FIG. 7 is a flow diagram of a process 700 for processing a data access request in the computer system of FIG. 1, consistent with various embodiments. The process 700 can be implemented in the environment 100 of FIG. 1. The process 700 starts at block 705, and at block 710, the cache management component 110 receives a data access request. In some embodiments, the data access request can be received by the processor 105 from a client computer system, and the processor 105 can forward the request to the cache management component 110.
  • At block 715, the cache management component 110 executes the data access request at the value cache 115 and the primary cache 120, e.g., simultaneously, in parallel or in serial. In some embodiments, the cache management component 110 retrieves or otherwise determines, e.g., from the information in the data access request, an input memory address at which the data is to be accessed.
  • At determination block 720, the cache management component 110 determines whether there is a hit in the value cache 115 for the input memory address. The process of determining whether there is a hit in the value cache 115 is described at least with reference to FIGS. 3 and 4. If there is a hit in the value cache 115, at block 725, the cache management component 110 proceeds to access the data in the value cache 115, e.g., in the value buffer 310, and the process 700 returns.
  • On the other hand, if there is a miss in the value cache 115, at determination block 730, the cache management component 110 determines whether there is a hit in the primary cache 120 for the input memory address. If there is a hit in the primary cache 120, the cache management component 110 proceeds to access the data in the primary cache 120 at block 725. On the other hand, if there is a miss in the primary cache 120, at block 735, the cache management component 110 proceeds to access the data at the memory 125.
  • In some embodiments, if the data access request is a read request, the cache management component 110 returns the data accessed at block 725 to the requesting entity. If the data access request is a write/update request, the cache management component 110 performs the write/update and can acknowledge the requesting entity upon completion of the same.
  • Further, if the data is accessed at the memory 125, e.g., read from and/or written to the memory 125, the cache management component 110 can further store the accessed data in the primary cache 120, e.g., based on a caching policy implemented by the computer system 150. For example, if the caching policy is based on a most-recently-used concept, then the value that is accessed in the memory 125 can be written to the primary cache 120.
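  • The lookup order of process 700 (value cache first, then primary cache, then memory, with an optional fill of the primary cache) can be sketched as below. This is a serial, simplified illustration under stated assumptions: the function name `read`, the dictionary-based caches, and the unconditional fill on a memory access are hypothetical, and capacity limits and replacement are not modeled.

```python
def read(addr, value_buffer, pointer_array, primary, memory):
    """Return (value, source) for a read of `addr`.

    value_buffer / pointer_array model the value cache; primary and memory
    are dicts keyed by address. All of these are illustrative stand-ins.
    """
    # Value cache hit: follow the pointer into the value buffer.
    if addr in pointer_array:
        return value_buffer[pointer_array[addr]], "value_cache"
    # Value cache miss, primary cache hit.
    if addr in primary:
        return primary[addr], "primary_cache"
    # Miss in both caches: access memory.
    value = memory[addr]
    # Assumed caching policy: keep the most recently accessed value.
    primary[addr] = value
    return value, "memory"
```

A repeated read of an address that first missed in both caches would then be served from the primary cache in this sketch.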
  • FIG. 8 is a flow diagram of a process 800 for processing a data update request in the computer system of FIG. 1, consistent with various embodiments. The process 800 can be implemented in the environment 100 of FIG. 1. The process 800 starts at block 805, and at block 810, the cache management component 110 receives a data update request. In some embodiments, the cache management component 110 retrieves or otherwise determines, e.g., from the information in the data update request, an input memory address at which the data is to be updated or written. In some embodiments, the processor 105 can forward the input memory address to the cache management component 110. At block 815, the cache management component 110 performs a lookup for the input memory address at the value cache 115 and the primary cache 120, e.g., simultaneously, in parallel or in serial.
  • At determination block 820, the cache management component 110 determines whether there is a hit in the value cache 115 for the input memory address. The process of determining whether there is a hit in the value cache 115 is described at least with reference to FIGS. 3 and 4. If there is a hit in the value cache 115, at block 825, the cache management component 110 proceeds to update the data in the value cache 115, e.g., in the value buffer 310. After updating the current value to an updated value, e.g., specified by the data update request, at determination block 830, the cache management component 110 determines if the updated value is to be demoted to the primary cache 120. In some embodiments, the updated value is demoted to the primary cache 120 if the criterion for being stored or promoted to the value cache 115 is no longer satisfied. For example, if the number of copies of the updated value falls below the specified threshold for being stored at the value cache 115, then the updated value is evicted from the value cache 115. Consider that the value cache 115 is storing five instances of a candidate value, e.g., “45” of the primary cache 120, in a compressed format, e.g., as a single instance of “45” and multiple pointers, e.g., five pointers, each corresponding to an instance that was stored in the primary cache 120, pointing to the single instance. Upon updating an instance of the candidate value in the value cache 115, e.g., to “50,” the cache management component 110 determines if the updated value “50” satisfies the criterion to be stored in the value cache 115, e.g., if the number of instances of the value “50” exceeds the specified threshold for being stored at the value cache 115. If the updated value satisfies the criterion for being stored at the value cache 115, then the updated value is not demoted to the primary cache 120, and the process 800 returns. On the other hand, if the updated value does not satisfy the criterion for being stored at the value cache 115, at block 835, the cache management component 110 evicts the updated value to the primary cache 120. In some embodiments, the pointer corresponding to the input memory address at which the updated value is stored is also deleted from the value cache 115.
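  • The demotion check for an update that hits in the value cache can be sketched as follows. The description leaves open exactly how instances of the updated value are counted, so this example assumes the count is the number of value cache pointers already referencing the updated value plus the entry being updated. The function name `update_in_value_cache` and the dictionary-based model are hypothetical, and reclaiming a value buffer slot that no longer has any pointers is not modeled.

```python
def update_in_value_cache(addr, new_value, value_buffer, pointer_array,
                          primary, threshold):
    """Update `addr` (assumed to hit in the value cache) to `new_value`.

    Returns "kept" if the entry stays in the value cache, or "demoted" if it
    is moved to the primary cache.
    """
    # Slot already holding new_value in the value buffer, if any.
    slot = value_buffer.index(new_value) if new_value in value_buffer else None

    # Assumed criterion: other pointers to new_value, plus this entry.
    others = 0
    if slot is not None:
        others = sum(1 for a, s in pointer_array.items()
                     if a != addr and s == slot)
    count = others + 1

    if count > threshold:
        if slot is None:
            value_buffer.append(new_value)
            slot = len(value_buffer) - 1
        pointer_array[addr] = slot  # repoint; the entry stays compressed
        return "kept"

    # Criterion not met: delete the pointer and demote to the primary cache.
    del pointer_array[addr]
    primary[addr] = new_value
    return "demoted"
```

With five pointers to “45” and an assumed threshold of four, updating one entry to “50” leaves four pointers to “45” in the value cache and places “50” in the primary cache.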
  • Referring back to determination block 820, if there is a miss in the value cache 115, at determination block 840, the cache management component 110 determines whether there is a hit in the primary cache 120 for the input memory address. If there is a hit in the primary cache 120, at block 845, the cache management component 110 updates the data in the primary cache 120. In some embodiments, after updating the primary cache 120, at determination block 850, the cache management component 110 determines if the updated value is to be promoted to the value cache 115. Consider that a criterion for promoting the candidate value to the value cache 115 is that a number of instances of the candidate value exceeds a specified threshold, e.g., “4.” Further, consider that the primary cache 120 stores four instances of a value, e.g., “45.” Upon updating another entry in the primary cache 120, e.g., from “50” to “45,” the number of instances of the value “45” increases to “5” and therefore, the updated value “45” satisfies the criterion for being promoted to the value cache 115. Accordingly, at block 855, the cache management component 110 promotes the updated value to the value cache 115 if the updated value satisfies the criterion, and at block 860, evicts the instances of the updated value from the primary cache 120. If the updated value does not satisfy the criterion for being promoted to the value cache 115, the process 800 returns from determination block 850.
  • Referring back to determination block 840, if there is a miss in the primary cache 120, at block 865, the cache management component 110 proceeds to update the data at memory 125.
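  • The promotion branch of process 800 (an update that hits in the primary cache and pushes the instance count over the threshold) can be sketched as below, using the “45” example from the description. The function name `update_in_primary` and the dictionary-based model are assumptions for illustration; the sketch assumes the address already hits in the primary cache.

```python
def update_in_primary(addr, new_value, primary, value_buffer, pointer_array,
                      threshold):
    """Update `addr` in the primary cache, promoting if the criterion is met.

    Returns True if new_value was promoted to the value cache.
    """
    primary[addr] = new_value  # perform the update in the primary cache

    # Count the instances of the updated value now in the primary cache.
    matches = [a for a, v in primary.items() if v == new_value]
    if len(matches) <= threshold:
        return False  # criterion not met; nothing else to do

    # Promote: a single instance in the value buffer, one pointer per copy.
    if new_value in value_buffer:
        slot = value_buffer.index(new_value)
    else:
        value_buffer.append(new_value)
        slot = len(value_buffer) - 1
    for a in matches:
        pointer_array[a] = slot
        del primary[a]  # evict the copy from the primary cache
    return True


# Four copies of 45 plus one entry holding 50; assumed threshold of 4.
primary = {0x10: 45, 0x14: 45, 0x18: 45, 0x1C: 45, 0x20: 50}
value_buffer, pointer_array = [], {}
promoted = update_in_primary(0x20, 45, primary, value_buffer, pointer_array, 4)
```

After the update in this example, the fifth instance of “45” triggers promotion, so the value buffer holds one copy of “45”, the pointer array holds five pointers, and the primary cache no longer holds any copy of “45”.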
  • FIG. 9 is a block diagram of a computer system as may be used to implement features of the disclosed embodiments. The computing system 900 may be used to implement any of the entities, components, modules, systems, or services depicted in the examples of the foregoing figures (and any other entities described in this specification). The computing system 900 may include one or more central processing units (“processors”) 905, memory 910, input/output devices 925 (e.g., keyboard and pointing devices, display devices), storage devices 920 (e.g., disk drives), and network adapters 930 (e.g., network interfaces) that are connected to an interconnect 915. The interconnect 915 is illustrated as an abstraction that represents any one or more separate physical buses, point to point connections, or both connected by appropriate bridges, adapters, or controllers. The interconnect 915, therefore, may include, for example, a system bus, a Peripheral Component Interconnect (PCI) bus or PCI-Express bus, a HyperTransport or industry standard architecture (ISA) bus, a small computer system interface (SCSI) bus, a universal serial bus (USB), IIC (I2C) bus, or an Institute of Electrical and Electronics Engineers (IEEE) standard 1394 bus, also called “Firewire”.
  • The memory 910 and storage devices 920 are computer-readable storage media that may store instructions that implement at least portions of the described embodiments. In addition, the data structures and message structures may be stored or transmitted via a data transmission medium, such as a signal on a communications link. Various communications links may be used, such as the Internet, a local area network, a wide area network, or a point-to-point dial-up connection. Thus, computer readable media can include computer-readable storage media (e.g., “non transitory” media).
  • The instructions stored in memory 910 can be implemented as software and/or firmware to program the processor(s) 905 to carry out actions described above. In some embodiments, such software or firmware may be initially provided to the computing system 900 by downloading it from a remote system (e.g., via network adapter 930).
  • The embodiments introduced herein can be implemented by, for example, programmable circuitry (e.g., one or more microprocessors) programmed with software and/or firmware, or entirely in special-purpose hardwired (non-programmable) circuitry, or in a combination of such forms. Special-purpose hardwired circuitry may be in the form of, for example, one or more ASICs, PLDs, FPGAs, etc.
  • Remarks
  • The above description and drawings are illustrative and are not to be construed as limiting. Numerous specific details are described to provide a thorough understanding of the disclosure. However, in some instances, well-known details are not described in order to avoid obscuring the description. Further, various modifications may be made without deviating from the scope of the embodiments. Accordingly, the embodiments are not limited except as by the appended claims.
  • Reference in this specification to “one embodiment” or “an embodiment” means that a specified feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
  • The terms used in this specification generally have their ordinary meanings in the art, within the context of the disclosure, and in the specific context where each term is used. Terms that are used to describe the disclosure are discussed below, or elsewhere in the specification, to provide additional guidance to the practitioner regarding the description of the disclosure. For convenience, some terms may be highlighted, for example using italics and/or quotation marks. The use of highlighting has no influence on the scope and meaning of a term; the scope and meaning of a term is the same, in the same context, whether or not it is highlighted. It will be appreciated that the same thing can be said in more than one way. One will recognize that “memory” is one form of a “storage” and that the terms may on occasion be used interchangeably.
  • Consequently, alternative language and synonyms may be used for any one or more of the terms discussed herein, and no special significance is to be placed upon whether or not a term is elaborated or discussed herein. Synonyms for some terms are provided. A recital of one or more synonyms does not exclude the use of other synonyms. The use of examples anywhere in this specification, including examples of any term discussed herein, is illustrative only, and is not intended to further limit the scope and meaning of the disclosure or of any exemplified term. Likewise, the disclosure is not limited to various embodiments given in this specification.
  • Those skilled in the art will appreciate that the logic illustrated in each of the flow diagrams discussed above may be altered in various ways. For example, the order of the logic may be rearranged, substeps may be performed in parallel, illustrated logic may be omitted, other logic may be included, etc.
  • Without intent to further limit the scope of the disclosure, examples of instruments, apparatus, methods and their related results according to the embodiments of the present disclosure are given below. Note that titles or subtitles may be used in the examples for convenience of a reader, which in no way should limit the scope of the disclosure. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. In the case of conflict, the present document, including definitions, will control.

Claims (20)

I/we claim:
1. A computer-implemented method, comprising:
identifying, by a computer system and in a primary cache, multiple instances of a candidate value;
promoting, by the computer system, the candidate value to a value cache of the computer system by storing a single instance of the candidate value in a value buffer associated with the value cache, wherein the value cache is distinct from the primary cache;
storing, by the computer system, multiple instances of a pointer to the candidate value in a pointer array data structure associated with the value cache, wherein a number of the instances of the pointer is the same as the number of instances of the candidate value in the primary cache; and
evicting, by the computing system, the multiple instances of the candidate value from the primary cache.
2. The computer-implemented method of claim 1 further comprising:
receiving a read request at the computer system;
retrieving a memory address to be read from the read request;
performing, in parallel, a lookup in both the primary cache and the value cache for the memory address; and
retrieving a specified value associated with the memory address from either the primary cache or the value cache.
3. The computer-implemented method of claim 2, wherein retrieving the specified value includes:
confirming a hit in the value cache for the memory address, and
retrieving the specified value from value cache.
4. The computer-implemented method of claim 2, wherein retrieving the specified value includes:
confirming a miss in the value cache for the memory address, and
retrieving the specified value from the primary cache or a memory of the computer system.
5. The computer-implemented method of claim 4, wherein retrieving the specified value includes:
confirming a hit in the primary cache for the memory address, and
retrieving the specified value from the primary cache.
6. The computer-implemented method of claim 4, wherein retrieving the specified value includes:
confirming a miss in the primary cache for the memory address, and
retrieving the specified value from the memory.
7. The computer-implemented method of claim 1, wherein serving the read request includes:
retrieving an input memory address from the read request, the retrieving including retrieving an input tag, an input index and an input offset of the input memory address,
determining a cache hit in the value cache for the input memory address, the cache hit occurring in an event the input tag matches a specified tag stored in a specified row of a tag array associated with the value cache, the specified row determined based on the input index,
retrieving a specified pointer from the pointer array based on the input index and the input offset portion, and
retrieving the candidate value from the value buffer based on the specified pointer.
8. The computer-implemented method of claim 7, wherein determining the cache hit includes:
identifying, using the input index, the specified row in the tag array, the tag array including multiple rows, the rows storing a second portion of memory addresses of values stored in the value cache,
comparing the input tag with the specified tag stored in the specified row,
confirming the cache hit in an event the input tag matches the specified tag.
9. The computer-implemented method of claim 1 further comprising:
serving, by the computing system, a read request for the candidate value from the value cache, wherein executing the application in the record mode includes modifying the first version and the second version of the application to make application code to be deterministic.
10. The computer-implemented method of claim 1 further comprising:
updating, in response to receiving an update request, a value associated with a specified memory address in the primary cache to a specified value;
determining that, in response to the updating, a number of instances of the specified value in the primary cache has met a criterion for promotion to the value cache; and
promoting the specified value to the value cache by writing a single instance of the specified value to a specified location in the value buffer and storing a number of pointers to the specified location in the pointer array, the number of pointers corresponding to the number of instances of the specified value.
11. The computer-implemented method of claim 1 further comprising:
receiving an update request to update a specified instance of the candidate value in the value cache to a specified value, the specified value being different from the candidate value;
updating a specified memory address associated with the specified instance of the candidate value to the specified value;
evicting the specified instance from the value cache; and
adding the specified value to the primary cache based on cache replacement policy associated with the primary cache.
12. The computer-implemented method of claim 1, wherein evicting the specified instance includes deleting a specified pointer associated with the specified instance from the pointer array.
13. The computer-implemented method of claim 1, wherein the primary cache and the value cache are exclusive to each other.
14. The computer-implemented method of claim 1, wherein the value cache is of the same hierarchy as the primary cache.
15. The computer-implemented method of claim 1, wherein identifying the multiple instances of the candidate value includes identifying a value in the primary cache whose number of instances exceeds a specified threshold.
16. A computer-readable storage medium storing computer-readable instructions, comprising:
instructions for retrieving a memory address to be read from a read request received at a computer system;
instructions for performing, in parallel, a lookup in both a primary cache and a value cache of the computer system for the memory address,
wherein the value cache stores a candidate value from the primary cache in a compressed form by:
storing multiple instances of the candidate value that are in the primary cache as a single instance in the value cache, and
evicting the multiple instances of the candidate value from the primary cache; and
instructions for retrieving a specified value associated with the memory address from the value cache in an event of a cache hit at the value cache.
17. The computer-readable storage medium of claim 16, wherein the instructions for storing the multiple instances of the candidate value as the single instance include instructions for storing multiple instances of a pointer that points to the single instance in a pointer array of the value cache, wherein a number of instances of the pointer stored is the same as that of the instances of the candidate value.
18. The computer-readable storage medium of claim 16, wherein the value cache is allocated a portion of memory from the primary cache.
19. The computer-readable storage medium of claim 16, wherein the instructions for evicting the multiple instances of the candidate value from the primary cache include instructions for storing distinct values in the primary cache and promoting duplicate values to the value cache.
20. A system, comprising:
a processor;
a first component configured to identify multiple instances of a candidate value in a primary cache;
a second component configured to:
promote the candidate value to a value cache of the computer system by storing a single instance of the candidate value in a value buffer associated with the value cache, wherein the value cache is distinct from the primary cache,
store multiple instances of a pointer to the candidate value in a pointer array data structure associated with the value cache, and
evict the multiple instances of the candidate value from the primary cache; and
a third component configured to serve a read request for the candidate value from the value cache.

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/372,135 US9990301B1 (en) 2016-12-07 2016-12-07 Value cache in a computing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US15/372,135 US9990301B1 (en) 2016-12-07 2016-12-07 Value cache in a computing system

Publications (2)

Publication Number Publication Date
US9990301B1 US9990301B1 (en) 2018-06-05
US20180157593A1 true US20180157593A1 (en) 2018-06-07

Family

ID=62235039

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/372,135 Active US9990301B1 (en) 2016-12-07 2016-12-07 Value cache in a computing system

Country Status (1)

Country Link
US (1) US9990301B1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11354247B2 (en) * 2017-11-10 2022-06-07 Smart IOPS, Inc. Devices, systems, and methods for configuring a storage device with cache
US10394474B2 (en) 2017-11-10 2019-08-27 Smart IOPS, Inc. Devices, systems, and methods for reconfiguring storage devices with applications
US11580030B2 (en) 2019-08-18 2023-02-14 Smart IOPS, Inc. Devices, systems, and methods of logical-to-physical address mapping
US12105626B2 (en) 2022-02-09 2024-10-01 Dell Products L.P. Method for optimized cache insertion
US12105627B2 (en) 2022-02-09 2024-10-01 Dell Products L.P. Methods for cache insertion using ghost lists
US20220222178A1 (en) * 2022-03-31 2022-07-14 Intel Corporation Selective fill for logical control over hardware multilevel memory
US20230333985A1 (en) * 2022-04-13 2023-10-19 Dell Products L.P. Methods for cache insertion and cache eviction using ghost list in a cache system that includes a reverse cache and a main cache
US12105639B2 (en) 2022-05-17 2024-10-01 Dell Products L.P. Methods for cache insertion and cache eviction in a cache system that includes a reverse cache and a main cache
US20240272908A1 (en) * 2023-02-13 2024-08-15 Arm Limited Load-with-substitution instruction

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9678976B2 (en) * 2014-07-21 2017-06-13 Red Hat, Inc. Distributed deduplication using locality sensitive hashing

Also Published As

Publication number Publication date
US9990301B1 (en) 2018-06-05

Legal Events

Date Code Title Description
AS Assignment

Owner name: FACEBOOK, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KANAUJIA, SHOBHIT O;SALADI, KALYAN;VIJAYRAO, NARSING;SIGNING DATES FROM 20170110 TO 20170117;REEL/FRAME:041050/0665

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

AS Assignment

Owner name: META PLATFORMS, INC., CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNOR:FACEBOOK, INC.;REEL/FRAME:058871/0336

Effective date: 20211028