US20220083474A1 - Write-back cache device - Google Patents
- Publication number
- US20220083474A1 (application Ser. No. 17/186,192)
- Authority
- US
- United States
- Prior art keywords
- data
- cache
- state instruction
- state
- unit data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0891—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using clearing, invalidating or resetting means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0804—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
- G06F12/0238—Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0893—Caches characterised by their organisation or structure
- G06F12/0895—Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1024—Latency reduction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1041—Resource optimization
- G06F2212/1044—Space efficiency improvement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/20—Employing a main memory using a specific memory technology
- G06F2212/202—Non-volatile memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/50—Control mechanisms for virtual memory, cache or TLB
- G06F2212/502—Control mechanisms for virtual memory, cache or TLB using adaptive policy
Definitions
- FIG. 1 is a diagram illustrating the configuration of a computer including a write-back cache device according to a first embodiment
- FIG. 2 is a diagram for description of a correspondence relation between cache line data stored in a first storage device and data stored in a main memory according to the first embodiment
- FIG. 3 is a table for description of a relation between an address in the main memory and each of a tag, an index, and a cache line size according to the first embodiment
- FIG. 4 is a table for description of an example of data stored in first to third storage devices according to the first embodiment
- FIG. 5 is a table illustrating a specific example of values of state instruction data according to the first embodiment
- FIG. 6 is a flowchart illustrating processing performed by the write-back cache device when there is an input access according to the first embodiment
- FIG. 7 is a flowchart illustrating processing of writing from the write-back cache device to the main memory according to the first embodiment
- FIG. 8 is a table for description of effects of the write-back cache device according to the first embodiment.
- FIG. 9 is a flowchart illustrating processing performed by the write-back cache device when there is an input access according to a second embodiment
- FIG. 10 is a timing chart illustrating an example in which stall occurrence is avoided by not reading the state instruction data when there is a consecutive request for writing of unit data in the write-back cache device according to the second embodiment
- FIG. 11 is a table for description of a first comparative example in which one dirty bit corresponds to one cache line
- FIG. 12 is a table for description of a second comparative example in which dirty bits correspond to a plurality of respective pieces of unit data in one cache line.
- FIG. 13 is a timing chart illustrating a third comparative example in which stall occurs due to a consecutive request for writing of unit data.
- a write-back cache device of an embodiment includes: a first storage device capable of storing n (n is an integer equal to or larger than two) pieces of unit data in each of a plurality of cache lines, each piece of unit data being a unit of writing to a main memory; a second storage device configured to store state instruction data in each of the plurality of cache lines; and a cache controller configured to control inputting to and outputting from the first and second storage devices.
- In a first state in which the n pieces of unit data stored in a first cache line in the first storage device are not different from the data stored at a first address in the main memory, the first address corresponding to the first cache line, first state instruction data in the first cache line has a first value.
- In a second state in which two or more pieces of unit data among the n pieces of unit data stored in the first cache line are different from the data stored at the first address, the first state instruction data has a second value.
- In a third state in which only one piece of unit data (first unit data) among the n pieces of unit data stored in the first cache line is different from the data stored at the first address, the first state instruction data has a third value.
- FIG. 1 is a diagram illustrating the configuration of a computer including a write-back cache device 2 according to the present embodiment.
- the computer includes a CPU 1 , the write-back cache device 2 , a bus 3 , and a main memory 4 .
- the CPU 1 is a central processing unit configured to process information by sequentially reading, interpreting, and executing computer programs, data, and the like stored in the main memory 4 .
- the bus 3 is a common transmission path through which a plurality of devices transmit instructions and data.
- the main memory 4 is configured with a non-volatile memory such as an MRAM.
- the main memory 4 stores, for example, a computer program to be executed by the CPU 1 , and data to be processed by the CPU 1 through execution of the computer program.
- the CPU 1 reads computer programs, data, and the like from the main memory 4 through the bus 3 , and stores processing results in the main memory 4 through the bus 3 .
- the write-back cache device 2 is provided between the CPU 1 and the bus 3 to reduce the number of times of writing and the amount of data written to the main memory 4 configured with a non-volatile memory.
- At data writing, the write-back cache device 2 updates only the data in the write-back cache device 2 and not the data in the main memory 4.
- Thus, the data stored in the main memory 4 and the data stored in the write-back cache device 2 become different from each other in some cases.
- When the write-back operation to write the data stored in the write-back cache device 2 to the main memory 4 is performed, the difference is eliminated and the data match between the main memory 4 and the write-back cache device 2.
- the write-back cache device 2 includes a first storage device 21 , a second storage device 22 , a third storage device 23 , and a cache controller 24 .
- FIG. 2 is a diagram for description of a correspondence relation between cache line data stored in the first storage device 21 and data stored in the main memory 4 .
- FIG. 3 is a table for description of a relation between an address in the main memory 4 and each of a tag TG, an index IDX, and a cache line size LSZ.
- FIG. 4 is a table for description of an example of data stored in the first storage device 21 , the second storage device 22 , and the third storage device 23 .
- the write-back cache device 2 includes a cache line, and one cache line is indicated as one row in FIG. 4 .
- a cache line is a storage region provided across the first storage device 21 , the second storage device 22 , and the third storage device 23 and specified by the index IDX.
- the write-back cache device 2 includes a cache line array in which a plurality (in the example illustrated in FIG. 4 , i (i is a positive integer)) of cache lines each indicated as one row are arrayed in a column direction.
- the first storage device 21 can store, in one cache line, n (n is an integer equal to or larger than two) pieces of unit data UD as a unit of writing to the main memory 4 .
- the unit data UD is, for example, word data (for example, 16-bit data) or double word data but not limited to these kinds of data and may be data of any number of bits.
- Cache line data is n pieces of unit data (unit data UD0 to UD(n−1) illustrated in FIG. 4) stored in the first storage device 21 for one cache line.
- In FIG. 2, each rectangle partitioned by grid lines represents data having the data size of one piece of cache line data.
- The main memory 4, which has a storage capacity larger than that of the first storage device 21, can store data having the data size of (i × j) (j is a positive integer) pieces of cache line data.
- a place in which data is stored in the main memory 4 is specified by an address.
- The cache controller 24 decodes an address in the main memory 4, which is forwarded together with unit data UD, into the cache line size LSZ at a low level, the index IDX at an intermediate level, and the tag TG at a high level, as illustrated in FIG. 3.
- The tag TG specifies a column number (0 to (j−1)) in the main memory 4 illustrated in FIG. 2.
- The index IDX specifies a row number (0 to (i−1)) in the main memory 4 illustrated in FIG. 2 and a cache line in the write-back cache device 2.
- The cache line size LSZ specifies the position (0 to (n−1)) of unit data UD within data having the data size of cache line data.
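The address decoding above can be sketched as follows. The concrete widths (i = 256 cache lines, n = 8 pieces of unit data UD per line) and the assumption that addresses are counted in units of unit data UD are illustrative choices, not values from the embodiment.

```python
# Minimal sketch of the FIG. 3 address decoding (hypothetical widths).
N_UNITS = 8    # n: pieces of unit data UD per cache line (assumption)
N_LINES = 256  # i: number of cache lines (assumption)

LSZ_BITS = (N_UNITS - 1).bit_length()  # 3 low bits: position in the line
IDX_BITS = (N_LINES - 1).bit_length()  # 8 middle bits: cache line index

def decode_address(addr: int):
    """Split a main-memory address into (tag TG, index IDX, offset LSZ)."""
    lsz = addr & (N_UNITS - 1)                # low level
    idx = (addr >> LSZ_BITS) & (N_LINES - 1)  # intermediate level
    tag = addr >> (LSZ_BITS + IDX_BITS)       # high level
    return tag, idx, lsz
```

With these widths, the tag occupies all bits above bit 10; narrower or wider fields follow directly from different choices of i and n.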
- the second storage device 22 stores state instruction data SID in each of a plurality of cache lines.
- the second storage device 22 is configured with, for example, a flip-flop or an SRAM.
- the third storage device 23 stores the tag TG in each of a plurality of cache lines.
- the cache controller 24 is a control circuit configured to control inputting to and outputting from the first storage device 21 , the second storage device 22 , and the third storage device 23 .
- the cache controller 24 stores, in a cache line specified by an index IDX in the third storage device 23 , a tag TG obtained by decoding the address in the main memory 4 .
- the cache controller 24 stores the unit data UD at a position specified by a cache line size LSZ in the cache line specified by the index IDX in the first storage device 21 .
- the cache controller 24 rewrites, as described later with reference to FIG. 7 , state instruction data SID stored in the cache line specified by the index IDX in the second storage device 22 .
- The cache controller 24 performs the write-back operation to write, to the main memory 4, unit data UD stored in a cache line in the first storage device 21. Then, when the cache line data in a cache line has been written to the main memory 4, the cache controller 24 performs control to rewrite the value of the state instruction data SID in the cache line to a value indicating a first state.
- the first state is a state in which data stored at a first address in the main memory 4 is not different from n pieces of unit data stored in a first cache line in the first storage device 21 , the first cache line corresponding to the first address.
- FIG. 11 is a table for description of a first comparative example in which one dirty bit DB corresponds to one cache line.
- the dirty bit DB is one-bit data holding information of whether cache line data stored in the same cache line in the first storage device 21 is “dirty” (bit value 1) or “clean” (bit value 0).
- “Dirty” indicates that the data stored at the first address in the main memory 4 is different from the cache line data stored in the first cache line in the first storage device 21 , the first cache line corresponding to the first address.
- “Clean” indicates that the data stored at the first address in the main memory 4 is not different from the cache line data stored in the first cache line in the first storage device 21 , the first cache line corresponding to the first address.
- FIG. 12 is a table for description of a second comparative example in which dirty bits DB correspond to a plurality of respective pieces of unit data UD in one cache line.
- In FIG. 12, illustration of data stored in the first storage device 21 and the third storage device 23 is omitted, and only data stored in the second storage device 22 is illustrated.
- In the second comparative example, n dirty bits DB are provided for the n pieces of unit data UD, respectively.
- Thus, the second storage device 22 needs a storage capacity n times larger than in the first comparative example illustrated in FIG. 11, which leads to an increase in circuit dimensions.
- To avoid this increase, the second storage device 22 of the present embodiment employs a configuration as illustrated in FIG. 4.
- the second storage device 22 of the present embodiment stores state instruction data SID configured with a plurality of bits in one cache line.
- the state instruction data SID is configured with bits in a number smaller than n with which it is possible to distinguish a total of (n+2) states of one first state in which all n pieces of unit data UD stored in one cache line are “clean”, one second state in which two or more of the n pieces of unit data UD are “dirty”, and n sub states each corresponding to a position at which unit data UD is “dirty” in a third state in which only one of the n pieces of unit data UD is “dirty”. Accordingly, it is possible to reduce the number of bits of state instruction data SID stored in the second storage device 22 as compared to the second comparative example illustrated in FIG. 12 , and thus reduce circuit dimensions of the second storage device 22 .
- the state instruction data SID is more preferably configured with a minimum number of bits with which (n+2) states can be distinguished from one another. Accordingly, the circuit dimensions of the second storage device 22 can be minimized.
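As a rough check of the claim above, the minimum width of the state instruction data SID is the smallest number of bits that can distinguish (n + 2) values; the helper below is an illustrative calculation, not code from the embodiment.

```python
from math import ceil, log2

def sid_bits(n: int) -> int:
    """Minimum number of bits distinguishing (n + 2) states: one first
    state (all clean), one second state (two or more dirty), and n sub
    states of the third state (exactly one dirty piece of unit data)."""
    return ceil(log2(n + 2))

# For n = 8, the 10 states fit in 4 bits, half of the 8 per-unit dirty
# bits of FIG. 12; for n = 16, 5 bits suffice instead of 16.
```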
- FIG. 5 is a table illustrating a specific example of values of the state instruction data SID.
- FIG. 5 corresponds to an example in which the number n of pieces of unit data UD stored in one cache line is eight.
- the state instruction data SID is configured with data of four bits, which is a minimum number of bits with which 10 states can be distinguished from one another.
- the data amount of the state instruction data SID in one cache line is half the data amount of the configuration of FIG. 12 in which eight dirty bits DB are provided.
- the state instruction data SID may be configured with five to seven bits.
- the state instruction data SID of four bits can have a decimal value of 0 to 15.
- The value of the state instruction data SID is set to zero in the case of the above-described first state, and is set to 15, irrespective of the positions at which two or more pieces of unit data UD are "dirty", in the case of the above-described second state.
- In the case of the above-described third state, the value of the state instruction data SID is set to one to eight, corresponding to the respective sub states in which the position of the dirty unit data UD is 0 to 7.
- FIG. 6 is a flowchart illustrating processing performed by the write-back cache device 2 when there is an input access.
- the processing illustrated in FIG. 6 is executed when the cache controller 24 receives, as input access from the CPU 1 , a write request including an address in the main memory 4 (refer to FIG. 3 ) and unit data UD.
- the cache controller 24 decodes the address in the main memory 4 to extract a tag TG, an index IDX, and a cache line size LSZ and reads, from the second storage device 22 , state instruction data SID of a cache line specified by the index IDX (step S 1 ).
- the cache controller 24 determines which of the first state, the second state, and the third state is indicated by the value of the read state instruction data SID (step S 2 ).
- When it is determined at step S2 that the first state is indicated, the cache controller 24 rewrites the state instruction data SID of the cache line to the value of the sub state corresponding to the position (the position indicated by the cache line size LSZ extracted at step S1) of the unit data UD to be newly written in the third state (step S3).
- When it is determined at step S2 that the second state is indicated, the cache controller 24 does not change the state instruction data SID of the cache line because no change from the second state occurs even when new unit data UD is written to the cache line (step S4).
- When it is determined at step S2 that the third state is indicated, the cache controller 24 determines whether the position of the unit data UD to be newly written matches the position of the unit data UD that is already "dirty" (step S5).
- When it is determined at step S5 that the position of the unit data UD to be newly written matches the position of the unit data UD that is already "dirty", the cache controller 24 proceeds to step S4 described above and does not change the state instruction data SID of the cache line.
- When it is determined at step S5 that the position of the unit data UD to be newly written does not match the position of the unit data UD that is already "dirty", the cache controller 24 rewrites the value of the state instruction data SID of the cache line to the value ("15" in the example illustrated in FIG. 5) indicating the second state (step S6).
- Having performed the processing at step S3, S4, or S6, the cache controller 24 writes the unit data UD at the position of the unit data UD in the cache line in the first storage device 21, and writes the tag TG extracted at step S1 to the cache line in the third storage device 23 (step S7).
- the cache controller 24 ends the processing illustrated in FIG. 6 .
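The bifurcation at steps S2 through S6 can be sketched as a small function over the FIG. 5 encoding (0 for the first state, 1 to 8 for the sub states of the third state with the dirty position at 0 to 7, 15 for the second state); the function and constant names below are hypothetical.

```python
CLEAN = 0         # first state: all pieces of unit data UD are clean
MULTI_DIRTY = 15  # second state: two or more pieces are dirty
                  # 1..8: third state, single dirty piece at position 0..7

def update_sid_on_write(sid: int, pos: int) -> int:
    """New state instruction data SID after unit data UD is written at
    position pos (0..7) of the cache line (FIG. 6, steps S2 to S6)."""
    if sid == CLEAN:
        return pos + 1        # S3: enter the sub state for this position
    if sid == MULTI_DIRTY:
        return sid            # S4: stays in the second state
    if sid == pos + 1:
        return sid            # S5 match -> S4: same piece rewritten
    return MULTI_DIRTY        # S6: a second piece becomes dirty
```

For example, two writes to position 3 leave the SID at 4, while a write to position 3 followed by one to position 5 yields 15.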
- FIG. 7 is a flowchart illustrating processing of writing from the write-back cache device 2 to the main memory 4 .
- Write-back processing illustrated in FIG. 7 is performed on background in a duration in which no access to the main memory 4 is performed.
- the write-back processing illustrated in FIG. 7 is also performed when a new cache line is to be ensured in the write-back cache device 2 in response to a write request from the CPU 1 but there is no unused cache line.
- Before writing unit data UD in a cache line in the first storage device 21 to the main memory 4, the cache controller 24 reads the state instruction data SID of the cache line from the second storage device 22 (step S11).
- The cache controller 24 determines which of the first state, the second state, and the third state is indicated by the value of the read state instruction data SID (step S12).
- When it is determined at step S12 that the first state is indicated, the cache controller 24 writes none of the n pieces of unit data UD in the cache line to the main memory 4 (step S13) and does not change the value of the state instruction data SID of the cache line (step S14).
- When it is determined at step S12 that the third state is indicated, the cache controller 24 writes only the one piece of unit data UD at the position indicated by the state instruction data SID in the cache line to the main memory 4 (step S15).
- When it is determined at step S12 that the second state is indicated, the cache controller 24 writes all n pieces of unit data UD in the cache line to the main memory 4 (step S16).
- Having performed the processing at step S15 or S16, the cache controller 24 rewrites the value of the state instruction data SID of the cache line to the value (in the example illustrated in FIG. 5, "0") indicating the first state (step S17).
- the cache controller 24 ends the processing illustrated in FIG. 7 .
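Assuming the same FIG. 5 encoding, and modeling a cache line and the main memory as plain lists, the write-back decision at steps S12 to S17 might look like this sketch (the names and the list-based memory model are illustrative assumptions).

```python
CLEAN = 0         # first state in the FIG. 5 encoding
MULTI_DIRTY = 15  # second state; values 1..8 encode the dirty position

def write_back(sid: int, line: list, memory: list, base: int) -> int:
    """Write the dirty unit data UD of one cache line back to memory
    (FIG. 7, steps S12 to S16) and return the new SID (steps S14/S17)."""
    if sid == MULTI_DIRTY:
        memory[base:base + len(line)] = line    # S16: write all n pieces
    elif sid != CLEAN:
        memory[base + sid - 1] = line[sid - 1]  # S15: write the one piece
    # sid == CLEAN: S13, nothing is written
    return CLEAN  # after write-back the line matches memory
```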
- FIG. 8 is a table for description of effects of the write-back cache device 2 .
- In a first benchmark scheme, the fraction in which the number of pieces of dirty unit data UD at writing to the main memory 4 is one is equal to or larger than 75%.
- In a second benchmark scheme, the fraction in which the number of pieces of dirty unit data UD at writing to the main memory 4 is one is approximately 30%.
- A parameter such as the cache capacity is tuned so that the cache hit rate is approximately 90%.
- When the number of pieces of written unit data UD of Configuration A is normalized to one, the number of pieces of written unit data UD of Configuration B is approximately 0.74, and the number of pieces of written unit data UD of Configuration C is approximately 0.62.
- In the second benchmark scheme, the fraction in which the number of pieces of dirty unit data UD at writing to the main memory 4 is equal to or larger than two is higher than the corresponding fraction in the first benchmark scheme. Accordingly, the number of pieces of written unit data UD is larger than in the first benchmark scheme for each of Configurations B and C, but still, effects relatively close to the effects of Configuration C are obtained with Configuration B of the present embodiment.
- When n is 16, for example, 16 bits are needed with the configuration of FIG. 12, whereas five bits suffice for the state instruction data SID of the present embodiment, and thus the configuration of the present embodiment only needs a data amount less than 1/3 of the data amount with the configuration of FIG. 12. Accordingly, the effect of reducing the circuit dimensions of the second storage device 22 with the configuration of the present embodiment, as compared with the configuration of FIG. 12, increases as n increases.
- In the present embodiment, based on the research result that the fraction in which the number of pieces of dirty unit data UD in one cache line is one is high at writing to the main memory 4, the state instruction data SID is configured with a number of bits less than n with which a total of (n + 2) states, namely one first state, one second state, and the n sub states of the third state, can be distinguished from one another. It is therefore possible to reduce the amount of data written to the main memory 4 and to reduce the circuit dimensions of the second storage device 22.
- When the second storage device 22 is configured with a flip-flop, the chip area for storing one bit of data is larger than the chip area for storing one bit of data when the second storage device 22 is configured with an SRAM.
- the chip area can be effectively reduced by using the configuration of the present embodiment with which the number of bits of state instruction data SID stored in one cache line in the second storage device 22 is reduced as compared to the configuration illustrated in FIG. 12 .
- The circuit dimensions of the second storage device 22 can be minimized by configuring the state instruction data SID with the minimum number of bits with which the (n + 2) states can be distinguished from one another.
- Since the state instruction data SID is read before the unit data UD is written, and the state instruction data SID after the unit data UD is written to the cache line is set in accordance with the read state instruction data SID, which of the n pieces of unit data UD in the cache line is "dirty" can be accurately reflected in the state instruction data SID.
- a second embodiment will be described below with reference to FIGS. 9 and 10 .
- Parts same as those of the first embodiment are denoted by the same reference signs, description thereof is omitted as appropriate, and differences will be mainly described below.
- the second storage device 22 in the second embodiment is configured with an SRAM or the like that cannot perform reading and writing in the same operation cycle.
- As illustrated in FIG. 6, whether the processing at step S3, S4, or S6 is to be performed depends on the results of the determination, at steps S2 and S5, on the state instruction data SID (whose values are illustrated in FIG. 5). Thus, the already written state instruction data SID needs to be read before new state instruction data SID is written to the second storage device 22.
- FIG. 13 is a timing chart illustrating a third comparative example in which a stall occurs due to a consecutive request for writing of unit data UD. Note that each interval partitioned by dotted lines in FIGS. 10 and 13 indicates one operation cycle of the write-back cache device 2.
- In a first operation cycle, when there is a first input access (W0) that requests writing of unit data UD, the tag TG and the state instruction data SID of the cache line specified by the index IDX extracted from the address in the main memory 4 are read in the same operation cycle (R0).
- In a second operation cycle following the first operation cycle, the unit data UD is written at the corresponding position of the unit data UD in the cache line in the first storage device 21 (W0).
- In addition, state instruction data SID corrected as necessary based on the state instruction data SID read at (R0) is written to the second storage device 22 (W0).
- Since the second storage device 22 cannot perform reading and writing in the same operation cycle, when a second input access requesting writing of unit data UD arrives in the second operation cycle, a stall occurs and the cache controller 24 receives the second input access (W1) in a third operation cycle following the second operation cycle.
- Then, a tag TG and state instruction data SID are read in the third operation cycle (R1), and, in a fourth operation cycle following the third operation cycle, the unit data UD is written at the corresponding position of the unit data UD in the cache line (W1) and state instruction data SID is written to the second storage device 22 (W1), as described above for the first and second operation cycles.
- FIG. 9 is a flowchart illustrating processing performed by the write-back cache device 2 when there is an input access.
- When there is an input access, the cache controller 24 determines whether input accesses are performed in consecutive operation cycles, based on whether an input access was also performed in the previous operation cycle (step S21).
- When it is determined at step S21 that input accesses are performed in consecutive operation cycles (YES), the cache controller 24 proceeds to step S6 and rewrites the value of the state instruction data SID of the corresponding cache line to the value indicating the second state.
- a specific example of positive determination at step S 21 is a case in which input access (W 0 ) is performed in the first operation cycle and input access (W 1 ) is performed in the second operation cycle as illustrated in FIG. 10 .
- a consecutive request for writing of unit data UD is often performed to nearby addresses in the main memory 4 . Accordingly, the second input access of consecutive input accesses is highly likely to be performed to a cache line same as a cache line to which the first input access is performed, and the cache line is highly likely to become the second state. Thus, when the cache controller 24 proceeds to step S 6 in a case of the positive determination at step S 21 , the amount of data written to the main memory 4 is unlikely to be increased as compared to a case in which the processing at step S 3 is performed.
- the cache controller 24 ends the processing illustrated in FIG. 9 .
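Combining step S21 with the first-embodiment bifurcation, the second-embodiment update rule might be sketched as follows, again over the hypothetical FIG. 5 encoding; `consecutive` stands for the YES/NO result of step S21, and `read_sid` is the SID read at step S1 (skipped, and thus ignored, on the YES path).

```python
MULTI_DIRTY = 15  # second state in the FIG. 5 encoding; 0 = first state

def sid_after_write(consecutive: bool, read_sid, pos: int) -> int:
    """FIG. 9: on back-to-back write accesses, the SID read is skipped
    and the line is conservatively marked as the second state (S6)."""
    if consecutive:               # YES at S21: avoid the read and a stall
        return MULTI_DIRTY
    if read_sid == 0:
        return pos + 1            # S3: enter the single-dirty sub state
    if read_sid == MULTI_DIRTY or read_sid == pos + 1:
        return read_sid           # S4: no change needed
    return MULTI_DIRTY            # S6: a second piece becomes dirty
```

The conservative choice on the YES path at most over-approximates dirtiness, which is safe: it can only cause extra unit data to be written back, never a missed write-back.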
- FIG. 10 is a timing chart illustrating an example in which stall occurrence is avoided by not reading state instruction data SID when there is a consecutive request for writing of unit data UD in the write-back cache device 2 according to the present embodiment.
- Operation in the first operation cycle is the same as the operation in the first operation cycle in the example described with reference to FIG. 13. Specifically, when there is a first input access in the first operation cycle, the cache controller 24 reads, in the first operation cycle, a tag TG and state instruction data SID related to the cache line on which the first input access is performed (R0).
- In a second operation cycle following the first operation cycle, the unit data UD is written at the corresponding position of the unit data UD in the corresponding cache line in the first storage device 21 (W0).
- State instruction data SID corrected as necessary based on the state instruction data SID read at (R 0 ) is written to the second storage device 22 (W 0 ).
- In the second operation cycle, the state instruction data SID is not read (step S1 is skipped in the case of YES at step S21), and thus the second input access (W1) is received and only a tag TG is read (R1).
- In a third operation cycle following the second operation cycle, the unit data UD is written at the corresponding position of the unit data UD in the corresponding cache line in the first storage device 21 (W1) and state instruction data SID is written to the second storage device 22 (W1), as described above for the fourth operation cycle illustrated in FIG. 13.
- the state instruction data SID written to the second storage device 22 at (W 1 ) has the value indicating the second state (since the cache controller 24 proceeds to step S 6 in case of YES at step S 21 ).
- the cache controller 24 does not read state instruction data SID in the second operation cycle but performs, in the third operation cycle following the second operation cycle, processing of rewriting the value of state instruction data SID related to a cache line to which the second input access is performed to the value indicating the second state.
Abstract
Description
- This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2020-152700 filed in Japan on Sep. 11, 2020; the entire contents of which are incorporated herein by reference.
- Embodiments described herein relate generally to a write-back cache device.
- Recently, employing a non-volatile memory such as an MRAM as a main memory of a computer has been discussed.
- Typically, a non-volatile memory such as an MRAM has a limited number of times that writing is possible, large current consumption at writing, and high latency at writing.
- Thus, it is necessary to reduce the number of times of writing to the main memory and the amount of written data, and a write-back cache memory (hereinafter, a "cache memory" is simply referred to as a "cache" as appropriate) is used as effective means for this purpose.
- A write-back cache includes a plurality of cache lines in each of which, for example, a plurality of pieces of unit data (unit data is, for example, word data) are stored. Whether data yet to be written to the main memory exists in a cache line is managed in units of cache lines by using a dirty bit.
- In a case of clean (bit value 0), the dirty bit indicates that the data in the cache line is already written to the main memory. In a case of dirty (bit value 1), the dirty bit indicates that the data in the cache line is yet to be written to the main memory. Thus, the data in the cache line is written to the main memory when the dirty bit is dirty.
- However, in the configuration in which one dirty bit is allocated to one cache line, even if, for example, only one piece of unit data in a cache line is yet to be written to the main memory, all pieces of unit data in the cache line are written to the main memory, which increases the amount of data written to the main memory.
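The write amplification described above can be illustrated with a small software model. This is a hypothetical sketch, not part of the embodiments; the line size and function name are invented for the example.

```python
# Illustrative model: with one dirty bit per cache line, a single modified
# word still forces the whole line to be written back to main memory.
CACHE_LINE_WORDS = 8  # n pieces of unit data per line (example value)

def words_written_back(dirty_word_positions, per_line_dirty_bit=True):
    """Count the pieces of unit data written to main memory for one line."""
    if not dirty_word_positions:
        return 0
    if per_line_dirty_bit:
        return CACHE_LINE_WORDS       # whole line is written back
    return len(dirty_word_positions)  # ideal case: only dirty words

# One dirty word still costs a full line of writes:
print(words_written_back({3}))                            # 8
print(words_written_back({3}, per_line_dirty_bit=False))  # 1
```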
- FIG. 1 is a diagram illustrating the configuration of a computer including a write-back cache device according to a first embodiment;
- FIG. 2 is a diagram for description of a correspondence relation between cache line data stored in a first storage device and data stored in a main memory according to the first embodiment;
- FIG. 3 is a table for description of a relation between an address in the main memory and each of a tag, an index, and a cache line size according to the first embodiment;
- FIG. 4 is a table for description of an example of data stored in first to third storage devices according to the first embodiment;
- FIG. 5 is a table illustrating a specific example of values of state instruction data according to the first embodiment;
- FIG. 6 is a flowchart illustrating processing performed by the write-back cache device when there is an input access according to the first embodiment;
- FIG. 7 is a flowchart illustrating processing of writing from the write-back cache device to the main memory according to the first embodiment;
- FIG. 8 is a table for description of effects of the write-back cache device according to the first embodiment;
- FIG. 9 is a flowchart illustrating processing performed by the write-back cache device when there is an input access according to a second embodiment;
- FIG. 10 is a timing chart illustrating an example in which stall occurrence is avoided by not reading the state instruction data when there is a consecutive request for writing of unit data in the write-back cache device according to the second embodiment;
- FIG. 11 is a table for description of a first comparative example in which one dirty bit corresponds to one cache line;
- FIG. 12 is a table for description of a second comparative example in which dirty bits correspond to a plurality of respective pieces of unit data in one cache line; and
- FIG. 13 is a timing chart illustrating a third comparative example in which stall occurs due to a consecutive request for writing of unit data.
- A write-back cache device of an embodiment includes: a first storage device capable of storing n (n is an integer equal to or larger than two) pieces of unit data in each of a plurality of cache lines, each piece of unit data being a unit of writing to a main memory; a second storage device configured to store state instruction data in each of the plurality of cache lines; and a cache controller configured to control inputting to and outputting from the first and second storage devices. In a first state in which the n pieces of unit data stored in a first cache line in the first storage device are not different from data stored at a first address in the main memory, the first address corresponding to the first cache line, first state instruction data in the first cache line has a first value. In a second state in which two or more pieces of unit data among the n pieces of unit data stored in the first cache line are different from the data stored at the first address, the first state instruction data has a second value. In a third state in which only one piece of unit data among the n pieces of unit data stored in the first cache line is different from the data stored at the first address, the first state instruction data has a third value.
- Embodiments will be described below with reference to the accompanying drawings.
First Embodiment
- FIG. 1 is a diagram illustrating the configuration of a computer including a write-back cache device 2 according to the present embodiment.
- The computer includes a CPU 1, the write-back cache device 2, a bus 3, and a main memory 4.
- The CPU 1 is a central processing unit configured to process information by sequentially reading, interpreting, and executing computer programs, data, and the like stored in the main memory 4.
- The bus 3 is a common transmission path through which a plurality of devices transmit instructions and data.
- The main memory 4 is configured with a non-volatile memory such as an MRAM. The main memory 4 stores, for example, a computer program to be executed by the CPU 1, and data to be processed by the CPU 1 through execution of the computer program.
- The CPU 1 reads computer programs, data, and the like from the main memory 4 through the bus 3, and stores processing results in the main memory 4 through the bus 3.
- The write-back cache device 2 is provided between the CPU 1 and the bus 3 to reduce the number of times of writing and the amount of data written to the main memory 4 configured with a non-volatile memory.
- Until write-back operation is performed, the write-back cache device 2 updates only data in the write-back cache device 2 but not data in the main memory 4. As a result, the data stored in the main memory 4 and the data stored in the write-back cache device 2 become different from each other in some cases. However, when the write-back operation to write the data stored in the write-back cache device 2 to the main memory 4 is performed, the difference is eliminated and the data match between the main memory 4 and the write-back cache device 2.
- The write-back cache device 2 includes a first storage device 21, a second storage device 22, a third storage device 23, and a cache controller 24.
- FIG. 2 is a diagram for description of a correspondence relation between cache line data stored in the first storage device 21 and data stored in the main memory 4. FIG. 3 is a table for description of a relation between an address in the main memory 4 and each of a tag TG, an index IDX, and a cache line size LSZ. FIG. 4 is a table for description of an example of data stored in the first storage device 21, the second storage device 22, and the third storage device 23.
- The write-back cache device 2 includes cache lines, and one cache line is indicated as one row in FIG. 4. Specifically, a cache line is a storage region provided across the first storage device 21, the second storage device 22, and the third storage device 23 and specified by the index IDX. As illustrated in FIG. 4, the write-back cache device 2 includes a cache line array in which a plurality (in the example illustrated in FIG. 4, i (i is a positive integer)) of cache lines each indicated as one row are arrayed in a column direction.
- The first storage device 21 can store, in one cache line, n (n is an integer equal to or larger than two) pieces of unit data UD as a unit of writing to the main memory 4. The unit data UD is, for example, word data (for example, 16-bit data) or double word data, but is not limited to these kinds of data and may be data of any number of bits. Cache line data is the n pieces of unit data (unit data UD0 to UD(n−1) illustrated in FIG. 4) stored in the first storage device 21 for one cache line.
- In FIG. 2, each rectangle partitioned by grid lines represents data having the data size of one piece of cache line data. When the first storage device 21 can store i pieces of cache line data, the main memory 4 having a storage capacity larger than a storage capacity of the first storage device 21 can store data having the data size of (i×j) (j is a positive integer) pieces of cache line data.
- A place in which data is stored in the main memory 4 is specified by an address. The cache controller 24 decodes an address in the main memory 4, which is forwarded together with unit data UD, into the cache line size LSZ at a low level, the index IDX at an intermediate level, and the tag TG at a high level, as illustrated in FIG. 3.
- The tag TG specifies a column number (0 to (j−1)) in the main memory 4 illustrated in FIG. 2. The index IDX specifies a row number (0 to (i−1)) in the main memory 4 illustrated in FIG. 2 and a cache line in the write-back cache device 2. The cache line size LSZ specifies the position (0 to (n−1)) of unit data UD in data having the data size of cache line data.
- The second storage device 22 stores state instruction data SID in each of the plurality of cache lines. The second storage device 22 is configured with, for example, a flip-flop or an SRAM.
- The third storage device 23 stores the tag TG in each of the plurality of cache lines.
- The cache controller 24 is a control circuit configured to control inputting to and outputting from the first storage device 21, the second storage device 22, and the third storage device 23.
- When unit data UD and an address in the main memory 4 are forwarded from the CPU 1 to the write-back cache device 2, the cache controller 24 stores, in the cache line specified by the index IDX in the third storage device 23, the tag TG obtained by decoding the address in the main memory 4. The cache controller 24 stores the unit data UD at the position specified by the cache line size LSZ in the cache line specified by the index IDX in the first storage device 21. The cache controller 24 rewrites, as described later with reference to FIG. 6, the state instruction data SID stored in the cache line specified by the index IDX in the second storage device 22.
- In addition, the cache controller 24 performs the write-back operation to write, to the main memory 4, unit data UD stored in a cache line in the first storage device 21. Then, when cache line data in a cache line has been written to the main memory 4, the cache controller 24 performs control to rewrite the value of the state instruction data SID in the cache line to a value indicating a first state. The first state is a state in which data stored at a first address in the main memory 4 is not different from the n pieces of unit data stored in a first cache line in the first storage device 21, the first cache line corresponding to the first address.
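The address split of FIG. 3 can be modeled in software as follows. This is a sketch under assumed field widths; the choices of three offset bits (for n=8) and four index bits (for i=16) are illustrative, not values fixed by the embodiment.

```python
# Hypothetical decode of a main-memory address per FIG. 3: the low bits
# select a word within the line (LSZ), the middle bits select the cache
# line (IDX), and the remaining high bits form the tag (TG).
LSZ_BITS = 3   # n = 8 words per line  -> 3 offset bits (assumed)
IDX_BITS = 4   # i = 16 cache lines    -> 4 index bits (assumed)

def decode_address(addr):
    lsz = addr & ((1 << LSZ_BITS) - 1)                # word position 0..n-1
    idx = (addr >> LSZ_BITS) & ((1 << IDX_BITS) - 1)  # cache line 0..i-1
    tag = addr >> (LSZ_BITS + IDX_BITS)               # column number 0..j-1
    return tag, idx, lsz

print(decode_address(0b10_0101_110))  # (2, 5, 6)
```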
- FIG. 11 is a table for description of a first comparative example in which one dirty bit DB corresponds to one cache line.
- The dirty bit DB is one-bit data holding information of whether cache line data stored in the same cache line in the first storage device 21 is "dirty" (bit value 1) or "clean" (bit value 0).
- "Dirty" indicates that the data stored at the first address in the main memory 4 is different from the cache line data stored in the first cache line in the first storage device 21, the first cache line corresponding to the first address.
- "Clean" indicates that the data stored at the first address in the main memory 4 is not different from the cache line data stored in the first cache line in the first storage device 21, the first cache line corresponding to the first address.
- With the configuration of the first comparative example, for example, when only one piece of unit data UD in cache line data (n pieces of unit data UD) is yet to be written to the main memory 4, all n pieces of unit data UD are written to the main memory 4 through the write-back operation.
- FIG. 12 is a table for description of a second comparative example in which dirty bits DB correspond to a plurality of respective pieces of unit data UD in one cache line. In FIG. 12, illustration of data stored in the first storage device 21 and the third storage device 23 is omitted, and only data stored in the second storage device 22 is illustrated.
- As illustrated in FIG. 12, when n dirty bits DB are provided for the n pieces, respectively, of unit data UD, it is possible to determine which of the n pieces of unit data UD is "dirty" or "clean". Then, when the write-back operation is performed, only dirty unit data UD is written to the main memory 4, which achieves reduction of the amount of data written to the main memory 4.
- However, in the second comparative example illustrated in FIG. 12, the second storage device 22 needs a storage capacity n times larger than in the first comparative example illustrated in FIG. 11, which leads to an increase of circuit dimensions.
- Research has found that the number of pieces of dirty unit data UD in one cache line at writing to the main memory 4 is one in a large fraction of cache lines. One reason for this research result is thought to be that, when there are a plurality of global variables (variables accessible from every scope in computer programming), these variables are stored and accessed at scattered addresses in the main memory 4.
- Thus, in the present embodiment, the second storage device 22 employs a configuration as illustrated in FIG. 4.
- The second storage device 22 of the present embodiment stores state instruction data SID configured with a plurality of bits in one cache line.
- The state instruction data SID is configured with bits in a number smaller than n, with which it is possible to distinguish a total of (n+2) states: one first state in which all n pieces of unit data UD stored in one cache line are "clean", one second state in which two or more of the n pieces of unit data UD are "dirty", and n sub states each corresponding to a position at which unit data UD is "dirty" in a third state in which only one of the n pieces of unit data UD is "dirty". Accordingly, it is possible to reduce the number of bits of the state instruction data SID stored in the second storage device 22 as compared to the second comparative example illustrated in FIG. 12, and thus reduce the circuit dimensions of the second storage device 22.
- The state instruction data SID is more preferably configured with the minimum number of bits with which the (n+2) states can be distinguished from one another. Accordingly, the circuit dimensions of the second storage device 22 can be minimized.
- FIG. 5 is a table illustrating a specific example of values of the state instruction data SID. FIG. 5 corresponds to an example in which the number n of pieces of unit data UD stored in one cache line is eight.
- When the positions of the eight pieces of unit data UD in the cache line are indicated by numbers 0 to 7, the state instruction data SID is configured with data of four bits, which is the minimum number of bits with which 10 states can be distinguished from one another. With this configuration, the data amount of the state instruction data SID in one cache line is half the data amount of the configuration of FIG. 12 in which eight dirty bits DB are provided.
- However, since any number of bits smaller than eight is still smaller than the number of bits for the configuration of FIG. 12, the state instruction data SID may instead be configured with five to seven bits.
- The state instruction data SID of four bits can have a decimal value of 0 to 15. For example, the value of the state instruction data SID is set to zero in the case of the above-described first state, and the value of the state instruction data SID is set to 15, irrespective of the positions at which two or more pieces of unit data UD are "dirty", in the case of the above-described second state. In the case of the above-described third state, the value of the state instruction data SID is set to one to eight, corresponding to the respective sub states in which the position of the dirty unit data UD is 0 to 7.
- For example, when the values illustrated in FIG. 5 are allocated to the state instruction data SID, it is possible to determine which of the first to third states cache line data is in and to determine, in the case of the third state, the sub state corresponding to the position at which unit data UD is "dirty".
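The FIG. 5 allocation for n=8 can be expressed as a small mapping. The following Python sketch is purely illustrative; the function name and the set-based interface are inventions of this example, not part of the embodiment.

```python
# Hedged sketch of the FIG. 5 encoding for n = 8: value 0 is the first
# (clean) state, values 1..8 are the n sub states of the third state
# (exactly one dirty word, at position value-1), and 15 is the second
# (two or more dirty) state.
SID_CLEAN = 0
SID_MULTI_DIRTY = 15

def sid_for(dirty_positions):
    """Map a set of dirty word positions (0..7) to a 4-bit SID value."""
    if not dirty_positions:
        return SID_CLEAN
    if len(dirty_positions) == 1:
        return next(iter(dirty_positions)) + 1  # third state, sub state
    return SID_MULTI_DIRTY                      # second state

print(sid_for(set()))   # 0
print(sid_for({7}))     # 8
print(sid_for({1, 4}))  # 15
```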
- FIG. 6 is a flowchart illustrating processing performed by the write-back cache device 2 when there is an input access.
- The processing illustrated in FIG. 6 is executed when the cache controller 24 receives, as an input access from the CPU 1, a write request including an address in the main memory 4 (refer to FIG. 3) and unit data UD.
- The cache controller 24 decodes the address in the main memory 4 to extract a tag TG, an index IDX, and a cache line size LSZ, and reads, from the second storage device 22, the state instruction data SID of the cache line specified by the index IDX (step S1).
- The cache controller 24 determines which of the first state, the second state, and the third state is indicated by the value of the read state instruction data SID (step S2).
- When it is determined at step S2 that the first state is indicated, the cache controller 24 rewrites the state instruction data SID of the cache line to the value of the sub state corresponding to the position (the position indicated by the cache line size LSZ extracted at step S1) of the unit data UD to be newly written in the third state (step S3).
- When it is determined at step S2 that the second state is indicated, the cache controller 24 does not change the state instruction data SID of the cache line because no change from the second state occurs even when new unit data UD is written to the cache line (step S4).
- When it is determined at step S2 that the third state is indicated, the cache controller 24 determines whether the position of the unit data UD to be newly written matches the position of the unit data UD that is already "dirty" (step S5).
- When it is determined at step S5 that the position of the unit data UD to be newly written matches the position of the unit data UD that is already "dirty", the cache controller 24 proceeds to step S4 described above and does not change the state instruction data SID of the cache line.
- When it is determined at step S5 that the position of the unit data UD to be newly written does not match the position of the unit data UD that is already "dirty", the cache controller 24 rewrites the value of the state instruction data SID of the cache line to the value ("15" in the example illustrated in FIG. 5) indicating the second state (step S6).
- Having performed the processing at step S3, S4, or S6, the cache controller 24 writes the unit data UD at the corresponding unit data UD position in the cache line in the first storage device 21, and writes the tag TG extracted at step S1 in the cache line in the third storage device 23 (step S7).
- Having performed the processing at step S7, the cache controller 24 ends the processing illustrated in FIG. 6.
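The state transition of steps S1 to S7 can be summarized as a pure function over the FIG. 5 values for n=8. This is a hypothetical software model of the hardware flow; the function and constant names are inventions of this example.

```python
# Sketch of steps S1-S7 in FIG. 6 as a state-transition function.
CLEAN, MULTI = 0, 15  # first- and second-state values from FIG. 5

def next_sid_on_write(sid, write_pos):
    """Return the SID value after unit data is written at write_pos (0..7)."""
    if sid == CLEAN:              # step S3: first state -> third state
        return write_pos + 1
    if sid == MULTI:              # step S4: stays in the second state
        return sid
    # third state (sid is 1..8): compare positions (step S5)
    if sid - 1 == write_pos:      # same word rewritten -> unchanged (S4)
        return sid
    return MULTI                  # a different word becomes dirty (S6)

print(next_sid_on_write(0, 3))   # 4  (clean line, word 3 becomes dirty)
print(next_sid_on_write(4, 3))   # 4  (same dirty word rewritten)
print(next_sid_on_write(4, 5))   # 15 (second dirty word appears)
```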
- FIG. 7 is a flowchart illustrating processing of writing from the write-back cache device 2 to the main memory 4. The write-back processing illustrated in FIG. 7 is performed in the background in a duration in which no access to the main memory 4 is performed. The write-back processing illustrated in FIG. 7 is also performed when a new cache line is to be ensured in the write-back cache device 2 in response to a write request from the CPU 1 but there is no unused cache line.
- Before writing unit data UD in a cache line in the first storage device 21 to the main memory 4, the cache controller 24 reads the state instruction data SID of the cache line from the second storage device 22 (step S11).
- The cache controller 24 determines which of the first state, the second state, and the third state is indicated by the value of the read state instruction data SID (step S12).
- When it is determined at step S12 that the first state is indicated, the cache controller 24 writes none of the n pieces of unit data UD in the cache line to the main memory 4 (step S13) and does not change the value of the state instruction data SID of the cache line (step S14).
- When it is determined at step S12 that the third state is indicated, the cache controller 24 writes only the one piece of unit data UD at the position indicated by the state instruction data SID in the cache line to the main memory 4 (step S15).
- When it is determined at step S12 that the second state is indicated, the cache controller 24 writes all n pieces of unit data UD in the cache line to the main memory 4 (step S16).
- Having performed the processing at step S15 or S16, the cache controller 24 rewrites the value of the state instruction data SID of the cache line to the value ("0" in the example illustrated in FIG. 5) indicating the first state (step S17).
- Having performed the processing at step S14 or S17, the cache controller 24 ends the processing illustrated in FIG. 7.
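The write-back decision of steps S11 to S17 admits a similar sketch. The write_word and write_line callables are stand-ins for the actual memory transfers and, like the rest of this model, are assumptions of the illustration rather than part of the embodiment.

```python
# Sketch of the write-back decision in FIG. 7 (steps S11-S17) for n = 8.
CLEAN, MULTI = 0, 15  # first- and second-state values from FIG. 5

def write_back_line(sid, line_words, write_word, write_line):
    """Flush one cache line to main memory; return the new SID value."""
    if sid == CLEAN:                   # S13/S14: nothing to write
        return CLEAN
    if sid == MULTI:                   # S16: write all n words
        write_line(line_words)
    else:                              # S15: third state, one dirty word
        pos = sid - 1
        write_word(pos, line_words[pos])
    return CLEAN                       # S17: the line is clean again

written = []
sid = write_back_line(5, list("abcdefgh"),
                      lambda p, w: written.append((p, w)),
                      lambda ws: written.extend(enumerate(ws)))
print(sid, written)  # 0 [(4, 'e')]
```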
- FIG. 8 is a table for description of effects of the write-back cache device 2.
- FIG. 8 illustrates a result of comparison of the amount of data (the number of pieces of unit data UD) written to the main memory 4 among Configuration A of the first comparative example (FIG. 11), in which one dirty bit DB corresponds to one cache line, Configuration B of the present embodiment (FIGS. 4 and 5), in which state instruction data SID of four bits corresponds to one cache line, and Configuration C of the second comparative example (FIG. 12), in which n dirty bits DB correspond to the n pieces (in this example, n=8), respectively, of unit data UD in one cache line.
- Note that two benchmark schemes are used to obtain the comparison results. With a first benchmark scheme, the fraction in which the number of pieces of dirty unit data UD at writing to the main memory 4 is one is equal to or larger than 75%. With a second benchmark scheme, the fraction in which the number of pieces of dirty unit data UD at writing to the main memory 4 is one is approximately 30%. For each benchmark scheme, a parameter such as the cache capacity is tuned so that the cache hit rate is approximately 90%.
- With the first benchmark scheme, when the number of pieces of written unit data UD of Configuration A is normalized to one, the number of pieces of written unit data UD of Configuration B is approximately 0.33, and that of Configuration C is approximately 0.21. Thus, with Configuration B of the present embodiment, the amount of data written to the main memory 4 can be reduced by a little under 70% as compared to Configuration A, and effects relatively close to those of Configuration C are obtained.
- With the second benchmark scheme, when the number of pieces of written unit data UD of Configuration A is normalized to one, the number of pieces of written unit data UD of Configuration B is approximately 0.74, and that of Configuration C is approximately 0.62. With the second benchmark scheme, the fraction in which the number of pieces of dirty unit data UD at writing to the main memory 4 is equal to or larger than two is higher than in the first benchmark scheme. Accordingly, the number of pieces of written unit data UD is larger than with the first benchmark scheme for each of Configurations B and C, but still, effects relatively close to those of Configuration C are obtained with Configuration B of the present embodiment.
- Note that the above description is made on an example in which the number n of pieces of unit data UD stored in one cache line is eight, but the configuration of the present embodiment is also applicable to a configuration of n=2 or larger, and it is particularly effectively applicable to a configuration of n=3 or larger.
- In a case of n=16 as an example of n≠8, in the present embodiment, the state instruction data SID can be configured with five bits, which is the minimum number of bits with which (n+2)=18 states can be distinguished from one another. However, 16 bits are needed for the state instruction data SID with the configuration of FIG. 12, and thus the configuration of the present embodiment only needs a data amount less than ⅓ of the data amount with the configuration of FIG. 12. Accordingly, the effect of reducing the circuit dimensions of the second storage device 22 with the configuration of the present embodiment, when compared with the configuration of FIG. 12, increases as n increases.
- According to the first embodiment as described above, the state instruction data SID is configured with bits in a number smaller than n, with which a total of (n+2) states (one first state, one second state, and n sub states in the third state) can be distinguished from one another, based on the research result that the fraction in which the number of pieces of dirty unit data UD in one cache line is one is high at writing to the main memory 4. It is thus possible to reduce the amount of data written to the main memory 4 and to reduce the circuit dimensions of the second storage device 22.
- In particular, when the second storage device 22 is configured with a flip-flop, the chip area for storing one-bit data is larger than the chip area for storing one-bit data when the second storage device 22 is configured with an SRAM. Thus, the chip area can be effectively reduced by using the configuration of the present embodiment, with which the number of bits of state instruction data SID stored in one cache line in the second storage device 22 is reduced as compared to the configuration illustrated in FIG. 12.
- In this case, it is possible to further reduce the circuit dimensions of the second storage device 22 and the chip area by configuring the state instruction data SID with the minimum number of bits with which the (n+2) states can be distinguished from one another.
- Since the state instruction data SID is read when unit data UD is to be written to a cache line, and the state instruction data SID after the unit data UD is written to the cache line is set in accordance with the read state instruction data SID, which of the n pieces of unit data UD in the cache line is "dirty" can be accurately reflected in the state instruction data SID.
- When the state instruction data SID has the value indicating the first state, none of the n pieces of unit data UD in the cache line are written to the main memory 4 and the state instruction data SID is not changed, and thus the amount of data written to the main memory 4 can be zero.
- When the state instruction data SID has the value indicating the third state, only the dirty unit data UD among the n pieces of unit data UD in the cache line is written to the main memory 4 and the value of the state instruction data SID is rewritten to the value indicating the first state, and thus the amount of data written to the main memory 4 can be reduced to a minimum.
- A second embodiment will be described below with reference to
FIGS. 9 and 10 . In the second embodiment, a part same as a part of the first embodiment is, for example, denoted by the same reference sign, description of the part is omitted as appropriate, and any different point will be mainly described below. - The
second storage device 22 in the second embodiment is configured with an SRAM or the like that cannot perform reading and writing in the same operation cycle. - With the configuration of the second comparative example illustrated in
FIG. 12 , processing of reading a dirty bit DB from thesecond storage device 22 before a dirty bit DB is written is unnecessary, and it is only needed to rewrite, to “dirty”, a dirty bit DB corresponding to a position at which unit data UD is written. - However, with a configuration of state instruction data SID as illustrated in
FIG. 5 , bifurcation based on whether processing at step S3, S4, or S6 is to be performed occurs in accordance with results of determination on state instruction data SID at steps S2 and S5 as illustrated inFIG. 6 . Thus, it is needed to read already written state instruction data SID before writing state instruction data SID to thesecond storage device 22. - Accordingly, stall occurs when input access that requests writing of unit data UD is performed in consecutive operation cycles in a case in which the
second storage device 22 is configured with an SRAM. This will be described below with reference toFIG. 13 .FIG. 13 is a timing chart illustrating a third comparative example in which stall occurs due to consecutive request for writing unit data UD. Note that each interval partitioned by dotted lines inFIGS. 10 and 13 indicates one operation cycle of the write-backcache device 2. - In a first operation cycle, when there is a first input access (W0) that requests writing of unit data UD, the tag TG and the state instruction data SID of a cache line specified by an index IDX extracted from an address in the
main memory 4 are read in the same operation cycle (R0). - In a second operation cycle following the first operation cycle, the unit data UD is written at a corresponding position of the unit data UD in the cache line in the first storage device 21 (W0). State instruction data SID corrected as necessary based on the state instruction data SID read at (R0) is written to the second storage device 22 (W0).
- When second input access (W1) that requests writing of unit data UD is performed in the second operation cycle, since the state instruction data SID is being written to the second storage device 22 (W0), state instruction data SID cannot be read from the
second storage device 22 in the second operation cycle (R1). - Accordingly, stall occurs and the
cache controller 24 receives the second input access (W1) in a third operation cycle following the second operation cycle. - As a result, a tag TG and state instruction data SID are read in the third operation cycle (R1), the unit data UD is written at a corresponding position of the unit data UD in the cache line in a fourth operation cycle following the third operation cycle (W1) and state instruction data SID is written to the second storage device 22 (W1) as described above for the first and second operation cycles.
- In this manner, when stall occurs, input access cannot be handled in consecutive operation cycles and delay occurs, which degrades performance of the write-back
cache device 2. - Processing of the present embodiment performed to address such a problem will be described below with reference to
FIG. 9 .FIG. 9 is a flowchart illustrating processing performed by the write-backcache device 2 when there is an input access. - When having received input access, the
cache controller 24 determines whether the input access is performed in consecutive operation cycles based on whether the input access is also performed in a previous operation cycle (step S21). - When it is determined that the input access is not performed in consecutive operation cycles (NO), operation of steps S1 to S7 described with reference to
FIG. 6 is performed. - When it is determined at step S21 that the input access is performed in consecutive operation cycles (YES), the
cache controller 24 proceeds to step S6 and rewrites the value of the state instruction data SID of the corresponding cache line to the value indicating the second state. A specific example of a positive determination at step S21 is a case in which input access (W0) is performed in the first operation cycle and input access (W1) is performed in the second operation cycle, as illustrated in FIG. 10. - Note that consecutive requests for writing of unit data UD are often performed to nearby addresses in the
main memory 4. Accordingly, the second of two consecutive input accesses is highly likely to be performed on the same cache line as the first input access, and that cache line is highly likely to be in the second state. Thus, when the cache controller 24 proceeds to step S6 in the case of a positive determination at step S21, the amount of data written to the main memory 4 is unlikely to increase compared to the case in which the processing at step S3 is performed. - Thereafter, having performed the processing at step S7, the
cache controller 24 ends the processing illustrated in FIG. 9. -
FIG. 10 is a timing chart illustrating an example in which stall occurrence is avoided by not reading state instruction data SID when there is a consecutive request for writing of unit data UD in the write-back cache device 2 according to the present embodiment. - Operation in the first operation cycle is the same as operation in the first operation cycle in the example described with reference to
FIG. 13. Specifically, when there is a first input access in the first operation cycle, the cache controller 24 reads, in the first operation cycle, a tag TG and state instruction data SID related to the cache line on which the first input access is performed (R0). - In a second operation cycle following the first operation cycle, the unit data UD is written at the corresponding position of the unit data UD in the corresponding cache line in the first storage device 21 (W0). State instruction data SID corrected as necessary based on the state instruction data SID read at (R0) is written to the second storage device 22 (W0).
- In addition, in the second operation cycle, a second input access (W1) that requests writing of unit data UD is performed, but reading of state instruction data SID in the second operation cycle is omitted (step S1 is skipped in the case of YES at step S21), and thus the second input access (W1) is received and a tag TG is read (R1).
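The step-S21 branch described here can be sketched as follows. This is a minimal model under assumed names (`SECOND_STATE`, the dictionary-based SID store, and the return labels are all illustrative, not the patent's implementation):

```python
SECOND_STATE = 2  # assumed encoding of the "second state" (whole line dirty)

class CacheController:
    """Minimal model of the FIG. 9 decision flow; names are illustrative."""

    def __init__(self):
        self.sid = {}                 # cache-line index -> state instruction data
        self.last_access_cycle = None

    def on_write(self, cycle, line):
        # Step S21: was an input access also received in the previous cycle?
        consecutive = (self.last_access_cycle == cycle - 1)
        self.last_access_cycle = cycle
        if consecutive:
            # YES: skip the SID read (step S1) -- the SID store is busy with
            # the previous access's write-back -- and go straight to step S6,
            # marking the whole line with the second state. No read, no stall.
            self.sid[line] = SECOND_STATE
            return "skip-read"
        # NO: normal path -- read the SID and correct it as necessary
        # (steps S1 to S7, collapsed here into a single update).
        self.sid[line] = SECOND_STATE
        return "read"

ctrl = CacheController()
print(ctrl.on_write(1, line=5))  # → read
print(ctrl.on_write(2, line=5))  # → skip-read (consecutive: S1 skipped, S6 taken)
```

The design trade-off matches the text: forcing the second state may mark more of the line dirty than step S3 would, but because consecutive writes usually target the same line, the extra write-back traffic to the main memory 4 is expected to be negligible.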
- In a third operation cycle following the second operation cycle, the unit data UD is written at the corresponding position of the unit data UD in the corresponding cache line in the first storage device 21 (W1) and state instruction data SID is written to the second storage device 22 (W1), as described above for the fourth operation cycle illustrated in
FIG. 13. However, the state instruction data SID written to the second storage device 22 at (W1) has the value indicating the second state (since the cache controller 24 proceeds to step S6 in the case of YES at step S21). - In this manner, when there is a second input access in the second operation cycle following the first input access in the first operation cycle, the
cache controller 24 does not read state instruction data SID in the second operation cycle, but instead performs, in the third operation cycle following the second operation cycle, processing of rewriting the value of the state instruction data SID related to the cache line on which the second input access is performed to the value indicating the second state. - According to the second embodiment thus configured, substantially the same effects as those of the above-described first embodiment are obtained. In addition, when there are consecutive input accesses in the first and second operation cycles, state instruction data SID is not read in the second operation cycle; instead, the value of the state instruction data SID is rewritten to the value indicating the second state in the third operation cycle. Thus, it is possible to avoid stall occurrence when the
second storage device 22 is configured with an SRAM or the like and maintain fast operation of the write-back cache device. - While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel devices described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the devices described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Claims (8)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2020-152700 | 2020-09-11 | ||
JP2020152700A JP7350699B2 (en) | 2020-09-11 | 2020-09-11 | write-back cache device |
Publications (2)
Publication Number | Publication Date |
---|---|
US20220083474A1 true US20220083474A1 (en) | 2022-03-17 |
US11294821B1 US11294821B1 (en) | 2022-04-05 |
Family
ID=80626655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/186,192 Active US11294821B1 (en) | 2020-09-11 | 2021-02-26 | Write-back cache device |
Country Status (2)
Country | Link |
---|---|
US (1) | US11294821B1 (en) |
JP (1) | JP7350699B2 (en) |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH077357B2 (en) * | 1989-10-19 | 1995-01-30 | 工業技術院長 | Buffer control method |
JP3306901B2 (en) * | 1992-05-11 | 2002-07-24 | 松下電器産業株式会社 | Cache memory |
JPH0962578A (en) * | 1995-08-22 | 1997-03-07 | Canon Inc | Information processor and its control method |
JP3013781B2 (en) | 1996-07-11 | 2000-02-28 | 日本電気株式会社 | Cash system |
JP3204295B2 (en) * | 1997-03-31 | 2001-09-04 | 日本電気株式会社 | Cache memory system |
DE69924939T2 (en) | 1998-09-01 | 2006-03-09 | Texas Instruments Inc., Dallas | Improved memory hierarchy for processors and coherence protocol for this |
JP4434534B2 (en) | 2001-09-27 | 2010-03-17 | 株式会社東芝 | Processor system |
US8429386B2 (en) * | 2009-06-30 | 2013-04-23 | Oracle America, Inc. | Dynamic tag allocation in a multithreaded out-of-order processor |
JP2012033047A (en) | 2010-07-30 | 2012-02-16 | Toshiba Corp | Information processor, memory management device, memory management method and program |
JPWO2012102002A1 (en) | 2011-01-24 | 2014-06-30 | パナソニック株式会社 | Virtual computer system, virtual computer control method, virtual computer control program, recording medium, and integrated circuit |
US9342461B2 (en) | 2012-11-28 | 2016-05-17 | Qualcomm Incorporated | Cache memory system and method using dynamically allocated dirty mask space |
CN105144120B (en) * | 2013-03-28 | 2018-10-23 | 慧与发展有限责任合伙企业 | The data from cache line are stored to main memory based on storage address |
JP6711121B2 (en) | 2016-05-10 | 2020-06-17 | 富士通株式会社 | Information processing apparatus, cache memory control method, and cache memory control program |
JP7139719B2 (en) * | 2018-06-26 | 2022-09-21 | 富士通株式会社 | Information processing device, arithmetic processing device, and control method for information processing device |
- 2020
  - 2020-09-11 JP JP2020152700A patent/JP7350699B2/en active Active
- 2021
  - 2021-02-26 US US17/186,192 patent/US11294821B1/en active Active
Also Published As
Publication number | Publication date |
---|---|
JP7350699B2 (en) | 2023-09-26 |
US11294821B1 (en) | 2022-04-05 |
JP2022047008A (en) | 2022-03-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
AS | Assignment |
Owner name: TOSHIBA ELECTRONIC DEVICES & STORAGE CORPORATION, JAPAN
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SAKAMOTO, NOBUAKI;REEL/FRAME:057835/0948
Effective date: 20210929
Owner name: KABUSHIKI KAISHA TOSHIBA, JAPAN
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SAKAMOTO, NOBUAKI;REEL/FRAME:057835/0948
Effective date: 20210929
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |