KR101442494B1 - Control method of sequential selective word reading drowsy cache with word filter - Google Patents
- Publication number
- KR101442494B1 (application KR1020130058371A)
- Authority
- KR
- South Korea
- Prior art keywords
- cache
- word
- hit
- filter
- array
- Prior art date
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/325—Power saving in peripheral device
- G06F1/3275—Power saving in memory, e.g. RAM, cache
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The present invention relates to a method of controlling a sequential selective word reading (SWR) drowsy cache using a word filter. The cache comprises a word filter cache whose storage unit is one word, a tag array, and an SWR data array, with a sequential caching scheme and a drowsy caching scheme applied to the tag array and the SWR data array. The control method comprises: concurrently performing, when a memory request is delivered, request delivery to the word filter cache, request delivery to the tag array, and wake-up signal delivery to the SWR data array; determining whether the word filter cache is hit; if the word filter cache is not hit, determining whether the tag array is hit; and if the tag array is not hit, transmitting the request to the lower storage device. When the SWR data array receives the wake-up signal, it determines whether it is in drowsy mode and, if so, performs a wake-up operation.
Description
The present invention relates to a method of controlling a sequential selective word reading drowsy cache using a word filter. To reduce the power consumption of the cache, the filter cache, the sequential cache, and the drowsy cache techniques are fused, reducing both the dynamic and static power consumption of the cache while minimizing the overhead, thereby maximizing cache performance.
Cache memory was introduced to overcome the speed difference between the processor and memory. By exploiting temporal and spatial locality, it reduces memory access time and greatly improves overall system performance. Because of this performance benefit, cache memory is increasingly used even in embedded processors with limited resources. However, as cache capacity grows and cache structures become more complex, the cache's share of the power consumption of the entire chip is also increasing.
The power consumption of cache memory can be divided into static consumption and dynamic consumption. Static consumption refers to the energy dissipated by the small leakage current flowing through each transistor cell of the cache SRAM, and dynamic consumption refers to the energy consumed when the transistors switch.
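For illustration only (this sketch and all of its constants are assumptions, not part of the patent disclosure), the two consumption components above can be captured in a toy energy model: the static term accrues every cycle regardless of activity, while the dynamic term accrues only per access.

```python
# Hypothetical energy model: total cache energy splits into a static
# (leakage) term that accrues every cycle and a dynamic (switching)
# term that accrues per access. All numbers are illustrative only.
def cache_energy(cycles, accesses,
                 leakage_per_cycle=0.05,   # assumed nJ leaked per cycle
                 energy_per_access=2.0):   # assumed nJ per access
    static = cycles * leakage_per_cycle
    dynamic = accesses * energy_per_access
    return static, dynamic

static_e, dynamic_e = cache_energy(cycles=1_000_000, accesses=300_000)
# Static energy grows with runtime even when the cache is idle;
# dynamic energy grows only with the number of accesses.
```

This separation is why the techniques below differ: the filter and sequential caches attack the per-access (dynamic) term, while the drowsy cache attacks the per-cycle (static) term.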
Various attempts have been made to reduce the power consumption of the cache memory as follows.
A typical structure for reducing dynamic consumption is the filter cache, which reduces the dynamic power consumption of the L1 cache by placing an additional storage device, corresponding to an L0 cache, between the processor registers and the L1 cache. The filter cache is much smaller than the L1 cache, so a hit in the filter cache avoids the L1 cache's dynamic power. However, since the filter cache lies on the critical path, performance degrades on a filter cache miss. In particular, for a data cache, which exhibits relatively weak temporal/spatial locality, the benefit of the filter cache is small because its hit rate is low.
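As an illustrative sketch of the filter-cache behavior described above (the class, sizes, and eviction policy are assumptions for exposition, not taken from the disclosure): a hit in the tiny L0 structure skips the L1 access entirely, while a miss pays both lookups.

```python
# Sketch of a filter (L0) cache placed in front of L1: a hit in the
# tiny filter avoids the more expensive L1 access, but a miss adds
# the filter lookup to the critical path. Structures are illustrative.
class FilterCache:
    def __init__(self, entries=16):
        self.entries = entries
        self.lines = {}                  # tag -> data

    def access(self, tag, l1_lookup):
        if tag in self.lines:            # filter hit: L1 untouched
            return self.lines[tag], 'filter_hit'
        data = l1_lookup(tag)            # filter miss: pay L1 cost too
        if len(self.lines) >= self.entries:
            self.lines.pop(next(iter(self.lines)))  # naive FIFO evict
        self.lines[tag] = data
        return data, 'filter_miss'

l1 = {0x10: 'A', 0x20: 'B'}              # stand-in for the L1 cache
fc = FilterCache()
_, first = fc.access(0x10, l1.get)       # miss: fetched from L1
_, second = fc.access(0x10, l1.get)      # hit: L1 access avoided
```

The energy trade-off falls directly out of this flow: every `'filter_hit'` saves one L1 access, while every `'filter_miss'` adds the filter lookup on top of the normal path.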
In addition, the phased-access cache (sequential cache) separates the tag array and the data array of the cache to reduce dynamic power consumption. In an n-way set-associative cache, the phased cache first operates the tag array, which has relatively small operating power, to determine the hit way, and then operates only the data array of that way. Thus, on a hit, only 1/n of the data-array dynamic power is used compared with a conventional cache model. However, since it lies on the request path like the filter cache, performance decreases even on a hit, trading a performance decrease for the dynamic power reduction.
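The two-phase access just described can be sketched as follows (the set contents are assumptions for exposition): all n tags are probed first, and only the data array of the matching way is then read.

```python
# Sketch of phased (sequential) access in an n-way set-associative
# cache: probe all n tags first, then read only the data array of the
# hit way, so a hit costs 1/n of the data-array dynamic energy at the
# price of an extra serialized step. All structures are illustrative.
def phased_access(tags, data_ways, req_tag):
    data_reads = 0
    for way, tag in enumerate(tags):      # phase 1: tag array only
        if tag == req_tag:
            data_reads += 1               # phase 2: one way's data
            return data_ways[way], data_reads
    return None, data_reads               # miss: no data array read

tags = [0x1, 0x2, 0x3, 0x4]               # assumed 4-way set
data = ['w0', 'w1', 'w2', 'w3']
value, reads = phased_access(tags, data, 0x3)
# A conventional cache would drive all 4 data ways in parallel;
# here only 1 of the 4 is read on a hit.
```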
Meanwhile, the drowsy cache is a technique for reducing static consumption. The drowsy cache has a normal mode, in which a normal voltage is supplied to the cache line, and a drowsy mode, in which a low voltage is supplied to the cache line to reduce leakage power while still retaining the stored data. When data in a cache line in drowsy mode is requested, the line must first undergo a wake-up operation that raises it back to the normal voltage before it can be accessed. This wake-up operation consumes additional cycles, which has the drawback of degrading performance.
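The drowsy-mode timing penalty can be sketched like this (the cycle counts and class are illustrative assumptions): an access to a drowsy line pays an extra wake-up cycle before the data is usable, while an already-awake line does not.

```python
# Sketch of a drowsy cache line: lines idle in a low-voltage drowsy
# mode that retains data; an access to a drowsy line first pays a
# wake-up cycle to restore normal voltage. Details are illustrative.
NORMAL, DROWSY = 'N', 'D'

class DrowsyLine:
    def __init__(self, data):
        self.data = data
        self.mode = DROWSY        # lines idle in low-power mode

    def read(self):
        cycles = 0
        if self.mode == DROWSY:   # wake-up before the data is usable
            self.mode = NORMAL
            cycles += 1           # the extra wake-up cycle
        cycles += 1               # the access itself
        return self.data, cycles

line = DrowsyLine('payload')
_, first_cost = line.read()    # drowsy: wake-up + access = 2 cycles
_, second_cost = line.read()   # already awake: 1 cycle
```

This is exactly the overhead the invention later hides by sending the wake-up signal concurrently with the tag and word-filter lookups.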
As such, conventional techniques for reducing cache power consumption share the limitation of sacrificing performance to reduce dynamic and static consumption. Moreover, these techniques have not been optimized to benefit from one another.
In the art, there is a need to apply the filter cache, the sequential cache, and the drowsy cache, the conventional techniques for reducing cache power consumption, together in a way that maximizes the advantages of each technique while minimizing the overhead, thereby maximizing cache performance.
According to an aspect of the present invention, there is provided a method for controlling a sequential selective word reading (SWR) drowsy cache using a word filter. The cache comprises a word filter cache whose storage unit is one word, a tag array, and an SWR data array, with a sequential caching scheme and a drowsy caching scheme applied to the tag array and the SWR data array. The method comprises: concurrently performing, when a memory request is delivered, request delivery to the word filter cache, request delivery to the tag array, and wake-up signal delivery to the SWR data array; determining whether the word filter cache is hit; if the word filter cache is not hit, determining whether the tag array is hit; and if the tag array is not hit, transmitting the request to the lower storage device. When the SWR data array receives the wake-up signal, it determines whether it is in drowsy mode and, if so, performs a wake-up operation.
In addition, the above-described means for solving the problem do not enumerate all the features of the present invention. The various features of the present invention and their advantages and effects will be understood more fully by reference to the specific embodiments below.
By fusing the filter cache, the sequential cache, and the drowsy cache, the conventional techniques for reducing cache power consumption, into the proposed selective word reading cache, there can be provided a control method of a sequential selective word reading drowsy cache using a word filter that reduces both the dynamic and static power consumption of the cache while minimizing the overhead, thereby maximizing cache performance.
FIG. 1 is a diagram showing the structure of a conventional L1 reference cache;
FIG. 2 is a diagram showing a structure in which a conventional filter cache is applied to the L1 reference cache shown in FIG. 1;
FIG. 3 is a diagram showing the structure of a conventional sequential cache;
FIG. 4 is a diagram showing the structure of a conventional drowsy cache;
FIG. 5 is a diagram showing the structure of a cache implemented by combining a conventional filter cache, a sequential cache, and a drowsy cache;
FIG. 6 is a memory request flowchart of the cache shown in FIG. 5;
FIG. 7 is a diagram illustrating the structure of the selective word reading cache proposed by the present invention;
FIG. 8 is a diagram illustrating the structure of a cache that combines a conventional filter cache, a sequential cache, and a drowsy cache in the selective word reading cache shown in FIG. 7 according to the present invention;
FIG. 9 is a diagram illustrating an index transfer process for the wake-up operation of the cache shown in FIG. 8; and
FIG. 10 is a memory request flowchart of the cache shown in FIG. 8.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily carry out the present invention. In the following description, detailed descriptions of known functions and configurations will be omitted where they might obscure the subject matter of the present invention. Like reference numerals denote like elements throughout the drawings.
In addition, throughout the specification, when a part is said to be 'connected' to another part, this includes not only being 'directly connected' but also being 'indirectly connected' with another element in between. Also, for a part to 'include' an element means that it may further include other elements, rather than excluding them, unless specifically stated otherwise.
Prior to describing the method of controlling a sequential selective word reading drowsy cache using a word filter according to the present invention, the conventional cache structure and the conventional techniques for reducing cache power consumption will be described in detail.
FIG. 1 is a diagram showing the structure of a conventional L1 reference cache.
Referring to FIG. 1, when an
For example, assuming a 32 kB, 4-way cache with a 32 B block size in 45 nm process technology, Table 1 shows various parameters of the conventional L1 reference cache investigated through CACTI.
Assuming a 3 GHz processor, the data array access time corresponds to two cycles and equals the cache access time.
FIG. 2 is a diagram showing a structure in which a conventional filter cache is applied to the L1 reference cache shown in FIG. 1.
Referring to FIG. 2, the
In terms of performance, the L1 reference cache achieves 2.2 to 21.8 depending on the application program, and the filter cache 1.5 to 21.5.
In terms of energy, a hit in the filter cache gains as much as the L1 power consumption, while every other case loses as much as the filter cache power consumption.
FIG. 3 is a diagram showing the structure of a conventional sequential cache.
Referring to FIG. 3, the sequential cache is first accessed to access the
Sequential caches are often used in lower-level memories, which are large in size, high in set associativity, and relatively less time-constrained. In the case of the L1 cache, however, it is difficult to apply the sequential cache without a special algorithm, because the set associativity is small and the timing constraint is tight.
FIG. 4 is a diagram showing the structure of a conventional drowsy cache.
In the drowsy cache, a voltage control device selects between a normal voltage of 1 V and a low-power voltage of 0.3 V and supplies it to the SRAM. In drowsy mode (D mode), in which the low-power voltage is supplied, a set consumes only about 2% of the static energy of the normal-voltage mode (N mode). However, an additional cycle is required for the wake-up process when accessing a set in drowsy mode. Thus, the drowsy cache performs well in applications whose memory requests concentrate on a part of the entire working set.
Running SPEC 2000 on the drowsy cache shown in FIG. 4 yielded a performance reduction of about 5% to 10% and a static power reduction of about 85%. Considering the time delay due to the performance degradation, the net power reduction is observed to be around 80%.
FIG. 5 is a diagram showing the structure of a cache implemented by combining a conventional filter cache, a sequential cache, and a drowsy cache, and FIG. 6 is a memory request flowchart of the cache shown in FIG. 5.
Referring to FIGS. 5 and 6, when a memory request is started, a request is firstly transmitted to the filter cache 550 (S610), and it is determined whether or not the
On the other hand, if not hit in the
On the other hand, when hit in the
It is determined whether or not the
As shown in FIG. 5, when the conventional filter cache, the sequential cache, and the drowsy cache are all combined, in addition to the filtering of L1 accesses by the filter cache, the way filtering of the data array by the sequential cache, and the static power reduction by the drowsy cache, a large power reduction effect can be obtained because the filter cache and sequential access extend the time spent in drowsy mode. However, on a filter cache miss each technique incurs one cycle of loss, three cycles in total, and the cache access time increases to four or five cycles.
Accordingly, the present invention proposes a technique that combines the conventional filter cache, sequential cache, and drowsy cache in a way that preserves their energy reduction mechanisms while minimizing the performance loss.
FIG. 7 is a diagram illustrating the structure of the selective word reading cache proposed by the present invention.
The selective word reading (SWR) cache proposed by the present invention performs word selection in the cache controller during cache access, so that parts of the data array other than the required word can be deactivated to reduce dynamic power consumption.
In the selective word reading cache, access to the tag array, which is relatively small in size and low in dynamic power consumption, proceeds in the same manner as in a conventional cache, while access to the data array is performed using the upper bits of the block offset.
Comparing the structure of the cache shown in FIG. 7 with the prior art, the hardware difference lies in the data output bus portion. The data output bus is used when a read instruction hits in the cache, and the power consumption gain occurs at this point.
Specifically, when a read hit occurs, the shaded portion in FIG. 7 is deactivated; this portion accounts for about 25% of the dynamic power of the L1 cache and about 50% of that of the L2 cache. Since the cache hit ratio generally exceeds 90% and read accesses also make up over 90% of all memory access requests, the selective word reading cache can effectively reduce dynamic power consumption.
To deactivate the shaded portion in FIG. 7, part of the address must be sent with the memory request upon cache access: the upper two bits of the block offset for the L1 cache and the upper one bit of the block offset for the L2 cache.
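The bit selection just described can be sketched as follows, assuming the 32 B block size used earlier (the constants and function are illustrative, not taken from the disclosure): with a 5-bit block offset, its upper two bits pick one of four 8 B word groups, and the other three groups can stay deactivated on a read hit.

```python
# Sketch of selecting the word group to activate from the block
# offset: with an assumed 32-byte block, the offset is 5 bits, and
# its upper two bits select one of four 8-byte word groups.
BLOCK_SIZE = 32          # assumed bytes per block
OFFSET_BITS = 5          # log2(32)

def word_group(address, select_bits=2):
    offset = address & (BLOCK_SIZE - 1)           # 5-bit block offset
    return offset >> (OFFSET_BITS - select_bits)  # keep upper 2 bits

# Bytes 0-7 -> group 0, 8-15 -> 1, 16-23 -> 2, 24-31 -> 3
groups = [word_group(a) for a in (0x00, 0x0A, 0x17, 0x1F)]
```

For the L2 case described above, passing `select_bits=1` would select one of two halves of the block instead.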
When a cache miss occurs, the data input bus is used. In this case, the cache operates exactly as a conventional cache, and no overhead occurs.
FIG. 8 is a diagram showing the structure of a cache that combines a conventional filter cache, a sequential cache, and a drowsy cache in the selective word reading cache shown in FIG. 7 according to the present invention.
When combining the above-described selective word reading cache technique with the cache shown in FIG. 5, which is implemented by combining the conventional filter cache, sequential cache, and drowsy cache, the storage unit of the filter cache becomes one 8 B word, i.e., a word filter cache is used, as shown in FIG. 8.
Thus, if the storage unit of the filter cache is reduced from 32 B to 8 B, the number of entries can be increased by a factor of four for the same capacity, and more entries can provide a better hit ratio. Conversely, for the same number of entries, the capacity can be reduced to one quarter.
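The trade-off above follows directly from the arithmetic, sketched here under an assumed filter-cache capacity (the 512 B figure is illustrative, not from the disclosure):

```python
# Shrinking the filter-cache storage unit from a 32 B block to an
# 8 B word quadruples the entry count at equal capacity, or quarters
# the capacity at an equal entry count. Capacity value is assumed.
capacity = 512                          # assumed filter-cache bytes
entries_block = capacity // 32          # 32 B entries -> 16 entries
entries_word = capacity // 8            # 8 B entries  -> 64 entries
ratio = entries_word // entries_block   # 4x more entries
```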
FIG. 9 is a diagram illustrating the index transfer process for the wake-up operation of the cache shown in FIG. 8.
The upper two bits of the block offset are additionally transmitted to the SWR data array so that the wake-up operation can be performed on a word-by-word basis.
As described above, when the power mode of the drowsy cache is managed at this finer granularity, the proportion of time spent in drowsy mode increases compared with the conventional technique of managing each block, further reducing power consumption.
FIG. 10 is a memory request flowchart of the cache shown in FIG. 8.
Referring to FIGS. 8 and 10, when a memory request is initiated, a request is forwarded to the word filter cache 850 (S1010), a request is forwarded to the L1 tag array 830 (S1020), and a wake-up signal is forwarded to the L1 SWR data array 840, all at the same time.
It is determined whether or not the
On the other hand, if it is not hit in the
On the other hand, when hit in the
In step S1031, it is determined whether the L1 SWR data array 840 is in drowsy mode, and if so, a wake-up operation is performed.
As such, according to the present invention, the
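The request flow of FIGS. 8 and 10 can be sketched as follows (the component objects here are illustrative stand-ins for the hardware, not part of the disclosure): the wake-up signal is issued concurrently with the word-filter and tag lookups, and the result then falls through word-filter hit, tag hit, and lower-level storage in that order.

```python
# Sketch of the proposed request flow: on a memory request, the word
# filter lookup, the L1 tag lookup, and the wake-up signal to the SWR
# data array are all issued in the same cycle.
def handle_request(addr, word_filter, tag_array, swr_array, lower):
    swr_array.wake_up(addr)            # issued concurrently with lookups
    if word_filter.hit(addr):          # 1) word filter hit: done
        return 'word_filter'
    if tag_array.hit(addr):            # 2) tag hit: read hit word only
        return 'swr_word'
    return lower.fetch(addr)           # 3) miss: go to lower storage

class Stub:                            # illustrative hardware stand-in
    def __init__(self, hits): self.hits = set(hits)
    def hit(self, a): return a in self.hits
    def wake_up(self, a): pass
    def fetch(self, a): return 'lower'

wf, tags, swr, low = Stub({1}), Stub({1, 2}), Stub(()), Stub(())
r1 = handle_request(1, wf, tags, swr, low)  # word-filter hit
r2 = handle_request(2, wf, tags, swr, low)  # tag hit
r3 = handle_request(3, wf, tags, swr, low)  # miss to lower storage
```

Because the wake-up starts in the same cycle as the lookups, the drowsy-mode wake-up latency overlaps the filter and tag checks instead of being serialized after them.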
By fusing the conventional filter cache, the sequential cache, and the drowsy cache into the selective word reading cache as described above, an average dynamic energy reduction of 33.28% can be achieved. In addition, a static power reduction effect is obtained from the drowsy cache, and managing the drowsy cache at the finer word granularity yields a further benefit.
This results in a 73.4% dynamic energy reduction and an 83.2% static energy reduction in the L1 cache, for a total energy saving of 71.7%.
The present invention is not limited to the above-described embodiments and the accompanying drawings. It will be apparent to those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
110, 210, 310, 410, 510, 810: Instruction address
120, 220, 320, 420, 520, 820: TLB
130, 230, 330, 430, 530, 830: L1 tag array
140, 240, 340, 440, 540: L1 data array
250, 550: filter cache
840: L1 SWR data array
850: Word Filter Cache
Claims (5)
Concurrently performing a transfer of a request to the word filter cache, a transfer of a request to the tag array, and a wake-up signal transfer to the SWR data array when a memory request is delivered;
Determining whether the word filter cache has been hit;
Determining, if not hit in the word filter cache, whether the tag is hit in the tag array; And
And forwarding the request to the lower storage device if it is not hit in the tag array,
Wherein the SWR data array receives the wake-up signal, determines whether it is in drowsy mode, and performs a wake-up operation when it is in drowsy mode.
Further comprising, when hit in the tag array, forwarding the request to the hit word of the SWR data array.
Wherein the SWR data array performs the wake-up operation on a word-by-word basis, using a word filter capable of selectively accessing each word, to control the sequential selective word reading drowsy cache.
And a word filter for transferring the upper two bits of the block offset to the SWR data array to perform the wake-up operation on a word-by-word basis.
Wherein, after the SWR data array performs the wake-up operation, a hit occurs in the word filter cache or a miss occurs in the tag array.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130058371A KR101442494B1 (en) | 2013-05-23 | 2013-05-23 | Control method of sequential selective word reading drowsy cache with word filter |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130058371A KR101442494B1 (en) | 2013-05-23 | 2013-05-23 | Control method of sequential selective word reading drowsy cache with word filter |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101442494B1 true KR101442494B1 (en) | 2014-09-26 |
Family
ID=51760643
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020130058371A KR101442494B1 (en) | 2013-05-23 | 2013-05-23 | Control method of sequential selective word reading drowsy cache with word filter |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101442494B1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101831226B1 (en) * | 2015-11-09 | 2018-02-23 | 경북대학교 산학협력단 | Apparatus for controlling cache using next-generation memory and method thereof |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20050095107A (en) * | 2004-03-25 | 2005-09-29 | 삼성전자주식회사 | Cache device and cache control method reducing power consumption |
KR20120063312A (en) * | 2010-12-07 | 2012-06-15 | 전남대학교산학협력단 | Processor system having data filter cache and modified victim cache and driving method thereof |
KR20120101761A (en) * | 2011-03-07 | 2012-09-17 | 삼성전자주식회사 | Cache phase detector and processor core |
-
2013
- 2013-05-23 KR KR1020130058371A patent/KR101442494B1/en active IP Right Grant
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20050095107A (en) * | 2004-03-25 | 2005-09-29 | 삼성전자주식회사 | Cache device and cache control method reducing power consumption |
KR20120063312A (en) * | 2010-12-07 | 2012-06-15 | 전남대학교산학협력단 | Processor system having data filter cache and modified victim cache and driving method thereof |
KR20120101761A (en) * | 2011-03-07 | 2012-09-17 | 삼성전자주식회사 | Cache phase detector and processor core |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101831226B1 (en) * | 2015-11-09 | 2018-02-23 | 경북대학교 산학협력단 | Apparatus for controlling cache using next-generation memory and method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9274592B2 (en) | Technique for preserving cached information during a low power mode | |
TWI454909B (en) | Memory device, method and system to reduce the power consumption of a memory device | |
US20110113198A1 (en) | Selective searching in shared cache | |
US20070011421A1 (en) | Method and system for decreasing power consumption in memory arrays having usage-driven power management | |
EP3602310B1 (en) | Power-conserving cache memory usage | |
US20120284475A1 (en) | Memory On-Demand, Managing Power In Memory | |
US20180336143A1 (en) | Concurrent cache memory access | |
US11221665B2 (en) | Static power reduction in caches using deterministic naps | |
US9990293B2 (en) | Energy-efficient dynamic dram cache sizing via selective refresh of a cache in a dram | |
US9767041B2 (en) | Managing sectored cache | |
US8484418B2 (en) | Methods and apparatuses for idle-prioritized memory ranks | |
US9396122B2 (en) | Cache allocation scheme optimized for browsing applications | |
JP5791529B2 (en) | MEMORY CONTROL DEVICE, CONTROL METHOD, AND INFORMATION PROCESSING DEVICE | |
EP2808758B1 (en) | Reduced Power Mode of a Cache Unit | |
KR101442494B1 (en) | Control method of sequential selective word reading drowsy cache with word filter | |
Hameed et al. | Architecting STT last-level-cache for performance and energy improvement | |
US20180074964A1 (en) | Power aware hash function for cache memory mapping | |
Ryoo et al. | i-mirror: A software managed die-stacked dram-based memory subsystem | |
US20140156941A1 (en) | Tracking Non-Native Content in Caches | |
US20030145171A1 (en) | Simplified cache hierarchy by using multiple tags and entries into a large subdivided array | |
He et al. | Optimizing energy in a DRAM based hybrid cache | |
DeMara et al. | Non-volatile memory trends: Toward improving density and energy profiles across the system stack | |
He et al. | Tcache: An energy-efficient dram cache design | |
DeMara et al. | SPOTLIGHT ON TRANSACTIONS | |
US20130124822A1 (en) | Central processing unit (cpu) architecture and hybrid memory storage system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
GRNT | Written decision to grant | ||
FPAY | Annual fee payment |
Payment date: 20181025 Year of fee payment: 5 |