CN103810118B - STT-MRAM cache design method - Google Patents

STT-MRAM cache design method

Info

Publication number
CN103810118B
Authority
CN
Grant status
Grant
Patent type
Application number
CN 201410072210
Other languages
Chinese (zh)
Other versions
CN103810118A (en )
Inventor
成元庆
郭玮
赵巍胜
张有光
Original Assignee
北京航空航天大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Grant date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G06F12/0897 Caches characterised by their organisation or structure with two or more cache hierarchy levels
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/02 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using magnetic elements
    • G11C11/16 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using magnetic elements using elements in which the storage effect is based on magnetic spin effect
    • G11C11/165 Auxiliary circuits
    • G11C11/1673 Reading or sensing circuits or methods
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C11/00 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C11/02 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using magnetic elements
    • G11C11/16 Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using magnetic elements using elements in which the storage effect is based on magnetic spin effect
    • G11C11/165 Auxiliary circuits
    • G11C11/1675 Writing or programming circuits or methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing
    • Y02D10/10 Reducing energy consumption at the single machine level, e.g. processors, personal computers, peripherals or power supply
    • Y02D10/13 Access, addressing or allocation within memory systems or architectures, e.g. to reduce power consumption or heat production or to increase battery life

Abstract

A novel STT-MRAM cache design method comprising three main steps. The invention exploits the relationship between MTJ size, write current, and write energy to design a novel STT-MRAM cache structure in which the L1 cache and the L2 cache are built from MTJ memory cells of different sizes. Compared with a design that uses MTJ cells of a single size, it reduces write energy and improves performance. Compared with a hybrid SRAM/STT-MRAM structure, it significantly reduces static power consumption. Because the L1 cache is built from small MTJ cells, which occupy less area and give higher storage density, its miss rate is significantly lower than that of an SRAM cache, which improves memory access performance. In addition, the invention uses only the STT-MRAM manufacturing process, which improves chip yield and lowers production cost.

Description

An STT-MRAM cache design method

Technical Field

[0001] The present invention replaces conventional SRAM devices with STT-MRAM storage devices as the on-chip cache and proposes an STT-MRAM cache design method that reduces chip power consumption and improves chip performance. It belongs to the technical field of non-volatile memory design.

Background

[0002] As the integration density of on-chip transistors keeps rising, multi-core processors have come into wide use, for example processors from IBM, Intel's Core series, and Tilera's TILE-Gx series. Although multi-core processors greatly improve computer performance, they need large on-chip caches to overcome the "memory wall" bottleneck. Industry currently uses static random-access memory (SRAM) for on-chip caches almost universally. To reduce processor power, on-chip supply voltages keep dropping. The subthreshold leakage current of SRAM transistors has risen markedly, so the chip's static power consumption grows sharply, and the problem worsens as process dimensions continue to shrink.

[0003] In recent years, researchers have proposed spin-transfer torque magnetic random-access memory (STT-MRAM). Compared with SRAM, this technology has the following advantages:

[0004] 1. STT-MRAM stores data in magnetic tunnel junctions (MTJs) and is non-volatile: data is retained even when power is removed;

[0005] 2. STT-MRAM stores data in a magnetic material rather than as charge, so it has almost no leakage current and very low static power consumption;

[0006] 3. An STT-MRAM cell occupies one quarter of the area of an SRAM cell, so a larger on-chip cache can be integrated in the same area, which can significantly improve system performance.

[0007] Many researchers have therefore proposed replacing SRAM with STT-MRAM for on-chip caches. A simple drop-in replacement, however, does not necessarily improve performance or reduce power. Writing data to an STT-MRAM cell requires a relatively large current (tens to hundreds of microamps) and a relatively long time (typically ten to several tens of nanoseconds), both far higher than for SRAM. The on-chip cache exchanges data with the processor cores more frequently than any other level of memory. If a program writes to the cache frequently during execution, write power and write latency become very large and may cancel out the benefits of STT-MRAM.

[0008] To address this problem, researchers and designers have proposed hybrid storage structures in which SRAM and STT-MRAM coexist. Cache regions that are written frequently are stored in SRAM, and the rest are stored in STT-MRAM. These techniques can improve chip performance to some extent and reduce static power relative to a cache built entirely from SRAM, but they have the following problems.

[0009] First, SRAM and STT-MRAM use different manufacturing processes; integrating them on the same chip lowers chip yield and raises production cost;

[0010] Second, the presence of SRAM increases the chip's static power consumption;

[0011] Third, choosing the respective capacities of STT-MRAM and SRAM is itself a hard problem: different applications and programs have different memory access behavior, so a generally effective solution is difficult to find.

Summary of the Invention

[0012] 1. Purpose of the invention: the invention provides an STT-MRAM cache design method that avoids the static power problem introduced by SRAM while significantly improving processor performance.

[0013] 2. Technical solution: an on-chip multi-core processor generally contains multiple levels of cache. A higher cache level (closer to the core) has a smaller capacity but must be faster to access. The relationship among the write current of an STT-MRAM cell, its write time, and the size of its magnetic tunnel junction can be characterized by the following formula:

[0014] [Equation (1): image CN103810118BD00041, not reproduced in this text]

[0015] where I_c is the write current, write_time is the write time, A is the junction area of the MTJ, J_c0 is a constant that depends on the MTJ material and manufacturing process, and C and γ are also MTJ-related constants. The formula shows that the larger the MTJ area, the larger the write current, and the write energy grows accordingly. At the same time, the relationship between the data retention time of an MTJ and its size can be described by the following formula:

[0016] [Equation (2): image CN103810118BD00042, not reproduced in this text]

[0017] where Δ is the thermal stability factor, which directly determines the data retention time. It is proportional to the MTJ volume V and inversely proportional to the temperature T. H_k and M_s are the anisotropy field and the saturation magnetization, respectively. The formula shows that the larger the MTJ, the longer the data retention time.
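
Equations (1) and (2) appear only as images in the original and are not reproduced in this text. Forms that are consistent with the symbol definitions in the two "where" paragraphs above, and with models commonly used in the STT-MRAM literature, would be the following. They are assumed reconstructions, not the patent's verified formulas; k_B (the Boltzmann constant), τ_0 (an attempt period, often taken to be about 1 ns), and t_retention are symbols introduced here for illustration.

```latex
% Assumed form of Equation (1): write current versus junction area and write time
I_c = A \cdot \left( J_{c0} + \frac{C}{\mathit{write\_time}^{\gamma}} \right)

% Assumed form of Equation (2): thermal stability factor, with an
% assumed exponential dependence of retention time on \Delta
\Delta = \frac{H_k \, M_s \, V}{2 \, k_B \, T},
\qquad
t_{\mathrm{retention}} \approx \tau_0 \, e^{\Delta}
```

Under these assumed forms, I_c grows linearly with A at a fixed write time, and Δ grows linearly with V at a fixed temperature, which matches the two trends the text states.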

[0018] Using these relationships, memory cells built from small MTJs can serve as the first-level cache (L1 cache), and memory cells built from larger MTJs can serve as the next-level cache (L2 cache). Small MTJs write quickly with low write energy, so using them for the frequently written L1 cache reduces static power consumption without significantly degrading system performance. The L2 cache writes more slowly and with higher write energy, but it is written far less often than the L1 cache. Building the L2 cache from larger MTJs exploits their long data retention time: across multiple runs of a program, data is not lost even after a restart or power loss, so it need not be reloaded from main memory, which improves performance.
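
The trade-off between a small L1 MTJ and a larger L2 MTJ can be sketched numerically. This is a minimal sketch, not the patent's model: it assumes the forms I_c = A · (J_c0 + C / write_time^γ), Δ = H_k · M_s · V / (2 · k_B · T), and retention ≈ τ_0 · exp(Δ), and every device number below is hypothetical and chosen only to show the trend.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def write_current(area, write_time, jc0, c, gamma):
    """Assumed Eq. (1): I_c = A * (J_c0 + C / write_time**gamma)."""
    return area * (jc0 + c / write_time ** gamma)


def thermal_factor(hk, ms, volume, temperature):
    """Assumed Eq. (2): Delta = H_k * M_s * V / (2 * k_B * T)."""
    return hk * ms * volume / (2.0 * K_B * temperature)


def retention_time(delta, tau0=1e-9):
    """Assumed retention model: t = tau0 * exp(Delta), tau0 in seconds."""
    return tau0 * math.exp(delta)


# Hypothetical device numbers; hk * ms is treated as an energy density in J/m^3.
THICKNESS = 2e-9        # free-layer thickness, m
SMALL_AREA = 1.0e-15    # L1 MTJ junction area, m^2
LARGE_AREA = 2.0e-15    # L2 MTJ junction area, m^2
PARAMS = dict(write_time=5e-9, jc0=1.0e10, c=50.0, gamma=1.0)

for name, area in (("L1 (small MTJ)", SMALL_AREA), ("L2 (large MTJ)", LARGE_AREA)):
    i_c = write_current(area, **PARAMS)
    delta = thermal_factor(hk=1.0e5, ms=1.0, volume=area * THICKNESS,
                           temperature=300.0)
    print(f"{name}: I_c = {i_c:.3e} A, Delta = {delta:.1f}, "
          f"retention = {retention_time(delta):.3e} s")
```

With these assumptions, doubling the junction area doubles the write current but lengthens retention exponentially, which is the asymmetry the design relies on when it assigns small cells to L1 and large cells to L2.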

[0019] In summary, the STT-MRAM cache design method of the invention has the following steps:

[0020] Step one: during chip manufacturing, use a small-size MTJ process for the L1 STT-MRAM portion of the physical layout, and a normal-size MTJ process for the L2 STT-MRAM portion.

[0021] Step two: while the chip is operating and a memory access instruction executes, the processor issues a physical address to the L1 cache, and the L1 tag entries are compared with the corresponding part of the physical address. On a hit, the L2 cache need not be accessed. For a read, STT-MRAM read energy is close to that of SRAM. For a write, the small STT-MRAM cells of the L1 cache significantly reduce write energy; and because more small MTJs fit in the same area, the L1 cache can have a larger capacity, which lowers the L1 miss rate and improves processor performance.

[0022] Step three: on an L1 cache miss, the data must be copied from the L2 cache into the L1 cache. If the L1 cache is full, some of its data must be evicted. Evicted data that has not been modified need not be copied to the L2 cache; data that has been modified must be written to the L2 cache. Because the L1 cache is relatively large, the processor interacts with the L2 cache far less often, which avoids the performance loss and energy overhead of writing to the larger-size L2 cells.

[0023] 3. Advantages and effects: the invention exploits the relationship between MTJ size, write current, and write energy to design an STT-MRAM cache structure in which the L1 cache and the L2 cache are built from MTJ memory cells of different sizes. Compared with a design that uses MTJ cells of a single size, it reduces write energy and improves performance. Compared with a hybrid SRAM/STT-MRAM structure, it significantly reduces static power consumption. Because the L1 cache is built from small MTJ cells, which occupy less area and give higher storage density, its miss rate is significantly lower than that of an SRAM cache, which improves memory access performance. In addition, the invention uses only the STT-MRAM manufacturing process, which improves chip yield and lowers production cost.

Brief Description of the Drawings

[0024] Fig. 1 is a schematic diagram of a magnetic random-access memory bit cell, which consists of a magnetic tunnel junction (MTJ) in series with an N-type transistor (NMOS). BL is the bit line, SL is the source line, and WL is the word line.

[0025] Fig. 2 is a schematic diagram of the on-chip cache hierarchy of a multi-core processor.

[0026] Fig. 3 is a schematic diagram of the data interaction between the L1 STT-MRAM cache and the L2 STT-MRAM cache.

[0027] Fig. 4 is a flow chart of the invention.

Detailed Description

[0028] The operating principle of the STT-MRAM cell used in the invention is shown in Fig. 1. An STT-MRAM cell generally has a 1T1J structure (one transistor and one MTJ). The transistor controls access to the MTJ. The MTJ consists of a free layer, a reference layer, and an oxide layer between them. The magnetization direction of the reference layer is fixed, while that of the free layer can be changed by driving current through the MTJ in different directions. If the free layer is magnetized in the same direction as the reference layer, the MTJ resistance is low and the cell can be taken to store logic "0"; otherwise it stores logic "1". To read a cell, the word line is asserted and a small voltage of about 0.1 V is applied between the bit line BL and the source line SL; the read current differs depending on whether the free and reference layers are magnetized in the same direction. Comparing this current with a reference current reveals whether the cell stores logic "0" or logic "1". To write, the word line is first asserted and a large voltage (0.7 V to 1.2 V) is applied between the bit line and the source line; the polarity of the voltage, and hence the direction of the resulting spin current, determines whether "0" or "1" is written.
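
The read operation described above can be sketched as a current comparison. This is a simplified illustration only: the resistance values and the midpoint reference current are assumptions, the access transistor's resistance is ignored, and the function name is hypothetical.

```python
def read_bit(mtj_resistance_ohm, read_voltage=0.1,
             r_parallel=2000.0, r_antiparallel=4000.0):
    """Return the stored bit inferred from the read current.

    Parallel magnetization gives low resistance (logic 0) and antiparallel
    gives high resistance (logic 1). The resistances are assumed values.
    """
    i_read = read_voltage / mtj_resistance_ohm
    # Assumed reference: midway between the two possible read currents.
    i_ref = 0.5 * (read_voltage / r_parallel + read_voltage / r_antiparallel)
    return 0 if i_read > i_ref else 1


print(read_bit(2000.0))  # low-resistance (parallel) cell
print(read_bit(4000.0))  # high-resistance (antiparallel) cell
```

A larger read current than the reference means the cell is in the low-resistance state, so the sketch reports logic "0" in that case and logic "1" otherwise.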

[0029] The memory architecture of the multi-core processor used in the invention is shown in Fig. 2. The L1 STT-MRAM cache is contained in each processor core and is a private cache, shown as the hatched region in the figure. The L2 STT-MRAM cache is a shared cache, shown as the grid-patterned region in the figure.

[0030] With reference to Figs. 3 and 4, the STT-MRAM cache design method of the invention has the following steps (the interaction between the CPU, the LLC, and main memory is similar and is not repeated here):

[0031] Step one: during chip manufacturing, use a small-size MTJ process, for example 19F², for the L1 STT-MRAM portion of the physical layout, where F is the feature size of the transistor. Use a normal-size MTJ process, for example 32F², for the L2 STT-MRAM portion.
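
As a rough illustration of what the two example cell sizes mean for capacity, the sketch below counts how many cells fit in a fixed area. It assumes the cell areas are 19F² and 32F², a hypothetical feature size of 45 nm, and no overhead for peripheral circuits; the function name is illustrative.

```python
def cells_per_area(area_um2, cell_area_f2, feature_nm):
    """Number of cells that fit in area_um2, ignoring peripheral circuitry."""
    feature_um = feature_nm / 1000.0
    return area_um2 / (cell_area_f2 * feature_um ** 2)


l1_density = cells_per_area(1.0, 19, 45)   # small-MTJ cells per square micron
l2_density = cells_per_area(1.0, 32, 45)   # normal-size cells per square micron
print(f"density ratio L1/L2 = {l1_density / l2_density:.3f}")
```

The ratio depends only on the cell areas (32/19, about 1.68), not on the assumed feature size, so under these assumptions the same silicon area holds roughly two thirds more of the small cells.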

[0032] Step two: while the chip is operating and a memory access instruction executes, the processor issues a physical address to the L1 cache, and the L1 tag entries are compared with the corresponding part of the physical address. On a hit, the L2 cache need not be accessed. For a read, STT-MRAM read energy is close to that of SRAM. For a write, the small STT-MRAM cells of the L1 cache significantly reduce write energy; and because more small MTJs fit in the same area, the L1 cache can have a larger capacity, which lowers the L1 miss rate and improves processor performance.

[0033] Step three: on an L1 cache miss, the data must be fetched from the L2 cache and filled into the L1 cache. Because the L1 cache writes more slowly than the L2 cache reads, a Load Buffer is needed; its number of entries depends on the programs the processor runs, for example 10 entries. If the L1 cache is full, some of its data must be evicted. Evicted data that has not been modified need not be copied to the L2 cache; data that has been modified must be written to the L2 cache. Because the L2 cache writes more slowly than the L1 cache, a Write Buffer is needed: data from the L1 cache is first written to the Write Buffer, which then writes it to the L2 cache. Since L2 writes are slower still, the Write Buffer needs more entries than the Load Buffer, for example 20. Because the L1 cache is relatively large, the processor interacts with the L2 cache far less often, which avoids the performance loss and energy overhead of writing to the larger-size L2 cells.
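
The miss handling and write-back path in steps two and three can be sketched as a toy model. This is an illustration under stated assumptions, not the patented implementation: the L1 cache is modeled as fully associative with LRU replacement, the L2 cache is a plain dictionary that always hits, the Load Buffer and all timing are omitted, and the class and method names are hypothetical.

```python
from collections import OrderedDict, deque


class TwoLevelCache:
    """Toy model of L1 hits, L1 fills from L2, and dirty write-backs."""

    def __init__(self, l1_lines=4, write_buffer_entries=20):
        self.l1 = OrderedDict()        # addr -> (value, dirty), oldest first
        self.l1_lines = l1_lines
        self.l2 = {}                   # addr -> value, stands in for L2
        self.write_buffer = deque()    # pending (addr, value) writes to L2
        self.write_buffer_entries = write_buffer_entries
        self.l2_writes = 0

    def _drain_one(self):
        addr, value = self.write_buffer.popleft()
        self.l2[addr] = value
        self.l2_writes += 1

    def _evict(self):
        addr, (value, dirty) = self.l1.popitem(last=False)  # LRU victim
        if dirty:  # unmodified victims are dropped without touching L2
            if len(self.write_buffer) >= self.write_buffer_entries:
                self._drain_one()      # buffer full: retire the oldest entry
            self.write_buffer.append((addr, value))

    def _fill(self, addr):
        # A pending write-back is newer than the copy held in L2.
        for pending_addr, pending_value in reversed(self.write_buffer):
            if pending_addr == addr:
                value = pending_value
                break
        else:
            value = self.l2.get(addr, 0)
        if len(self.l1) >= self.l1_lines:
            self._evict()
        self.l1[addr] = (value, False)

    def read(self, addr):
        if addr not in self.l1:        # L1 miss: fetch from L2
            self._fill(addr)
        self.l1.move_to_end(addr)
        return self.l1[addr][0]

    def write(self, addr, value):
        if addr not in self.l1:
            self._fill(addr)
        self.l1[addr] = (value, True)  # mark the line as modified
        self.l1.move_to_end(addr)

    def flush_write_buffer(self):
        while self.write_buffer:
            self._drain_one()
```

In this sketch, only modified lines ever reach the Write Buffer, so a larger `l1_lines` value directly reduces `l2_writes`, which mirrors the argument that a larger L1 cache shields the slower L2 cells from write traffic.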

Claims (1)

  1. An STT-MRAM cache design method, characterized in that the method comprises the following steps:
  Step one: during chip manufacturing, a small-size magnetic tunnel junction (MTJ) process is used for the first-level on-chip cache spin-transfer torque magnetic memory (STT-MRAM) portion of the physical layout, and a normal-size MTJ process is used for the second-level on-chip cache STT-MRAM portion of the physical layout; the first-level on-chip cache is the L1 cache and the second-level on-chip cache is the L2 cache;
  Step two: while the chip is operating and a memory access instruction executes, the processor issues a physical address to the L1 cache, and the tag entries of the L1 cache are compared with the corresponding part of the physical address; on a hit, the L2 cache need not be accessed; for a read, the read energy of STT-MRAM is close to that of SRAM; for a write, because the STT-MRAM cells of the L1 cache are small, write energy is significantly reduced, and because more small MTJs are integrated in the same area, the L1 cache has a larger capacity, which lowers the L1 cache miss rate and improves processor performance;
  Step three: on an L1 cache miss, the data in the L2 cache must be copied into the L1 cache; if the L1 cache is full, some of its data must be evicted; if the evicted data has not been modified, it need not be copied to the L2 cache; if it has been modified, the evicted data must be written to the L2 cache; because the L1 cache is relatively large, the processor interacts with the L2 cache far less often, which avoids the performance loss and energy overhead of writing to the larger-size L2 cache.
CN 201410072210 2014-02-28 2014-02-28 STT-MRAM cache design method CN103810118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201410072210 CN103810118B (en) 2014-02-28 2014-02-28 STT-MRAM cache design method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201410072210 CN103810118B (en) 2014-02-28 2014-02-28 STT-MRAM cache design method

Publications (2)

Publication Number Publication Date
CN103810118A (en) 2014-05-21
CN103810118B (en) 2016-08-17

Family

ID=50706912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201410072210 CN103810118B (en) 2014-02-28 2014-02-28 STT-MRAM cache design method

Country Status (1)

Country Link
CN (1) CN103810118B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9437272B1 (en) * 2015-03-11 2016-09-06 Qualcomm Incorporated Multi-bit spin torque transfer magnetoresistive random access memory with sub-arrays
CN105551516A (en) * 2015-12-15 2016-05-04 中电海康集团有限公司 Memory constructed based on STT-MRAM

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923447A (en) * 2009-06-11 2010-12-22 S·艾勒特;M·莱因万德 Memory device for a hierarchical memory architecture
CN103065673A (en) * 2011-10-20 2013-04-24 爱思开海力士有限公司 Combined memory block and data processing system having the same

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8238145B2 (en) * 2009-04-08 2012-08-07 Avalanche Technology, Inc. Shared transistor in a spin-torque transfer magnetic random access memory (STTMRAM) cell

Also Published As

Publication number Publication date Type
CN103810118A (en) 2014-05-21 application

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
C14 Grant of patent or utility model