TW201432691A

TW201432691A - Non-volatile multi-level-cell memory with decoupled bits for higher performance and energy efficiency

Info

Publication number: TW201432691A
Application number: TW102146933A
Authority: TW
Inventors: Naveen Muralimanohar; Han-Bin Yoon; Norman Paul Jouppi
Original assignee: Hewlett Packard Development Co
Priority date: 2013-01-31
Filing date: 2013-12-18
Publication date: 2014-08-16
Also published as: CN105103235A; US20150364191A1; TWI527034B; WO2014120193A1; US9852792B2; EP2951834A4; CN105103235B; EP2951834A1

Abstract

A non-volatile multi-level cell ("MLC") memory device is disclosed. The memory device has an array of non-volatile memory cells, with each non-volatile memory cell storing multiple groups of bits. A row buffer in the memory device has multiple buffer portions, each buffer portion storing one or more bits from the memory cells and having different read and write latencies and energies.

Description

Non-volatile multi-level cell memory with decoupled bits for higher performance and energy efficiency

The present invention relates to non-volatile multi-level cell memory with decoupled bits for higher performance and energy efficiency.

Background of the invention

Non-volatile memories such as memristors and phase change memory (PCM) have emerged as promising and scalable alternatives to currently prevalent memory technologies such as dynamic random access memory (DRAM) and cache memory. In addition to a fundamentally different approach to storing data, which leads to higher memory density, lower cost per bit, and larger capacity than DRAM and cache memory, these emerging non-volatile memories support multi-level cell (MLC) technology, which allows each memory cell to store two or more bits (DRAM, in contrast, can store only one bit per cell). The potential to operate at lower power further adds to the competitiveness of memristors and PCM as scalable DRAM alternatives.

More specifically, PCM is an emerging memory technology that stores data by varying the resistance of a material known as chalcogenide. By applying heat and then allowing it to cool at different rates, chalcogenide can be manipulated to settle between an amorphous (quickly quenched) high-resistance state (e.g., logic low or zero) and a crystalline (slowly cooled) low-resistance state (e.g., logic high or one). PCM is non-volatile because the state of the chalcogenide is retained in the absence of electrical power. The large resistance difference (up to three orders of magnitude) between the amorphous and crystalline states of a PCM cell allows MLC technology to be implemented in PCM cells. This is achieved by partitioning the large resistance difference into four distinct regions, each representing one of the 2-bit values "11", "10", "01", and "00". More than one bit can be stored in a cell by precisely controlling its resistance to lie within one of these regions.
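The four-region encoding can be sketched as a small quantizer. This is only an illustration under stated assumptions: the resistance bounds `R_MIN_OHMS` and `R_MAX_OHMS`, the equal-width split on a logarithmic scale, and the function name are hypothetical and do not come from the disclosure. The only property taken from the text is that the low-resistance (crystalline) end corresponds to logic one and the high-resistance (amorphous) end to logic zero.

```python
import math

# Hypothetical resistance window spanning three orders of magnitude.
R_MIN_OHMS = 1e3   # assumed fully crystalline (low resistance)
R_MAX_OHMS = 1e6   # assumed fully amorphous (high resistance)

# Lowest-resistance region first: crystalline ~ "11", amorphous ~ "00".
REGION_VALUES = ["11", "10", "01", "00"]


def resistance_to_bits(resistance_ohms: float) -> str:
    """Map a cell resistance to a 2-bit value using four log-scale regions."""
    span = math.log10(R_MAX_OHMS) - math.log10(R_MIN_OHMS)
    position = (math.log10(resistance_ohms) - math.log10(R_MIN_OHMS)) / span
    # Clamp so out-of-window resistances fall into the nearest region.
    index = min(3, max(0, int(position * 4)))
    return REGION_VALUES[index]
```

A real device would place the region boundaries according to its own sensing margins; the equal log-scale split here is chosen only to keep the example short.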

However, supporting MLC in PCM incurs higher access latency and energy. MLC requires the cell resistance to be controlled precisely within a narrower range, which calls for iterative write and read techniques with multiple sensing iterations, leading to higher read latency and energy as well as higher write latency and energy.

According to an embodiment of the present invention, a non-volatile multi-level cell (MLC) memory is provided that comprises an array of non-volatile memory cells, each non-volatile memory cell storing multiple groups of bits; and a row buffer having multiple buffer portions, each buffer portion storing one or more bits from the memory cells and having different read and write latencies and energies.

100, 610, 810‧‧‧memory, non-volatile multi-level cell (MLC) memory

105, 120, 125, 130, 135, 400, 620‧‧‧memory cell

110, 625, 825‧‧‧word line

115, 630, 830‧‧‧bit line

140, 415, 425‧‧‧row

145‧‧‧column

150a-c‧‧‧sense amplifier

155, 635, 835‧‧‧row buffer

160, 615, 815‧‧‧memory controller

170‧‧‧memory bank, most significant bit (MSB)

175‧‧‧least significant bit (LSB)

200, 420A-B‧‧‧cell

205‧‧‧sense amplifier

210, 235‧‧‧graph

215, 240, 305, 310, 320, 325‧‧‧bold line

220‧‧‧analog-to-digital converter

225, 230‧‧‧latch

300, 315‧‧‧diagram

430a, 700‧‧‧MSB half-row

430b, 705‧‧‧LSB half-row

500‧‧‧data block address mapping

505‧‧‧conventional data block address mapping

600, 800‧‧‧computer system

605, 805‧‧‧processing resource

640‧‧‧MSB buffer portion

645‧‧‧LSB buffer portion

710, 715‧‧‧half-row

840-850‧‧‧buffer portion

900-910, 1000-1025‧‧‧block

The present application may be more fully appreciated from the following detailed description taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:

FIG. 1 is a schematic diagram of a non-volatile MLC memory in accordance with various embodiments;

FIGS. 2A-B are schematic diagrams illustrating the read latency of a memory cell in accordance with various embodiments;

FIGS. 3A-B are schematic diagrams illustrating the write latency of a memory cell in accordance with various embodiments;

FIG. 4 is a schematic diagram illustrating how MSBs and LSBs can be decoupled in non-volatile multi-level memory cells to exploit read and write latency and energy asymmetries;

FIG. 5 is a schematic diagram contrasting the data block address mapping proposed herein with the conventional scheme;

FIG. 6 is a schematic diagram of a computer system with decoupled bits in a non-volatile MLC memory for higher performance and energy efficiency;

FIG. 7 illustrates the interleaving of MSBs and LSBs in a row buffer to coalesce writes to the memory;

FIG. 8 is another schematic diagram of a computer system with decoupled bits in a non-volatile MLC memory for higher performance and energy efficiency;

FIG. 9 is a flowchart for decoupling bits in a non-volatile MLC memory for higher performance and energy efficiency; and

FIG. 10 is a flowchart for coalescing writes to a non-volatile MLC memory for higher performance and efficiency.

Detailed description of the preferred embodiment

A non-volatile multi-level cell (MLC) memory with decoupled bits for higher performance and energy efficiency is disclosed. As generally described herein, the non-volatile MLC memory is a non-volatile memory having multiple memory cells, with each memory cell storing more than one bit. In various embodiments, the non-volatile MLC memory can be a non-volatile memory (e.g., PCM, memristor, etc.) storing multiple groups of bits per cell, where each group can have one or more bits. For example, a memory cell can store two groups of bits, with each group having a single bit (for a total of two bits stored per cell). One group can store a most significant bit (MSB) and the other group can store a least significant bit (LSB). In another embodiment, a memory cell can store four groups of bits, with each group having a single bit (for a total of four bits stored per cell). In yet another embodiment, a memory cell can store two groups of bits, with each group having two bits (also for a total of four bits per cell). Various other embodiments are also contemplated and described in more detail below.

The non-volatile MLC memory divides each memory cell into multiple groups. A row buffer having multiple buffer portions is used to store the bits from the memory cells, with each buffer portion having different read and write latencies and energies. For ease of explanation, the following description may refer to a first embodiment, in which a memory cell has two groups of bits, each group having a single bit. In this embodiment, the MLC memory has an MSB half storing an MSB bit and an LSB half storing an LSB bit. The MSB half has reduced read latency and energy, while the LSB half has reduced write latency and energy. The MSB bits from the MSB halves of the memory are stored in an MSB buffer portion of a row buffer, and the LSB bits from the LSB halves of the memory are stored in an LSB buffer portion of the row buffer. Data blocks in the MSB buffer portion may be interleaved with data blocks in the LSB buffer portion to increase the chances of coalescing writes to the memory and to improve its endurance.

It is appreciated that, in the following description, numerous specific details are set forth to provide a thorough understanding of the embodiments. However, it is appreciated that the embodiments may be practiced without limitation to these specific details. In other instances, well-known methods and structures may not be described in detail to avoid unnecessarily obscuring the description of the embodiments. Also, the embodiments may be used in combination with each other.

Referring now to FIG. 1, a schematic diagram of a non-volatile MLC memory in accordance with various embodiments is described. Non-volatile MLC memory 100 comprises arrays of memory cells and peripheral circuitry. In an array, memory cells are organized into rows and columns, where all of the cells in each row are connected to a common word line, and all of the cells in each column are connected to a common bit line (every cell is connected to one word line and one bit line). For example, memory cell 105 is connected to word line 110 and bit line 115. Memory cell 105 is on the same row 140 as memory cells 120 and 125, and on the same column 145 as memory cells 130 and 135. One skilled in the art will recognize that memory 100 is shown with nine memory cells for illustration purposes only. A typical memory 100 may have additional cells.

When accessing data in memory 100, a row of cells (e.g., row 140) is accessed simultaneously. In doing so, a row decoder (not shown) asserts a word line to select all of the cells in the target row, and the bit lines transmit data between the cells and the peripheral circuits. In the peripheral circuits, data signals from the bit lines are detected by sense amplifiers 150a-c in a row buffer 155 and latched in the row buffer 155, and a column decoder (not shown) selects a subset of the row buffer 155 to be communicated with I/O pads (not shown).

It is appreciated that memory 100 may be logically divided into blocks commonly referred to as memory banks. A memory bank is the minimum partition of memory 100 that can be independently addressed. For example, memory 100 is illustrated with a memory bank 170. Each row in memory bank 170 delivers a large number of bits to the sense amplifiers 150a-c. The number of bits delivered is a multiple of a processor word (e.g., 32 or 64 bits). The memory bank 170 is controlled by a memory controller 165, which provides the interface between the memory banks in memory 100 and a processor (not shown). The memory controller 165 reads, writes, and refreshes memory 100 through a combination of multiplexers and demultiplexers that select the right row, column, and memory location for the data.

Once a row's data is placed in the row buffer 155, subsequent data requests to the same row can be served by accessing the data in this buffer. Such an access is known as a row buffer hit, and can be served quickly at the access latency of the row buffer 155 without interacting with the slower cell array. To serve a data request to another row, however, data must be accessed from the array (replacing the contents of the row buffer 155). This type of access is known as a row buffer miss, and incurs higher latency and energy consumption because a row of cells in the array must be activated.
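The hit and miss behavior described above can be sketched with a minimal model. The class and attribute names are hypothetical, and the model tracks only which row is latched, not the data or the actual latencies.

```python
from typing import Optional


class RowBuffer:
    """Tracks which row is currently latched and counts hits and misses."""

    def __init__(self) -> None:
        self.open_row: Optional[int] = None
        self.hits = 0
        self.misses = 0

    def access(self, row: int) -> str:
        """Return "hit" if the row is already buffered, else load it."""
        if row == self.open_row:
            self.hits += 1
            return "hit"
        # A miss activates the row in the array and replaces the contents.
        self.open_row = row
        self.misses += 1
        return "miss"
```

For the access sequence 7, 7, 3, 7 this model reports miss, hit, miss, miss: the request to row 3 evicts row 7, so the final request to row 7 must go back to the array.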

Applications with high data locality benefit from large row buffers and incur reduced memory access time. With multi-core processors, however, memory requests from multiple threads (processes) become interleaved while accessing the same memory bank, resulting in increased row buffer conflicts and hence high row buffer miss rates. This also increases contention at the memory controller 165, as memory requests tend to wait longer at the memory controller 165 before being issued. One possible solution to this problem is to increase memory parallelism by supporting multiple row buffers for each bank. The contents of an active row buffer are then less likely to be thrashed by a conflicting access from another thread (process). However, this approach significantly increases the area overhead and memory cost.

As described in more detail below, the MLC characteristics of memory 100 may be exploited to effectively achieve multiple row buffers at a very low area overhead. Each memory cell in memory 100 has an MSB 170 and an LSB 175. The MSBs from all cells in memory bank 170 may be stored in an MSB buffer portion of the row buffer 155, and the LSBs from all cells in memory bank 170 may in turn be stored in an LSB buffer portion of the row buffer 155. By effectively dividing the row buffer 155 into two row buffer portions, significant improvements in memory latency and row buffer hits may be achieved. Also as described in more detail below, the memory latency of memory 100 actually depends on the type of bit in a memory cell: MSBs have a lower read latency and energy than LSBs, which in turn have a lower write latency and energy than MSBs.

Referring now to FIGS. 2A-B, schematic diagrams illustrating the read latency of a memory cell in accordance with various embodiments are described. In FIG. 2A, an integrating analog-to-digital converter (ADC) quantizes the resistance of a cell 200 to a 2-bit value by sensing the time taken for a flow of electrical charge (i.e., current) to pass through the cell 200. Graph 210 shows the sensing time sensed by sense amplifier 205 as a function of voltage. The higher the resistance, the longer the sensing time. Consequently, the read latency is bounded by the time taken to sense the highest cell resistance.

As seen in graph 210, some information about the cell's data may be discerned before the read operation is carried out to completion. Each sensing time gives information about the bits stored in the cell 200. For example, the bold line 215 in graph 210 shows that at a sensing time of t3, the bits stored in the cell 200 are "01", i.e., a "0" MSB and a "1" LSB. Upon sensing the time t3, the sense amplifier 205 outputs the "01" bits through an analog-to-digital converter 220, with the "0" MSB stored in a latch 225 and the "1" LSB stored in a latch 230.

As illustrated in FIG. 2B, the MSB can be determined halfway through the read operation. In this embodiment, if the cell resistance has been determined by the halfway point of the read operation, the MSB is "1"; otherwise the MSB is "0", irrespective of the LSB. This can be seen in graph 235 with bold line 240, which represents half of the read operation shown with bold line 215 in graph 210. At a time t2 earlier than time t3, it can already be determined that the bits stored in the cell 200 are "01". That is, the MSB can be read before the read operation is completed.
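The early MSB determination can be sketched as follows. The text names only the times t2 and t3 and associates "01" with t3; the times assumed here for "11" (t1) and "00" (t4), the unit spacing between them, and the function names are illustrative assumptions.

```python
from typing import Tuple

# Assumed sensing times, in arbitrary steps: lower resistance senses sooner.
SENSE_TIME = {"11": 1, "10": 2, "01": 3, "00": 4}
HALFWAY = 2  # t2, taken here as the midpoint of the longest read (t4)


def read_msb_early(stored_bits: str) -> Tuple[str, int]:
    """Return (MSB, time at which the MSB is known)."""
    sensed_by_halfway = SENSE_TIME[stored_bits] <= HALFWAY
    msb = "1" if sensed_by_halfway else "0"
    # If nothing has been sensed by the halfway point, the MSB must be "0".
    return msb, min(SENSE_TIME[stored_bits], HALFWAY)


def read_both_bits(stored_bits: str) -> Tuple[str, int]:
    """Return (both bits, time at which both are known)."""
    return stored_bits, SENSE_TIME[stored_bits]
```

In this sketch a cell holding "01" yields its MSB at step 2, while both bits are available only at step 3, which is the read asymmetry the paragraph describes.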

This observation indicates that MSBs have a lower read latency (and energy) than LSBs. However, this read asymmetry property is not exploited in conventional non-volatile MLC memories, where a block of data is spread across MSBs and LSBs. This delays the serving of a memory read request until the slower LSBs are ready. If, on the other hand, MSBs and LSBs are mapped to logically separate memory addresses, data blocks stored in MSBs can be read at a lower latency (while data blocks stored in LSBs are read at the same latency as before).

A similar write asymmetry can be observed in an MLC PCM, with the LSBs having a lower write latency and energy than the MSBs. Turning now to FIGS. 3A-B, schematic diagrams illustrating the write latency of a memory cell in accordance with various embodiments are described. The write latency of a multi-level PCM cell depends on two things: the initial state of the cell and the target state of the cell. This is illustrated in FIG. 3A with diagram 300, which shows the latencies incurred in transitioning a memory cell from any one state to another in a 4-level PCM write operation. For any transition, a programming approach (either partially crystallizing amorphous chalcogenide, or partially amorphizing crystalline chalcogenide) is used that achieves the target cell resistance at the lower latency.

When writing arbitrary data to blocks of cells, the write latency is bounded by the longest time to complete any of the transitions (highlighted in bold in FIG. 3A, with bold line 305 between memory cell state "01" and state "10", and bold line 310 between state "00" and state "10"). However, if the LSB and the MSB are not both altered in a single write operation (so that the diagonal transitions in FIG. 3A are not used), then altering the LSB incurs a lower latency than altering the MSB. For example, changing the LSB from "0" to "1" incurs a 0.8x or 0.84x write latency, and changing the LSB from "1" to "0" incurs a 0.3x or 0.2x write latency.

FIG. 3B highlights that changing only the MSB (bold line 320 in diagram 315) is bounded by a 1.0x latency (from "00" to "10"), whereas changing only the LSB is bounded by a lower 0.84x latency (from "00" to "01", bold line 325). Programming the memory cell to "10" incurs a 0.2x latency only when transitioning from "11", which is already in the crystalline state, where partial amorphization requires applying a reset pulse. This observation indicates that LSBs have a lower write latency (and energy) than MSBs. However, similar to the read asymmetry discussed above with reference to FIGS. 2A-B, this property is not leveraged in conventional MLC PCMs, where a block of data is spread across LSBs and MSBs. If LSBs and MSBs are mapped to logically separate memory addresses, as described in more detail below, data blocks stored in LSBs can be written at a lower latency (while data blocks stored in MSBs are written at the same latency as before).
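The relative latencies quoted above can be collected into a small table to make the write asymmetry concrete. The text explicitly ties 0.84x to the "00" to "01" transition and 0.2x to the "11" to "10" transition; pairing 0.8x with "10" to "11" and 0.3x with "01" to "00" is an inference made for this sketch, and the function names are hypothetical.

```python
# Relative write latencies quoted in the text for LSB-only transitions.
# The 0.80 and 0.30 pairings are inferred; 0.84 and 0.20 are stated.
LSB_ONLY_LATENCY = {
    ("00", "01"): 0.84,
    ("10", "11"): 0.80,
    ("01", "00"): 0.30,
    ("11", "10"): 0.20,
}
MSB_ONLY_BOUND = 1.0  # "00" -> "10", the bound for MSB-only changes


def worst_case_lsb_write() -> float:
    """Worst-case relative latency when only the LSB of each cell changes."""
    return max(LSB_ONLY_LATENCY.values())


def lsb_write_saving() -> float:
    """Fractional latency reduction of LSB-only writes versus the MSB bound."""
    return round(1.0 - worst_case_lsb_write() / MSB_ONLY_BOUND, 2)
```

Because a block write finishes only when its slowest cell finishes, the bound that matters is the maximum over the transitions a block may need, which is 0.84x for LSB-only writes against 1.0x for MSB-only writes.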

Attention is now directed to FIG. 4, which shows how the MSBs and LSBs can be decoupled in MLC PCM cells to exploit these read and write asymmetries. Each memory cell 400 of an MLC PCM (such as memory 100 shown in FIG. 1) has an MSB and an LSB. In a conventional MLC PCM, these bits are coupled to form a single contiguous memory address along a row, as shown with row 415. Row 415 illustrates that the memory address goes from cell to cell in a sequential or contiguous manner. The first cell 420a in row 415 is addressed before the second cell 420b, with the MSB addressed before the LSB. For illustration, data blocks the size of 4 bits are highlighted in different shades. The 4-bit block formed by cells 420a-b is addressed first with the MSB in cell 420a (labeled "0"), followed by the LSB in cell 420a (labeled "1"), the MSB in cell 420b (labeled "2"), and the LSB in cell 420b (labeled "3"). This pattern continues throughout row 415.

In contrast, the non-volatile MLC memory presented herein (e.g., memory 100 in FIG. 1) groups the MSBs along a row to form one contiguous address, and groups the LSBs along the same row to form another contiguous address. This way, a data block (e.g., a 64-byte cache block) that resides at a certain logical address physically occupies only MSBs or only LSBs. If the data block is in MSBs, read asymmetry (discussed above with reference to FIGS. 2A-B) is exploited to read the block at a reduced latency and energy. Likewise, if the data block is in LSBs, write asymmetry (discussed above with reference to FIGS. 3A-B) is exploited to write to the block at a reduced latency and energy.

Decoupling the bits effectively divides every row in memory into two logical addresses: one using MSBs and the other using LSBs. For example, row 425 is effectively divided into an MSB half-row 430a and an LSB half-row 430b. In contrast to row 415 of a conventional MLC PCM, in which bits are addressed contiguously across the row, in the memory presented herein (e.g., memory 100 in FIG. 1) all bits of the MSB half-row 430a are addressed before all bits of the LSB half-row 430b. The MSB of the first cell 430a is addressed before the MSB of the second cell 430b, and so on, until the end of the MSB half-row 430a. Only after all MSBs in a memory bank have been addressed are the LSBs in the LSB half-row 430b considered.
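The two addressing orders can be contrasted with a pair of mapping functions. This is a sketch for a single row only; the function names and the `cells_per_row` parameter are hypothetical, and real address decoding would also involve bank and row selection.

```python
from typing import Tuple


def coupled_location(bit_index: int) -> Tuple[int, str]:
    """Conventional mapping: consecutive bits alternate MSB, LSB per cell."""
    cell = bit_index // 2
    kind = "MSB" if bit_index % 2 == 0 else "LSB"
    return cell, kind


def decoupled_location(bit_index: int, cells_per_row: int) -> Tuple[int, str]:
    """Decoupled mapping: the whole MSB half-row precedes the LSB half-row."""
    if not 0 <= bit_index < 2 * cells_per_row:
        raise ValueError("bit index outside the row")
    if bit_index < cells_per_row:
        return bit_index, "MSB"
    return bit_index - cells_per_row, "LSB"
```

With four cells per row, a 4-bit block at bit indices 0 to 3 occupies both bits of cells 0 and 1 under the coupled mapping, but only the MSBs of cells 0 to 3 under the decoupled mapping, so the block can be read at MSB latency.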

FIG. 5 contrasts the data block address mapping proposed herein with the conventional scheme. Assuming an arbitrary random translation from an application's virtual page address to a physical frame address in memory, roughly half of the application's working set would be in MSBs and the other half in LSBs. Hence, with the data block address mapping 500 proposed herein, on average 50% of memory reads are served at a reduced latency (by up to 48%) and energy (by up to 48%), and 50% of memory writes are served at a reduced latency (by up to 16%) and energy (by up to 26%).
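The averages implied by these figures follow from a simple weighted sum. The sketch below assumes that the remaining half of the accesses is served at the unchanged baseline cost and treats the quoted "up to" percentages as best-case values; the function name is hypothetical.

```python
def average_relative_cost(fast_fraction: float, reduction: float) -> float:
    """Average cost relative to baseline when a fraction of accesses is cheaper.

    fast_fraction: share of accesses served by the faster bit type.
    reduction: fractional saving for those accesses (0.48 means 48% lower).
    """
    return fast_fraction * (1.0 - reduction) + (1.0 - fast_fraction) * 1.0


# Best-case figures quoted in the text, each with a 50% fast share.
read_latency = average_relative_cost(0.5, 0.48)
write_latency = average_relative_cost(0.5, 0.16)
```

Under these assumptions the average read latency is about 0.76 of the baseline and the average write latency about 0.92, which are optimistic bounds rather than measured results.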

A drawback of the data block address mapping 500 is that it increases the number of cells programmed during a write operation, adding to the endurance overhead. This is because a data block takes only one bit from each 2-bit cell, so the number of cells involved in a write to the block equals the number of bits in the block. However, this does not double the endurance overhead relative to the conventional scheme, because the probability that programming a cell is redundant (because the cell is already in the target state) is lower in the conventional scheme, where both the MSB and the LSB have to match the write data.

On the other hand, writing data to a block in 500 targets only MSBs or only LSBs, so a block write has more redundant bit writes. Simulations show an average endurance overhead of 21%. This is small enough to achieve the 5-year lifetime of typical server designs, considering that prior work has shown PCM main memory to have an average lifetime of 8.8 years. With the data block address mapping 500, two separate logical address spaces share the row buffer space, each address space occupying half of the row buffer. This reduces the longest contiguous address space that can be held in the row buffer, potentially decreasing row buffer locality. However, the reduced memory access latencies exposed by the data block address mapping 500 not only compensate for this effect but also significantly improve system performance (and energy efficiency) over the conventional data block address mapping 505, without incurring major modifications to the memory circuitry and architecture.

Attention is now directed to FIG. 6, which shows a computer system with decoupled bits in a non-volatile MLC memory for higher performance and energy efficiency. Computer system 600 has a processing resource 605 that communicates with a non-volatile MLC memory 610 through a memory controller 615. Processing resource 605 may include one or more processors and one or more other memory resources (e.g., cache memory). The non-volatile MLC memory 610 has an array of non-volatile memory cells (e.g., memory cell 620), with each multi-level memory cell storing an MSB and an LSB. The array of memory cells may be organized as an array of word lines (rows) by bit lines (columns), such as word line 625 and bit line 630.

The memory controller 615 provides an interface between the array of non-volatile memory cells in memory 610 and the processing resource 605. The memory controller 615 reads, writes, and refreshes memory 610 through a combination of multiplexers and demultiplexers that select the right row, column, and memory location for the data. In various embodiments, the memory controller 615 reads and writes data to memory 610 through a row buffer 635. The row buffer 635 has an MSB buffer portion 640 and an LSB buffer portion 645 to store, respectively, the MSBs and LSBs from the array of non-volatile memory cells in memory 610. As described above with reference to FIGS. 4 and 5, the MSBs and LSBs in memory 610 are decoupled and mapped to separate logical addresses. Decoupling the MSBs and LSBs of the memory cells in memory 610 effectively divides a row into two half-rows, each with its own contiguous logical address. As shown, this allows the row buffer 635 to be manipulated as two half-row buffers, with an MSB buffer portion 640 for storing MSBs and an LSB buffer portion 645 for storing LSBs.

With the MSBs and LSBs decoupled and accessed as separate logical addresses in row buffer portions 640-645, significant improvements in read latency and energy and in write latency and energy can be achieved. When reading data from memory 610, the memory controller 615 can read blocks of data from the MSB buffer portion 640 at a reduced read latency and energy (blocks of data are read from the LSB buffer portion 645 at the conventional read latency and energy). Similarly, when writing data to memory 610, the memory controller 615 can write blocks of data to the LSB buffer portion 645 at a reduced write latency and energy (blocks of data are written to the MSB buffer portion 640 at the conventional write latency and energy).

A downside of this MSB/LSB decoupling is that memory 610 has worse endurance than a conventional memory in which the bits are coupled. The reason is that in a conventional bit scheme, programming M bits causes M/2 cells to undergo the physical programming cycle of heating and cooling, because two logically contiguous bits map to the same cell. To program M bits in memory 610, however, M cells undergo physical programming, because only one of the two bits in each cell changes. Thus, in the absence of data buffering effects, decoupling the MSBs and LSBs in memory 610 consumes endurance cycles at twice the rate of a conventional memory, halving the memory lifetime. For example, where memory 610 is a PCM, memory 610 can simply program over data already present at any location in the memory, and each cell that is programmed undergoes one endurance cycle. The number of cells involved in programming therefore directly affects the lifetime of memory 610.
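
By way of a non-limiting illustration, the counting argument above can be sketched in Python. The function name cells_programmed, the row width, and the two offset-to-cell mappings are assumptions introduced for this sketch only; they are one plausible reading of the coupled and decoupled schemes and are not taken from the disclosure.

```python
def cells_programmed(bit_offsets, cells_per_row, decoupled):
    """Return the set of cell indices touched when the given logical
    bit offsets of one row are programmed (hypothetical mappings)."""
    if decoupled:
        # Decoupled (assumed): offsets [0, C) are the MSBs of cells 0..C-1,
        # offsets [C, 2C) are the LSBs of the same cells.
        return {offset % cells_per_row for offset in bit_offsets}
    # Conventional (assumed): bits 2k and 2k+1 are the MSB and LSB of cell k.
    return {offset // 2 for offset in bit_offsets}


CELLS_PER_ROW = 16  # hypothetical row width
M = 8               # number of logically contiguous bits to program
bits = range(M)

conventional = cells_programmed(bits, CELLS_PER_ROW, decoupled=False)
decoupled = cells_programmed(bits, CELLS_PER_ROW, decoupled=True)

# M/2 cells are programmed in the coupled scheme, M cells in the decoupled one.
print(len(conventional), len(decoupled))  # prints: 4 8
```

Under these assumptions the decoupled scheme programs twice as many cells for the same M contiguous bits, which is the source of the doubled endurance consumption described above.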

The adverse effect of MSB/LSB decoupling on memory endurance can be mitigated by coalescing the writes to the MSB and the LSB into a single write. Writes to memory 610 are coalesced so that a memory cell in memory 610 is programmed only once instead of twice. Interleaving the blocks of data in the MSB buffer with the blocks of data in the LSB buffer further increases the probability of coalescing writes. The interleaving is illustrated in Figure 7. By interleaving cache blocks (the smallest unit in which memory 610 is accessed) between the two pages of a row, the spatial locality in writebacks is exploited to increase the chances that cache block writebacks to the same cells are coalesced. For writebacks to be coalesced, both bits of a cell may change during programming.

As shown in Figure 7, the MSB half-row 700 has eight data blocks numbered 0 to 7, and the LSB half-row 705 has eight data blocks numbered 8 to 15. The MSB half-row 700 is stored in the MSB buffer portion of a row buffer (e.g., MSB buffer portion 640), and the LSB half-row 705 is stored in the LSB buffer portion of a row buffer (e.g., LSB buffer portion 645). Data stored in the row buffer 635 may be sent to a processing resource 605, where it is processed before being returned to the row buffer 635. In doing so, some of the data may become "dirty", shown shaded, indicating that these bits must be written to memory before other blocks can be read from memory into the buffer.

Dirty cache blocks evicted from the last-level cache are typically issued as writebacks to memory and are initially inserted into the memory controller's write buffer. Most systems prioritize row buffer hit requests (to varying degrees), so these dirty cache blocks are queued in the memory controller's write buffer until their destination row is accessed, at which point their data is sent to the row buffer 635. The dirty cache block data then resides in the row buffer 635 until the row buffer contents need to be evicted (i.e., to buffer a different row), which is when the dirty cache data is actually programmed into the memory cell array.

In this embodiment, blocks 1, 2, and 4-7 are dirty. If all of these blocks are written, there will be a total of 6 separate writes to the row corresponding to the MSB half-row 700 and the LSB half-row 705. But if the blocks are interleaved in the row buffer 635 before being written back to memory, as in half-rows 710-715, that is, if writes are coalesced and data locality is exploited, blocks 4-5 and 6-7 can be written together. Instead of 6 separate writes to memory, only 4 separate writes are required. Note that the number of cache blocks that are interleaved may range from 1 up to the number of cache blocks that fit into a single page (which is half of a row).
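
The write counts in this example can be reproduced with a short Python sketch. The functions place and writes_needed and the interleaving formula are hypothetical: they assume that groups of `degree` consecutive cache blocks alternate between the MSB half-row and the LSB half-row, and that two blocks occupying the same slot in the two half-rows share the same cells and can be coalesced into one write.

```python
def place(block, degree):
    """Map a cache block number within a row to (half, slot).

    half 0 is the MSB half-row and half 1 is the LSB half-row. Blocks
    with the same slot are assumed to be stored in the same cells.
    """
    group, offset = divmod(block, degree)
    half = group % 2
    slot = (group // 2) * degree + offset
    return half, slot


def writes_needed(dirty_blocks, degree):
    """Count row writes when dirty blocks sharing a slot are coalesced."""
    return len({place(block, degree)[1] for block in dirty_blocks})


BLOCKS_PER_PAGE = 8           # eight blocks per half-row, as in Figure 7
dirty = [1, 2, 4, 5, 6, 7]    # the dirty blocks of the example

# Degree 8 reproduces the non-interleaved layout (blocks 0-7 / 8-15).
print(writes_needed(dirty, BLOCKS_PER_PAGE))  # prints: 6
# Degree 1 alternates single blocks, so 4-5 and 6-7 share cells.
print(writes_needed(dirty, 1))                # prints: 4
```

With a degree equal to the page size no two dirty blocks share cells, so nothing is coalesced. With a degree of 1, blocks 4-5 and 6-7 pair up, giving the 4 writes stated above.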

It is appreciated that although the default scheduling policy of First-Ready First-Come First-Serve (FR-FCFS) naturally coalesces writes at the row buffer 635, the likelihood of this happening can be improved by carefully queuing writes at the memory controller 615. A mechanism that serves this purpose is referred to as DRAM-Aware Last-Level Cache Writeback (DLW). On every eviction of a dirty last-level cache block, DLW searches the last-level cache for other dirty cache blocks that map to the same row and speculatively issues these as writebacks to memory. Interleaving the data blocks in the row buffer 635 works synergistically with DLW, which issues many writebacks to the same row, thereby increasing the likelihood of write coalescing. It is also appreciated that interleaving only changes how data is interpreted in the row buffer 635; its implementation does not require any changes to memory 610. However, when calculating the cache line location within a page, the memory controller 615 must take the degree of interleaving into account and decode the address accordingly.
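
The DLW search described above can be illustrated with a minimal Python sketch. The CacheBlock record, the block and row sizes, and the function names are assumptions made for illustration; a real last-level cache would perform this search in hardware and would not scan a list.

```python
from collections import namedtuple

# Hypothetical model of a last-level cache entry.
CacheBlock = namedtuple("CacheBlock", ["address", "dirty"])

BLOCK_SIZE = 64       # bytes per cache block (assumed)
BLOCKS_PER_ROW = 16   # cache blocks per memory row (assumed)


def row_of(address):
    """Return the memory row that a byte address maps to."""
    return address // (BLOCK_SIZE * BLOCKS_PER_ROW)


def dlw_writebacks(evicted, llc_blocks):
    """Return the addresses to issue as writebacks when `evicted`
    leaves the last-level cache: the evicted block itself plus every
    other dirty block that maps to the same memory row."""
    if not evicted.dirty:
        return []
    target_row = row_of(evicted.address)
    companions = [
        block.address
        for block in llc_blocks
        if block.dirty
        and block.address != evicted.address
        and row_of(block.address) == target_row
    ]
    return [evicted.address] + sorted(companions)


llc = [
    CacheBlock(0x0000, True),   # row 0, dirty (the block being evicted)
    CacheBlock(0x0040, True),   # row 0, dirty: issued speculatively
    CacheBlock(0x0080, False),  # row 0, clean: left in place
    CacheBlock(0x0400, True),   # row 1, dirty: different row, not issued
]
print(dlw_writebacks(llc[0], llc))  # prints: [0, 64]
```

Issuing the companion block together with the evicted one places both writebacks in the controller's queue for the same row, which is what gives the row buffer a chance to coalesce them.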

Attention is now directed to Figure 8, which shows another embodiment of a computer system with decoupled bits in a non-volatile MLC memory for higher performance and energy efficiency. As mentioned above, for ease of explanation, Figures 1-7 describe embodiments of a memory cell having two groups of bits, with each group having a single bit (an MSB or an LSB). Computer system 800 of Figure 8 has memory cells that can store multiple other groups of bits (not just an MSB and an LSB). Similar to computer system 600 of Figure 6, computer system 800 has a processing resource 805 that communicates with a non-volatile MLC memory 810 through a memory controller 815. Processing resource 805 may include one or more processors and one or more other memory resources (e.g., cache memories). The non-volatile MLC memory 810 has an array of non-volatile memory cells (e.g., memory cell 820), with each multi-level memory cell storing multiple groups of bits labeled GB1, GB2, GB3, and so on up to GBN, where N can be any integer equal to or greater than 3 and is limited by the physical constraints of memory 810. The array of memory cells may be organized as an array of wordlines (rows) by bitlines (columns), such as wordline 825 and bitline 830.

The memory controller 815 provides an interface between the array of non-volatile memory cells in memory 810 and the processing resource 805. The memory controller 815 reads, writes, and refreshes memory 810 by selecting the correct row, column, and memory location for the data through a combination of multiplexers and demultiplexers. In various embodiments, the memory controller 815 reads and writes data to memory 810 through a row buffer 835. The row buffer 835 has multiple buffer portions 840-850, labeled "first buffer portion" (840), "second buffer portion" (845), and so on, up to "Nth buffer portion" (850). Each buffer portion 840-850 can store one group of bits from memory cell 820. For example, buffer portion 840 can store GB1, buffer portion 845 can store GB2, and buffer portion 850 can store GBN. Each buffer portion 840-850 has a different read latency and energy and a different write latency and energy.

Attention is now directed to Figure 9, which shows a flowchart for decoupling bits in a non-volatile MLC memory for higher performance and energy efficiency. First, the physical address space of the non-volatile MLC memory is decoupled into multiple groups of bits, with each group having a different read and write latency (900). For example, one group of bits may be an MSB with a reduced read latency, and another group of bits may be an LSB with a reduced write latency. The different read and write latencies of the multiple groups of bits are exposed to the memory controller (905). The memory controller then services a memory request (e.g., a read or write request) according to the read and write latencies of the multiple groups (910).
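
The three steps of Figure 9 can be sketched as follows. This is an illustrative Python model only: the BitGroup record, the row width, and the latency values (given in arbitrary units) are placeholders chosen to reflect that the MSB is faster to read and the LSB is faster to write; none of these names or numbers come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitGroup:
    name: str
    read_latency: int    # arbitrary units, placeholder value
    write_latency: int   # arbitrary units, placeholder value


# Step 905 (assumed form): per-group latencies made visible to the controller.
GROUPS = (
    BitGroup("MSB", read_latency=1, write_latency=2),
    BitGroup("LSB", read_latency=2, write_latency=1),
)
CELLS_PER_ROW = 1024  # hypothetical row width


def group_of(bit_offset):
    """Step 900: offsets in the first half-row belong to the first
    group, offsets in the second half-row to the second group."""
    index = bit_offset // CELLS_PER_ROW
    if not 0 <= index < len(GROUPS):
        raise ValueError("bit offset lies outside the row")
    return GROUPS[index]


def service_latency(op, bit_offset):
    """Step 910: return the latency the controller should assume."""
    group = group_of(bit_offset)
    if op == "read":
        return group.read_latency
    if op == "write":
        return group.write_latency
    raise ValueError("op must be 'read' or 'write'")


print(service_latency("read", 0))      # MSB read  -> prints: 1
print(service_latency("write", 1024))  # LSB write -> prints: 1
print(service_latency("write", 0))     # MSB write -> prints: 2
```

Extending GROUPS with further entries would model the N-group row buffer of Figure 8 under the same assumptions.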

Figure 10 is a flowchart for coalescing writes to a non-volatile MLC memory for higher performance and energy efficiency. First, when mapping a page to physical memory, blocks of bits across multiple row buffer portions are interleaved, e.g., blocks of bits from the MSB buffer portion are interleaved with blocks of bits from the LSB buffer portion, as described above with reference to Figure 7 (1000). Next, the memory controller issues a write request to a first address (1005). If there is a pending write request to a second address that maps to the same row and the same set of cells in memory (1010), the first and second write requests are combined into a single coalesced write that makes a single write update to the memory row (1015). Otherwise, the first and second addresses are written separately (1025). When scheduling write requests, the memory controller may proactively send dirty blocks from the last-level cache to the memory if there is a chance for coalescing (1020).
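
The decision at steps 1010-1025 can be illustrated with a small Python sketch. The WriteRequest record and the coalesce function are hypothetical; the sketch assumes that each pending request already carries the row and cell-set indices produced by the controller's address decoding.

```python
from collections import namedtuple

# Hypothetical pending write: block address plus its decoded location.
WriteRequest = namedtuple("WriteRequest", ["address", "row", "cell_set"])


def coalesce(pending):
    """Group pending writes that target the same row and the same set
    of cells. Each returned group is issued as one row write (1015);
    a group holding a single address is written on its own (1025)."""
    merged = {}
    for request in pending:
        key = (request.row, request.cell_set)
        merged.setdefault(key, []).append(request.address)
    return list(merged.values())


pending = [
    WriteRequest(address=4, row=3, cell_set=2),
    WriteRequest(address=5, row=3, cell_set=2),  # same row and cells as 4
    WriteRequest(address=1, row=3, cell_set=0),  # same row, other cells
]
print(coalesce(pending))  # prints: [[4, 5], [1]]
```

Here three write requests result in two physical writes, because the writes to addresses 4 and 5 land on the same cells and are merged.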

Advantageously, decoupling the bits in a non-volatile MLC memory allows the read and write asymmetries in latency and energy to be exploited. MSBs are read at a reduced latency and energy, while LSBs are written at a reduced latency and energy. Interleaving the MSBs and LSBs in the row buffer before writing them to memory coalesces the writes and mitigates the endurance effects of bit decoupling.

It is appreciated that the previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

800‧‧‧computer system

805‧‧‧processing resource

810‧‧‧non-volatile multi-level cell (MLC) memory

815‧‧‧memory controller

820‧‧‧memory cell

825‧‧‧wordline

830‧‧‧bitline

835‧‧‧row buffer

840‧‧‧first buffer portion

845‧‧‧second buffer portion

850‧‧‧Nth buffer portion

Claims

A non-volatile multi-level cell (MLC) memory, comprising: an array of non-volatile memory cells, each non-volatile memory cell storing multiple groups of bits; and a row buffer having multiple buffer portions, each buffer portion storing one or more bits from the memory cells and having different read and write latencies and energies.

The non-volatile MLC memory of claim 1, further comprising a memory controller to issue write requests to different bits in a set of memory cells and to instruct the memory to coalesce the write requests into a single write to the set of memory cells.

The non-volatile MLC memory of claim 1, wherein a first group of bits is stored in a first buffer portion and a second group of bits is stored in a second buffer portion, and blocks of bits from the first buffer portion are interleaved with blocks of bits from the second buffer portion to coalesce writes at the row buffer.

The non-volatile MLC memory of claim 2, wherein the row buffer comprises a plurality of sense amplifiers and analog-to-digital converters, each sense amplifier being connected to a bitline.

The non-volatile MLC memory of claim 4, wherein each analog-to-digital converter is connected to a plurality of latches to hold the multiple groups of bits.

The non-volatile MLC memory of claim 4, wherein the read latency depends on the time taken by the plurality of sense amplifiers to sense a resistance of the non-volatile memory cells.

The non-volatile MLC memory of claim 1, wherein the write latency depends on an initial state of the non-volatile memory cells and a target state of the non-volatile memory cells.

A method for decoupling bits in a non-volatile multi-level cell (MLC) memory for higher performance and energy efficiency, the method comprising: decoupling a physical address space into multiple groups of bits, each group having a different read and write latency; exposing the read and write latencies of the multiple groups of bits to a memory controller; and servicing a memory request according to the read and write latencies of the multiple groups.

The method of claim 8, wherein decoupling a physical address space into multiple groups of bits comprises storing the multiple groups of bits in multiple buffer portions of a row buffer.

The method of claim 9, comprising interleaving blocks of data in a first buffer portion with blocks of data in a second buffer portion to increase the chances of write coalescing.

The method of claim 8, further comprising: each time dirty last-level cache data is evicted, searching a last-level cache for dirty cache blocks that map to a same memory row, and issuing the dirty cache blocks as writebacks to the non-volatile MLC memory.

A computer system, comprising: a non-volatile multi-level cell (MLC) memory having an array of non-volatile memory cells, each memory cell storing a most significant bit (MSB) and a least significant bit (LSB); a row buffer having an MSB buffer to store MSBs from the memory cells and an LSB buffer to store LSBs from the memory cells, wherein blocks of bits from the MSB buffer are interleaved with blocks of bits from the LSB buffer; and a memory controller to write a block of bits to a set of cells in a row of the non-volatile MLC memory, to identify other write requests to the same set of cells in the row, and to instruct the memory to coalesce the writes to the memory.

The computer system of claim 12, wherein the row buffer comprises a plurality of sense amplifiers, and the memory controller controls the plurality of sense amplifiers to select the MSB buffer or the LSB buffer to store the block of data.

The computer system of claim 12, wherein the non-volatile MLC memory comprises a phase change memory.