CN103946826A - Apparatus and method for implementing a multi-level memory hierarchy over common memory channels - Google Patents


Info

Publication number
CN103946826A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201180075093.9A
Other languages
Chinese (zh)
Other versions
CN103946826B (en)
Inventor
R.K. Ramanujan
D. Ziakas
D.J. Zimmerman
M.J. Kumar
M.P. Swaminathan
B.N. Coury
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp
Publication of CN103946826A
Application granted
Publication of CN103946826B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811 Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0844 Multiple simultaneous or quasi-simultaneous cache accessing
    • G06F12/0846 Cache with multiple tag or data arrays being simultaneously accessible
    • G06F12/0848 Partitioned cache, e.g. separate instruction and operand caches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G06F12/0897 Caches characterised by their organisation or structure with two or more cache hierarchy levels
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668 Details of memory controller
    • G06F13/1684 Details of memory controller using multiple buses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/20 Employing a main memory using a specific memory technology
    • G06F2212/202 Non-volatile memory
    • G06F2212/2024 Rewritable memory not requiring erasing, e.g. resistive or ferroelectric RAM
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/25 Using a specific main memory architecture
    • G06F2212/251 Local memory within processor subsystem
    • G06F2212/2515 Local memory within processor subsystem being configurable for different purposes, e.g. as cache or non-cache memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/30 Providing cache or TLB in specific location of a processing system
    • G06F2212/304 In main memory subsystem

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Memory System (AREA)

Abstract

A system and method are described for integrating a memory and storage hierarchy including a non-volatile memory tier within a computer system. In one embodiment, PCMS memory devices are used as one tier in the hierarchy, sometimes referred to as "far memory." Higher performance memory devices such as DRAM are placed in front of the far memory and are used to mask some of the performance limitations of the far memory. These higher performance memory devices are referred to as "near memory."

Description

Apparatus and method for implementing a multi-level memory hierarchy over common memory channels
Technical field
This disclosure relates generally to the field of computer systems. More particularly, the invention relates to an apparatus and method for implementing a multi-level memory hierarchy.
Background
A. Current memory and storage configurations
One of the limiting factors for computer innovation today is memory and storage technology. In conventional computer systems, system memory (also known as main memory, primary memory, or executable memory) is typically implemented with dynamic random access memory (DRAM). DRAM-based memory consumes power even when no memory reads or writes occur, because it must constantly recharge its internal capacitors. DRAM-based memory is volatile, which means that data stored in DRAM is lost once power is removed. Conventional computer systems also rely on multiple levels of caching to improve performance. A cache is a high-speed memory positioned between the processor and system memory that services memory access requests faster than they could be serviced from system memory. Such caches are typically implemented with static random access memory (SRAM). Cache management protocols may be used to ensure that the most frequently accessed data and instructions are stored within one of the cache levels, thereby reducing the number of memory access transactions and improving performance.
With respect to mass storage (also known as secondary storage or disk storage), conventional mass storage devices typically include magnetic media (e.g., hard disk drives), optical media (e.g., compact disc (CD) drives, digital versatile discs (DVDs), etc.), holographic media, and/or mass-storage flash memory (e.g., solid-state drives (SSDs), removable flash drives, etc.). Generally, these storage devices are considered input/output (I/O) devices because the processor accesses them through various I/O adapters that implement various I/O protocols. These I/O adapters and I/O protocols consume a significant amount of power and can have a significant impact on the die area and form factor of the platform. Portable or mobile devices (e.g., laptops, netbooks, tablet computers, personal digital assistants (PDAs), portable media players, portable gaming devices, digital cameras, mobile phones, smartphones, feature phones, etc.) that have limited battery life when not connected to a permanent power supply may include removable mass storage devices (e.g., embedded MultiMediaCard (eMMC), Secure Digital (SD) card) that are typically coupled to the processor via low-power interconnects and I/O controllers in order to meet active and idle power budgets.
With respect to firmware memory (such as boot memory, also known as BIOS flash), a conventional computer system typically uses flash memory devices to store persistent system information that is read often but seldom (or never) written. For example, the initial instructions executed by a processor to initialize key system components during a boot process (Basic Input and Output System (BIOS) images) are typically stored in a flash memory device. Flash memory devices currently available on the market generally have limited speed (e.g., 50 MHz). This speed is further reduced by the overhead of read protocols (e.g., to 2.5 MHz). To speed up BIOS execution, conventional processors generally cache a portion of the BIOS code during the Pre-Extensible Firmware Interface (PEI) phase of the boot process. The size of the processor cache places a restriction on the size of the BIOS code used in the PEI phase (also known as the "PEI BIOS code").
B. Phase-change memory (PCM) and related technologies
Phase-change memory (PCM), also sometimes referred to as phase change random access memory (PRAM or PCRAM), PCME, Ovonic Unified Memory, or chalcogenide RAM (C-RAM), is a type of non-volatile computer memory that exploits the unique behavior of chalcogenide glass. As a result of heat produced by the passage of an electric current, chalcogenide glass can be switched between two states: crystalline and amorphous. Recent versions of PCM can achieve two additional distinct states.
PCM provides higher performance than flash memory because the memory element of PCM can be switched more quickly, writing (changing individual bits to either 1 or 0) can be done without first erasing an entire block of cells, and degradation from writes is slower (a PCM device may survive approximately 100 million write cycles; PCM degradation is due to thermal expansion during programming, migration of metal and other materials, and other mechanisms).
Brief description of the drawings
The following description and accompanying drawings are used to illustrate embodiments of the invention. In the drawings:
Fig. 1 illustrates a cache and system memory arrangement according to embodiments of the invention;
Fig. 2 illustrates a memory and storage hierarchy employed in embodiments of the invention;
Fig. 3 illustrates a computer system on which embodiments of the invention may be implemented;
Fig. 4A illustrates a first system architecture that includes PCM according to embodiments of the invention;
Fig. 4B illustrates a second system architecture that includes PCM according to embodiments of the invention;
Fig. 4C illustrates a third system architecture that includes PCM according to embodiments of the invention;
Fig. 4D illustrates a fourth system architecture that includes PCM according to embodiments of the invention;
Fig. 4E illustrates a fifth system architecture that includes PCM according to embodiments of the invention;
Fig. 4F illustrates a sixth system architecture that includes PCM according to embodiments of the invention;
Fig. 4G illustrates a seventh system architecture that includes PCM according to embodiments of the invention;
Fig. 4H illustrates an eighth system architecture that includes PCM according to embodiments of the invention;
Fig. 4I illustrates a ninth system architecture that includes PCM according to embodiments of the invention;
Fig. 4J illustrates a tenth system architecture that includes PCM according to embodiments of the invention;
Fig. 4K illustrates an eleventh system architecture that includes PCM according to embodiments of the invention;
Fig. 4L illustrates a twelfth system architecture that includes PCM according to embodiments of the invention;
Fig. 4M illustrates a thirteenth system architecture that includes PCM according to embodiments of the invention;
Fig. 5A illustrates one embodiment of a system architecture that includes a volatile near memory and a non-volatile far memory;
Fig. 5B illustrates one embodiment of a memory-side cache (MSC);
Fig. 5C illustrates another embodiment of a memory-side cache (MSC), which includes an integrated tag cache and ECC generation/check logic;
Fig. 5D illustrates one embodiment of an exemplary tag cache and ECC generator/check unit;
Fig. 5E illustrates one embodiment of a PCM DIMM that includes a PCM controller;
Fig. 6A illustrates MCE controllers and caches dedicated to certain specified system physical address (SPA) ranges according to one embodiment of the invention;
Fig. 6B illustrates an exemplary mapping between a system memory map, a near memory address map, and a PCM address map according to one embodiment of the invention;
Fig. 6C illustrates an exemplary mapping between a system physical address (SPA) and a PCM physical device address (PDA) or a near memory address (NMA) according to one embodiment of the invention; and
Fig. 6D illustrates interleaving between memory pages within a system physical address (SPA) space and a memory channel address (MCA) space according to one embodiment of the invention.
Detailed description
In the following description, numerous specific details are set forth, such as logic implementations, opcodes, means to specify operands, resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices, in order to provide a more thorough understanding of the present invention. One skilled in the art will appreciate, however, that the invention may be practiced without such specific details. In other instances, control structures, gate-level circuits, and full software instruction sequences have not been shown in detail in order not to obscure the invention. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.
References in the specification to "one embodiment," "an embodiment," "an example embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but not every embodiment necessarily includes that feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such a feature, structure, or characteristic in connection with other embodiments, whether or not explicitly described.
In the following description and claims, the terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. "Coupled" is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other. "Connected" is used to indicate the establishment of communication between two or more elements that are coupled with each other.
Bracketed text and blocks with dashed borders (e.g., large dashes, small dashes, dot-dash, dots) are sometimes used herein to illustrate optional operations/components that add additional features to embodiments of the invention. However, such notation should not be taken to mean that these are the only options or optional operations/components, and/or that blocks with solid borders are not optional in certain embodiments of the invention.
Introduction
Memory capacity and performance requirements continue to increase with the growing number of processor cores and new usage models such as virtualization. In addition, memory power and cost have become significant components of the overall power and cost, respectively, of electronic systems.
Some embodiments of the invention address the above challenges by intelligently subdividing the performance requirement and the capacity requirement between memory technologies. The focus of this approach is on providing performance with a relatively small amount of relatively high-speed memory such as DRAM, while implementing the bulk of system memory using significantly cheaper and denser non-volatile random access memory (NVRAM). The embodiments of the invention described below define platform configurations that enable hierarchical memory subsystem organizations using NVRAM. The use of NVRAM in the memory hierarchy also enables new usages, such as expanded boot space and mass storage implementations, as described in detail below.
Fig. 1 illustrates a cache and system memory arrangement according to embodiments of the invention. Specifically, Fig. 1 shows a memory hierarchy that includes a set of internal processor caches 120, "near memory" 121 acting as a far memory cache, which may include both internal cache(s) 106 and external caches 107-109, and "far memory" 122. One particular type of memory that may be used for "far memory" in some embodiments of the invention is non-volatile random access memory ("NVRAM"). As such, an overview of NVRAM is provided below, followed by an overview of far memory and near memory.
A. Non-volatile random access memory ("NVRAM")
There are many possible technology choices for NVRAM, including PCM, Phase Change Memory and Switch (PCMS) (the latter being a more specific implementation of the former), byte-addressable persistent memory (BPRAM), universal memory, Ge2Sb2Te5, programmable metallization cell (PMC), resistive memory (RRAM), RESET (amorphous) cell, SET (crystalline) cell, PCME, Ovshinsky memory, ferroelectric memory (also known as polymer memory and poly(N-vinylcarbazole)), ferromagnetic memory (also known as spintronics, SPRAM (spin-transfer torque RAM), STRAM (spin tunneling RAM), magnetoresistive memory, magnetic memory, and magnetic random access memory (MRAM)), and semiconductor-oxide-nitride-oxide-semiconductor (SONOS, also known as dielectric memory).
For use in the memory hierarchy described in this application, NVRAM has the following characteristics:
(1) it maintains its content even if power is removed, similar to the flash memory used in solid-state disks (SSDs) and unlike SRAM and DRAM, which are volatile;
(2) lower power consumption when idle than volatile memories such as SRAM and DRAM;
(3) random access similar to SRAM and DRAM (also known as randomly addressable);
(4) rewritable and erasable at a lower level of granularity (e.g., byte level) than the flash memory found in SSDs (which can only be rewritten and erased a "block" at a time, minimally 64 Kbytes in size for NOR flash and 16 Kbytes for NAND flash);
(5) usable as system memory and allocated all or a portion of the system memory address space;
(6) capable of being coupled to the processor over a bus using a transactional protocol (a protocol that supports transaction identifiers (IDs) to distinguish different transactions so that those transactions can complete out of order) and allowing access at a level of granularity small enough to support operation of the NVRAM as system memory (e.g., a cache line size such as 64 or 128 bytes). For example, the bus may be a memory bus (e.g., a DDR bus such as DDR3, DDR4, etc.) over which a transactional protocol is run, as opposed to the non-transactional protocol that is normally used. As another example, the bus may be one over which a transactional protocol is normally run (a native transactional protocol), such as a PCI Express (PCIE) bus, a desktop management interface (DMI) bus, or any other type of bus utilizing a transactional protocol and a small enough transaction payload size (e.g., a cache line size such as 64 or 128 bytes); and
(7) one or more of the following:
A) faster write speed than non-volatile memory/storage technologies such as flash memory;
B) very high read speed (faster than flash memory and near or equivalent to DRAM read speeds);
C) directly writable (rather than requiring erasing (overwriting with 1s) before writing data, like the flash memory used in SSDs); and/or
D) orders of magnitude (e.g., 2 or 3) higher write endurance before failure (more than the boot ROM and flash memory used in SSDs).
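Characteristic (6) relies on transaction identifiers so that requests can complete out of order. A minimal sketch of that idea follows; the class `TransactionalBus` and its methods are hypothetical names chosen for illustration and do not correspond to any real bus specification or to an interface defined in this application.

```python
# Hypothetical sketch: matching out-of-order completions to requests
# by transaction ID. Not a model of DDR, PCIE, or any real protocol.

class TransactionalBus:
    def __init__(self):
        self._next_id = 0
        self._pending = {}  # transaction ID -> requested address

    def issue_read(self, address):
        """Issue a read and return the transaction ID that tags it."""
        tid = self._next_id
        self._next_id += 1
        self._pending[tid] = address
        return tid

    def complete(self, tid, data):
        """Retire a transaction in any order; the ID recovers its address."""
        address = self._pending.pop(tid)
        return address, data

    def outstanding(self):
        return len(self._pending)


if __name__ == "__main__":
    bus = TransactionalBus()
    slow = bus.issue_read(0x1000)   # e.g. a request served by slower memory
    fast = bus.issue_read(0x2000)   # e.g. a request served by faster memory
    # The second request finishes first; the ID keeps the two apart.
    print(bus.complete(fast, b"fast-data"))
    print(bus.complete(slow, b"slow-data"))
```

Because each completion carries its ID, a slow access need not block a later, faster one on the same bus.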
As mentioned above, in contrast to flash memory, which must be rewritten and erased a complete "block" at a time, the level of granularity at which NVRAM is accessed in any given implementation may depend on the particular memory controller and the particular memory bus or other type of bus to which the NVRAM is coupled. For example, in some implementations where NVRAM is used as system memory, the NVRAM may be accessed at the granularity of a cache line (e.g., a 64-byte or 128-byte cache line), notwithstanding its inherent ability to be accessed at byte granularity, because the cache line is the level at which the memory subsystem accesses memory. Thus, when NVRAM is deployed within a memory subsystem, it may be accessed at the same level of granularity as the DRAM (e.g., the "near memory") used in the same memory subsystem. Even so, the granularity of access to the NVRAM by the memory controller and memory bus or other type of bus is smaller than the block size used by flash memory and the access size of the I/O subsystem's controller and bus.
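To make the granularity contrast concrete, the following sketch counts how many bytes must be rewritten to update a small region under each access unit. The 64-byte cache line and the 16 Kbyte NAND and 64 Kbyte NOR minimum block sizes come from the text above; the function name `bytes_rewritten` is hypothetical, and the model deliberately ignores any controller-level optimization.

```python
# Illustrative arithmetic only: bytes covered by every access unit that an
# update of `length` bytes starting at `offset` touches.

CACHE_LINE = 64          # bytes; NVRAM accessed as system memory
NAND_BLOCK = 16 * 1024   # bytes; minimum NAND flash block per the text
NOR_BLOCK = 64 * 1024    # bytes; minimum NOR flash block per the text


def bytes_rewritten(offset: int, length: int, unit: int) -> int:
    """Return the total size of all `unit`-sized units the update touches."""
    if length <= 0:
        return 0
    first_unit = offset // unit
    last_unit = (offset + length - 1) // unit
    return (last_unit - first_unit + 1) * unit


if __name__ == "__main__":
    for name, unit in [("NVRAM cache line", CACHE_LINE),
                       ("NAND flash block", NAND_BLOCK),
                       ("NOR flash block", NOR_BLOCK)]:
        print(name, bytes_rewritten(offset=100, length=8, unit=unit))
```

Under these assumptions, an 8-byte update touches one 64-byte line in NVRAM but an entire block in block-erased flash.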
NVRAM may also incorporate wear-leveling algorithms to account for the fact that storage cells at the far memory level begin to wear out after a number of write accesses, especially where a significant number of writes may occur, as in a system memory implementation. Since high-cycle-count blocks are the most likely to wear out in this manner, wear leveling spreads writes across the far memory cells by swapping the addresses of high-cycle-count blocks with those of low-cycle-count blocks. Note that most address swapping is typically transparent to application programs because it is handled by hardware, by lower-level software (e.g., a low-level driver or operating system), or by a combination of the two.
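The address-swapping idea can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the algorithm of this application: the class `WearLeveler`, the `swap_threshold` parameter, and the choice to charge one extra write cycle to each block involved in a swap are all hypothetical.

```python
# Hypothetical wear-leveling sketch: remap a frequently written logical block
# onto the least-worn physical block once the wear gap reaches a threshold.

class WearLeveler:
    def __init__(self, num_blocks, swap_threshold=4):
        self.mapping = list(range(num_blocks))  # logical -> physical block
        self.cycles = [0] * num_blocks          # write cycles per physical block
        self.data = [None] * num_blocks         # contents per physical block
        self.swap_threshold = swap_threshold

    def read(self, logical):
        return self.data[self.mapping[logical]]

    def write(self, logical, value):
        phys = self.mapping[logical]
        self.data[phys] = value
        self.cycles[phys] += 1
        self._maybe_swap(logical)

    def _maybe_swap(self, hot_logical):
        hot_phys = self.mapping[hot_logical]
        cold_phys = min(range(len(self.cycles)), key=self.cycles.__getitem__)
        if self.cycles[hot_phys] - self.cycles[cold_phys] < self.swap_threshold:
            return
        cold_logical = self.mapping.index(cold_phys)
        # Exchange contents, then exchange the two address mappings so that
        # software still sees the same data at the same logical addresses.
        self.data[hot_phys], self.data[cold_phys] = (
            self.data[cold_phys], self.data[hot_phys])
        self.cycles[hot_phys] += 1   # assumed cost of moving the data
        self.cycles[cold_phys] += 1
        self.mapping[hot_logical] = cold_phys
        self.mapping[cold_logical] = hot_phys


if __name__ == "__main__":
    wl = WearLeveler(num_blocks=8)
    for i in range(100):
        wl.write(0, i)               # hammer a single logical block
    print("per-block cycles:", wl.cycles)
```

The caller keeps using logical block 0 throughout, which mirrors the transparency to application programs described above.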
B. Far memory
The far memory 122 of some embodiments of the invention is implemented with NVRAM, but is not necessarily limited to any particular memory technology. Far memory 122 is distinguishable from other instruction and data memory/storage technologies in terms of its characteristics and/or its application in the memory/storage hierarchy. For example, far memory 122 is different from:
1) static random access memory (SRAM), which may be used for the level 0 and level 1 internal processor caches 101a-b, 102a-b, 103a-b, and 104a-b dedicated to each of the processor cores 101-104, respectively, and for the lower level cache (LLC) 105 shared by the processor cores;
2) dynamic random access memory (DRAM) configured as a cache 106 internal to the processor 100 (e.g., on the same die as the processor 100) and/or configured as one or more caches 107-109 external to the processor (e.g., in the same package as the processor 100 or in a different package);
3) flash memory/magnetic disk/optical disc applied as mass storage (not shown); and
4) memory such as flash memory or other read-only memory (ROM) applied as firmware memory (which can refer to boot ROM, BIOS flash, and/or TPM flash) (not shown).
Far memory 122 may be used as instruction and data storage that is directly addressable by the processor 100 and is able to sufficiently keep pace with the processor 100, in contrast to flash memory/magnetic disk/optical disc applied as mass storage. Moreover, as discussed above and described in detail below, far memory 122 may be placed on a memory bus and may communicate directly with a memory controller that, in turn, communicates directly with the processor 100.
Far memory 122 may be combined with other instruction and data storage technologies (e.g., DRAM) to form hybrid memories (also known as co-locating PCM and DRAM; first level memory and second level memory; FLAM (flash and DRAM)). Note that at least some of the above technologies, including PCM/PCMS, may be used for mass storage instead of, or in addition to, system memory, and need not be randomly accessible, byte addressable, or directly addressable by the processor when applied in this manner.
For convenience of explanation, most of the remainder of the application refers to "NVRAM" or, more specifically, "PCM" or "PCMS" as the technology selection for the far memory 122. As such, the terms NVRAM, PCM, PCMS, and far memory may be used interchangeably in the following discussion. However, it should be realized, as discussed above, that different technologies may also be utilized for far memory. Also, NVRAM is not limited to use as far memory.
C. Near memory
"Near memory" 121 is an intermediate level of memory configured in front of far memory 122 that has lower read/write access latency relative to far memory and/or more symmetric read/write access latency (i.e., read times that are roughly equivalent to write times). In some embodiments, the near memory 121 has significantly lower write latency than the far memory 122 but similar (e.g., slightly lower or equal) read latency; for instance, the near memory 121 may be a volatile memory such as volatile random access memory (VRAM) and may comprise DRAM or other high-speed capacitor-based memory. Note, however, that the underlying principles of the invention are not limited to these specific memory types. Additionally, the near memory 121 may have a relatively lower density and/or may be more expensive to manufacture than the far memory 122.
In one embodiment, near memory 121 is configured between the far memory 122 and the internal processor caches 120. In some of the embodiments described below, near memory 121 is configured as one or more memory-side caches (MSCs) 107-109 to mask the performance and/or usage limitations of the far memory, including, for example, read/write latency limitations and memory degradation limitations. In these implementations, the combination of the MSCs 107-109 and far memory 122 operates at a performance level that approximates, is equivalent to, or exceeds that of a system that uses only DRAM as system memory. As discussed in detail below, although shown as a "cache" in Fig. 1, the near memory 121 may include modes in which it performs other roles, either in addition to, or in lieu of, performing the role of a cache.
Near memory 121 can be located on the processor die (as cache(s) 106) and/or external to the processor die (as caches 107-109) (e.g., on a separate die located on the CPU package, or located outside the CPU package with a high-bandwidth link to the CPU package, for example on a memory dual in-line memory module (DIMM), a riser/mezzanine, or a computer motherboard). The near memory 121 may be communicatively coupled with the processor 100 using a single or multiple high-bandwidth links, such as DDR or other transactional high-bandwidth links (as described in detail below).
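How a memory-side cache can mask far memory write latency and wear can be illustrated with a small simulation. The sketch below is a direct-mapped, write-back cache in front of a slower backing store. The class `MemorySideCache`, the direct-mapped organization, and the latency values (arbitrary units, not measurements) are all assumptions made for illustration and are not taken from this application.

```python
# Hypothetical sketch of near memory acting as a write-back cache (MSC)
# in front of far memory. Latencies are arbitrary illustrative units.

class MemorySideCache:
    def __init__(self, num_lines, near_latency=1,
                 far_read_latency=4, far_write_latency=16):
        self.num_lines = num_lines
        self.near_latency = near_latency
        self.far_read_latency = far_read_latency
        self.far_write_latency = far_write_latency
        self.lines = {}       # index -> [tag, value, dirty]
        self.far = {}         # line address -> value held in far memory
        self.far_writes = 0   # writes that actually reached far memory
        self.elapsed = 0      # accumulated latency

    def _lookup(self, line_addr):
        index, tag = line_addr % self.num_lines, line_addr // self.num_lines
        entry = self.lines.get(index)
        if entry is not None and entry[0] == tag:
            return entry                          # hit in near memory
        if entry is not None and entry[2]:
            # Miss with a dirty victim: write it back to far memory first.
            victim_addr = entry[0] * self.num_lines + index
            self.far[victim_addr] = entry[1]
            self.far_writes += 1
            self.elapsed += self.far_write_latency
        entry = [tag, self.far.get(line_addr, 0), False]
        self.elapsed += self.far_read_latency     # fill from far memory
        self.lines[index] = entry
        return entry

    def read(self, line_addr):
        entry = self._lookup(line_addr)
        self.elapsed += self.near_latency
        return entry[1]

    def write(self, line_addr, value):
        entry = self._lookup(line_addr)
        self.elapsed += self.near_latency
        entry[1] = value
        entry[2] = True                           # defer the far memory write


if __name__ == "__main__":
    msc = MemorySideCache(num_lines=4)
    for i in range(10):
        msc.write(0, i)       # repeated writes are absorbed by near memory
    print("far memory writes so far:", msc.far_writes)
```

In this model, repeated writes to a hot line reach far memory only when the line is evicted, which is one way both the latency and the degradation limitations mentioned above could be masked.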
Example system memory allocation scheme
Fig. 1 illustration in embodiments of the present invention how with respect to system physical address (SPA) space 116-119 configuration various levels of cache 101-109.As mentioned, this embodiment comprises the processor 100 with one or more core 101-104, and wherein each core has its special-purpose upper-level cache (L0) 101a-104a and (L1) high-speed cache 101b-104b of intermediate high-speed cache (MLC).Processor 100 also comprises shared LLC 105.The fine understanding of operation of these various levels of cache, and will not be described in detail at this.
The caches 107-109 illustrated in Fig. 1 may be dedicated to a particular system memory address range or to a set of non-contiguous address ranges. For example, cache 107 is dedicated to acting as an MSC for system memory address range #1 116, and caches 108 and 109 are dedicated to acting as MSCs for non-overlapping portions of system memory address ranges #2 117 and #3 118. The latter implementation may be used for systems in which the SPA space used by processor 100 is interleaved into an address space used by the caches 107-109 (e.g., when configured as MSCs). In some embodiments, this latter address space is referred to as a memory channel address (MCA) space. In one embodiment, the internal caches 101a-106a perform caching operations for the entire SPA space.
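By way of illustration only, and not limitation, the interleaving of the SPA space into a memory channel address space might be sketched as follows. The 64-byte interleave granularity, the two-channel count, and the function names `spa_to_mca` and `mca_to_spa` are assumptions made for this sketch; the description does not specify an interleaving scheme.

```python
LINE_BYTES = 64     # assumed interleave granularity (one cache line)
NUM_CHANNELS = 2    # assumed number of MSCs sharing a range (e.g. caches 108 and 109)

def spa_to_mca(spa: int) -> tuple:
    """Split a system physical address into (channel, memory channel address)."""
    line, offset = divmod(spa, LINE_BYTES)
    channel = line % NUM_CHANNELS            # consecutive lines alternate channels
    mca = (line // NUM_CHANNELS) * LINE_BYTES + offset
    return channel, mca

def mca_to_spa(channel: int, mca: int) -> int:
    """Inverse mapping, showing that the interleave loses no information."""
    line, offset = divmod(mca, LINE_BYTES)
    return (line * NUM_CHANNELS + channel) * LINE_BYTES + offset
```

Under these assumptions, adjacent cache lines in the SPA space land on different MSCs, so each MSC sees a dense address space of its own.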
System memory, as used herein, is memory that is visible to and/or directly addressable by software executed on processor 100. The cache memories 101a-109, by contrast, may operate transparently to software in the sense that they do not form a directly addressable portion of the system address space, although the cores may also support execution of instructions that allow software to exert some control (configuration, policies, hints, etc.) over some or all of the caches. The subdivision of system memory into regions 116-119 may be performed manually as part of a system configuration process (e.g., by a system designer) and/or may be performed automatically by software.
In one embodiment, the system memory regions 116-119 are implemented using far memory (e.g., PCM) and, in some embodiments, near memory configured as system memory. System memory address range #4 represents an address range implemented using a higher-speed memory such as DRAM, which may be near memory configured in a system memory mode (as opposed to a caching mode).
Fig. 2 illustrates a memory/storage hierarchy 140 and different configurable modes of operation for near memory 144 and NVRAM according to embodiments of the invention. The memory/storage hierarchy 140 has multiple levels, including: (1) a cache level 150, which may include processor caches 150A (e.g., caches 101A-105 in Fig. 1) and optionally near memory as a cache for far memory 150B (in certain modes of operation discussed herein); (2) a system memory level 151, which may include far memory 151B (e.g., NVRAM such as PCM) when near memory is present (or just NVRAM as system memory 174 when near memory is not present), and optionally near memory operating as system memory 151A (in certain modes of operation described herein); (3) a mass storage level 152, which may include flash/magnetic/optical mass storage 152B and/or NVRAM mass storage 152A (e.g., a portion of NVRAM 142); and (4) a firmware memory level 153, which may include BIOS flash 170 and/or BIOS NVRAM 172 and optionally trusted platform module (TPM) NVRAM 173.
As indicated, near memory 144 may be implemented to operate in a variety of different modes, including: a first mode in which it operates as a cache for far memory (near memory as cache for FM 150B); a second mode in which it operates as system memory 151A and occupies a portion of the SPA space (sometimes referred to as near memory "direct access" mode); and one or more additional modes of operation, such as scratchpad memory 192 or write buffer 193. In some embodiments of the invention, the near memory is partitionable, with each partition concurrently operating in a different one of the supported modes. Different embodiments may support configuration of the partitions (e.g., sizes, modes) by hardware (e.g., fuses, pins), firmware, and/or software (e.g., through a set of programmable range registers within MSC controller 124, in which different binary codes may be stored to identify each mode and partition).
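As a non-limiting sketch, a set of programmable range registers that assigns a mode to each near memory partition might be modeled as follows. The binary codes, the partition sizes, and the names `NearMemMode`, `RangeRegister`, and `mode_for` are hypothetical and are not taken from the description.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional

class NearMemMode(IntEnum):
    # Binary codes are invented for illustration only.
    FM_CACHE = 0b00        # cache for far memory (150B)
    SYSTEM_MEMORY = 0b01   # direct access mode (151A)
    SCRATCHPAD = 0b10      # scratchpad memory (192)
    WRITE_BUFFER = 0b11    # write buffer (193)

@dataclass(frozen=True)
class RangeRegister:
    base: int              # first near memory address of the partition
    size: int              # partition size in bytes
    mode: NearMemMode

def mode_for(registers, addr: int) -> Optional[NearMemMode]:
    """Return the mode of the partition containing addr, or None if unmapped."""
    for reg in registers:
        if reg.base <= addr < reg.base + reg.size:
            return reg.mode
    return None

GIB = 1 << 30
registers = [
    RangeRegister(0 * GIB, 2 * GIB, NearMemMode.FM_CACHE),
    RangeRegister(2 * GIB, 1 * GIB, NearMemMode.SYSTEM_MEMORY),
]
```

Here two partitions of the same near memory operate concurrently in different modes, which is the situation the paragraph above describes.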
System address space A 190 in Fig. 2 is used to illustrate operation when near memory is configured as an MSC for far memory 150B. In this configuration, system address space A 190 represents the entire system address space (and system address space B 191 does not exist). Alternatively, system address space B 191 is used to show an implementation in which all or a portion of near memory is assigned a portion of the system address space. In this embodiment, system address space B 191 represents the range of the system address space assigned to near memory 151A, and system address space A 190 represents the range of the system address space assigned to NVRAM 174.
In addition, when acting as a cache for far memory 150B, near memory 144 may operate in various sub-modes under the control of MSC controller 124. In each of these modes, the near memory address space (NMA) is transparent to software in the sense that near memory does not form a directly addressable portion of the system address space. These modes include, but are not limited to, the following:
(1) Write-back caching mode: In this mode, all or portions of the near memory acting as FM cache 150B are used as a cache for NVRAM far memory (FM) 151B. While in write-back mode, every write operation is directed initially to the near memory as cache for FM 150B (assuming that the cache line to which the write is directed is present in the cache). A corresponding write operation is performed to update NVRAM FM 151B only when the cache line within the near memory as cache for FM 150B is to be replaced by another cache line (in contrast to the write-through mode described below, in which each write operation is immediately propagated to NVRAM FM 151B).
(2) Near memory bypass mode: In this mode, all reads and writes bypass the NM acting as FM cache 150B and go directly to NVRAM FM 151B. Such a mode may be used, for example, when an application is not cache-friendly or requires data to be committed to persistence at the granularity of a cache line. In one embodiment, the caching performed by the processor caches 150A and by the NM acting as FM cache 150B operate independently of one another. Consequently, data may be cached in the NM acting as FM cache 150B that is not cached in the processor caches 150A (and that, in some cases, may not be permitted to be cached in the processor caches 150A), and vice versa. Thus, certain data that may be designated as "uncacheable" in the processor caches may be cached within the NM acting as FM cache 150B.
(3) Near memory read-cache write-bypass mode: This is a variation of the above mode in which read caching of persistent data from NVRAM FM 151B is allowed (i.e., the persistent data is cached in the near memory as cache for far memory 150B for read-only operations). This is useful when most of the persistent data is "read-only" and the application usage is cache-friendly.
(4) Near memory read-cache write-through mode: This is a variation of the near memory read-cache write-bypass mode in which, in addition to read caching, write hits are also cached. Every write to the near memory as cache for FM 150B causes a write to FM 151B. Thus, due to the write-through nature of the cache, cache-line persistence is still guaranteed.
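The four sub-modes listed above can be summarized, purely as an illustrative sketch, by where a write lands immediately and whether reads may populate the near memory cache. The names `SubMode`, `write_targets`, and `reads_fill_cache` are hypothetical. The handling of a write miss in write-back mode (allocation into near memory) is an assumption, and invalidation of a stale cached copy in write-bypass mode is not described in the text and is omitted.

```python
from enum import Enum

class SubMode(Enum):
    WRITE_BACK = 1
    BYPASS = 2
    READ_CACHE_WRITE_BYPASS = 3
    READ_CACHE_WRITE_THROUGH = 4

def write_targets(mode: SubMode, hit: bool) -> set:
    """Return the levels a write updates immediately: 'NM', 'FM', or both."""
    if mode is SubMode.WRITE_BACK:
        # Far memory is updated later, when the line is replaced.
        # Allocating on a miss is an assumption of this sketch.
        return {"NM"}
    if mode in (SubMode.BYPASS, SubMode.READ_CACHE_WRITE_BYPASS):
        return {"FM"}                 # writes go straight to far memory
    # Write-through: only write hits are cached, and every cached
    # write is also propagated to far memory.
    return {"NM", "FM"} if hit else {"FM"}

def reads_fill_cache(mode: SubMode) -> bool:
    """True if a read may place data in the near memory cache."""
    return mode is not SubMode.BYPASS
```

In this model, every mode except write-back places each write in far memory immediately, which is why cache-line persistence holds in modes (2) through (4).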
When operating in near memory direct access mode, all or portions of the near memory as system memory 151A are directly visible to software and form part of the SPA space. Such memory may be completely under software control. Such a scheme may create a non-uniform memory address (NUMA) memory domain for software, in which it obtains higher performance from near memory 144 relative to NVRAM system memory 174. By way of example, and not limitation, such a usage may be employed for certain high-performance computing (HPC) and graphics applications that require very fast access to certain data structures.
In an alternative embodiment, the near memory direct access mode is implemented by "pinning" certain cache lines in near memory (i.e., cache lines whose data is also concurrently stored in NVRAM 142). Such pinning may be done effectively in larger, multi-way, set-associative caches.
Fig. 2 also illustrates that a portion of NVRAM 142 may be used as firmware memory. For example, the BIOS NVRAM 172 portion may be used to store BIOS images (instead of, or in addition to, storing the BIOS information in BIOS flash 170). The BIOS NVRAM portion 172 may be a portion of the SPA space and is directly addressable by software executed on the processor cores 101-104, whereas BIOS flash 170 is addressable through I/O subsystem 115. As another example, a trusted platform module (TPM) NVRAM 173 portion may be used to protect sensitive system information (e.g., encryption keys).
Thus, as indicated, NVRAM 142 may be implemented to operate in a variety of different modes, including: as far memory 151B (e.g., when near memory 144 is present/operating, whether the near memory acts as a cache for the FM via MSC control 124 or not (in which case it is accessed directly after the cache(s) 101A-105 and without MSC control 124)); as just NVRAM system memory 174 (not as far memory, because no near memory is present/operating, and accessed without MSC control 124); as NVRAM mass storage 152A; as BIOS NVRAM 172; and as TPM NVRAM 173. While different embodiments may specify the NVRAM modes in different ways, Fig. 3 describes the use of a decode table 333.
Fig. 3 illustrates an exemplary computer system 300 on which embodiments of the invention may be implemented. Computer system 300 includes a processor 310 and a memory/storage subsystem 380 with an NVRAM 142 used for system memory, mass storage, and optionally firmware memory. In one embodiment, NVRAM 142 comprises the entire system memory and storage hierarchy used by computer system 300 for storing data, instructions, states, and other persistent and non-persistent information. As previously discussed, NVRAM 142 can be configured to implement the roles in a typical memory and storage hierarchy of system memory, mass storage, firmware memory, TPM memory, and the like. In the embodiment of Fig. 3, NVRAM 142 is partitioned into FM 151B, NVRAM mass storage 152A, BIOS NVRAM 172, and TPM NVRAM 173. Storage hierarchies with different roles are also contemplated, and the application of NVRAM 142 is not limited to the roles described above.
By way of example, operation while the near memory as cache for FM 150B is in write-back caching mode is described. In one embodiment, while the near memory as cache for FM 150B is in the write-back caching mode mentioned above, a read operation first arrives at MSC controller 124, which performs a look-up to determine whether the requested data is present in the near memory acting as cache for FM 150B (e.g., utilizing a tag cache 342). If present, it returns the data to the requesting CPU, core 101-104, or I/O device through I/O subsystem 115. If the data is not present, MSC controller 124 sends the request, along with the system memory address, to NVRAM controller 332. NVRAM controller 332 uses decode table 333 to translate the system memory address to an NVRAM physical device address (PDA) and directs the read operation to this region of far memory 151B. In one embodiment, decode table 333 includes an address indirection table (AIT) component, which NVRAM controller 332 uses to translate between system memory addresses and NVRAM PDAs. In one embodiment, the AIT is updated as part of the wear-leveling algorithm implemented to distribute memory access operations and thereby reduce wear on NVRAM FM 151B. Alternatively, the AIT may be a separate table stored within NVRAM controller 332.
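A minimal sketch of an address indirection table follows, assuming translation at the granularity of a line number and a wear-leveling step that simply exchanges the physical locations behind two system lines. The class name `AddressIndirectionTable`, its methods, and the swap policy are hypothetical; the description does not define the AIT layout or the wear-leveling algorithm.

```python
class AddressIndirectionTable:
    """Maps system memory line numbers to NVRAM physical device line numbers."""

    def __init__(self, num_lines: int):
        self.to_pda = list(range(num_lines))   # identity mapping at start

    def translate(self, sys_line: int) -> int:
        """Return the physical device line currently backing sys_line."""
        return self.to_pda[sys_line]

    def swap(self, a: int, b: int) -> None:
        """Exchange the physical lines behind two system lines.

        A real controller would also move the stored data; omitted here.
        """
        self.to_pda[a], self.to_pda[b] = self.to_pda[b], self.to_pda[a]

ait = AddressIndirectionTable(num_lines=8)
ait.swap(1, 6)   # e.g. move a frequently written line onto a less worn location
```

Because software continues to use the same system memory address after the swap, the remapping remains invisible above the NVRAM controller.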
Upon receiving the requested data from NVRAM FM 151B, NVRAM controller 332 returns the requested data to MSC controller 124, which stores the data in the MSC near memory acting as FM cache 150B and also sends the data to the requesting processor core 101-104, or to the requesting I/O device through I/O subsystem 115. Subsequent requests for this data may be serviced directly from the near memory acting as FM cache 150B until it is replaced by some other NVRAM FM data.
As mentioned, in one embodiment, a memory write operation also goes first to MSC controller 124, which writes it into the MSC near memory acting as FM cache 150B. In write-back caching mode, the data may not be sent directly to NVRAM FM 151B when a write operation is received. For example, the data may be sent to NVRAM FM 151B only when the location in the MSC near memory acting as FM cache 150B in which the data is stored must be reused to store data for a different system memory address. When this happens, MSC controller 124 notices that the data is not current in NVRAM FM 151B, and therefore retrieves it from the near memory acting as FM cache 150B and sends it to NVRAM controller 332. NVRAM controller 332 looks up the PDA for the system memory address and then writes the data to NVRAM FM 151B.
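The read and write flows just described can be sketched, under simplifying assumptions, as a small direct-mapped cache model in which far memory is represented by a dictionary. The class name `WriteBackMSC`, the direct-mapped organization, and the per-line dirty flag are assumptions of this sketch; the text does not state how the MSC is organized or how it tracks stale far memory data.

```python
class WriteBackMSC:
    """Direct-mapped near memory cache in front of a far memory dict."""

    def __init__(self, num_sets: int, far_memory: dict):
        self.num_sets = num_sets
        self.fm = far_memory
        self.lines = {}   # set index -> (address, data, dirty)

    def _evict(self, index: int) -> None:
        entry = self.lines.pop(index, None)
        if entry and entry[2]:               # dirty: far memory copy is stale
            self.fm[entry[0]] = entry[1]     # write back before the slot is reused

    def read(self, addr: int):
        index = addr % self.num_sets
        entry = self.lines.get(index)
        if entry and entry[0] == addr:       # hit: served from near memory
            return entry[1]
        self._evict(index)
        data = self.fm.get(addr)             # miss: fetch from far memory
        self.lines[index] = (addr, data, False)
        return data

    def write(self, addr: int, data) -> None:
        index = addr % self.num_sets
        entry = self.lines.get(index)
        if not (entry and entry[0] == addr):
            self._evict(index)               # slot reused for a different address
        self.lines[index] = (addr, data, True)
```

In this model a write leaves far memory untouched until a later access to a conflicting address forces the dirty line out, mirroring the deferred update described above.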
In Fig. 3, NVRAM controller 332 is shown connected to FM 151B, NVRAM mass storage 152A, and BIOS NVRAM 172 using three separate lines. This does not necessarily mean, however, that there are three separate physical buses or communication channels connecting NVRAM controller 332 to these portions of NVRAM 142. Rather, in some embodiments, a common memory bus or other type of bus (such as those described below with respect to Figs. 4A-M) is used to communicatively couple NVRAM controller 332 to FM 151B, NVRAM mass storage 152A, and BIOS NVRAM 172. For example, in one embodiment, the three lines in Fig. 3 represent a bus, such as a memory bus (e.g., a DDR3, DDR4, etc. bus), over which NVRAM controller 332 implements a transactional protocol to communicate with NVRAM 142. NVRAM controller 332 may also communicate with NVRAM 142 over a bus supporting a native transactional protocol, such as a PCI Express bus, a desktop management interface (DMI) bus, or any other type of bus utilizing a transactional protocol and a small enough transaction payload size (e.g., a cache line size such as 64 or 128 bytes).
In one embodiment, computer system 300 includes an integrated memory controller (IMC) 331, which performs the central memory access control for processor 310 and is coupled to: 1) a memory-side cache (MSC) controller 124, to control access to the near memory (NM) acting as far memory cache 150B; and 2) an NVRAM controller 332, to control access to NVRAM 142. Although illustrated as separate units in Fig. 3, MSC controller 124 and NVRAM controller 332 may logically form part of IMC 331.
In the illustrated embodiment, MSC controller 124 includes a set of range registers 336 that specify the mode of operation in use for the NM acting as far memory cache 150B (e.g., the write-back caching mode, near memory bypass mode, etc. described above). In the illustrated embodiment, DRAM 144 is used as the memory technology for the NM acting as cache for far memory 150B. In response to a memory access request, MSC controller 124 may determine (depending on the mode of operation specified in the range registers 336) whether the request can be serviced from the NM acting as cache for FM 150B or whether the request must be sent to NVRAM controller 332, which may then service the request from the far memory (FM) portion 151B of NVRAM 142.
In an embodiment in which NVRAM 142 is implemented with PCMS, NVRAM controller 332 is a PCMS controller that performs access with protocols consistent with the PCMS technology. As previously discussed, PCMS memory is inherently capable of being accessed at the granularity of a byte. Nonetheless, NVRAM controller 332 may access a PCMS-based far memory 151B at a lower level of granularity, such as a cache line (e.g., a 64-bit or 128-bit cache line), or at any other level of granularity consistent with the memory subsystem. The underlying principles of the invention are not limited to any particular level of granularity for accessing a PCMS-based far memory 151B. In general, however, when PCMS-based far memory 151B is used to form part of the system address space, the level of granularity will be higher than that traditionally used for other non-volatile storage technologies such as flash, which can perform rewrite and erase operations only at the level of a "block" (minimally 64 Kbytes in size for NOR flash and 16 Kbytes for NAND flash).
In the illustrated embodiment, NVRAM controller 332 can read configuration data from decode table 333 to establish the previously described modes, sizes, etc. for NVRAM 142, or alternatively can rely on the decoding results passed from IMC 331 and I/O subsystem 315. For example, at either manufacturing time or in the field, computer system 300 can program decode table 333 to mark different regions of NVRAM 142 as system memory, mass storage exposed via SATA interfaces, mass storage exposed via USB Bulk-Only Transport (BOT) interfaces, or encrypted storage that supports TPM storage, among others. Access is steered to the different partitions of NVRAM device 142 by means of decode logic. For example, in one embodiment, the address range of each partition is defined in decode table 333. In one embodiment, when IMC 331 receives an access request, the target address of the request is decoded to reveal whether the request is directed toward memory, NVRAM mass storage, or I/O. If it is a memory request, IMC 331 and/or MSC controller 124 further determines from the target address whether the request is directed to the NM as cache for FM 150B or to FM 151B. For FM 151B access, the request is forwarded to NVRAM controller 332. If the request is directed to I/O (e.g., non-storage and storage I/O devices), IMC 331 passes the request to I/O subsystem 115. I/O subsystem 115 further decodes the address to determine whether it points to NVRAM mass storage 152A, BIOS NVRAM 172, or other non-storage or storage I/O devices. If the address points to NVRAM mass storage 152A or BIOS NVRAM 172, I/O subsystem 115 forwards the request to NVRAM controller 332. If the address points to TPM NVRAM 173, I/O subsystem 115 passes the request to TPM 334 to perform secured access.
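Purely for illustration, a decode table that defines the address range of each NVRAM partition, and the routing decision derived from it, might look like the following. The address ranges are invented, and the names `Region`, `DECODE_TABLE`, and `route` are hypothetical; only the association between partitions and the components that service them follows the description above.

```python
from typing import NamedTuple

class Region(NamedTuple):
    base: int       # inclusive lower bound
    limit: int      # exclusive upper bound
    target: str     # NVRAM partition
    handler: str    # component that services the request

# Address ranges are invented; only the routing follows the text.
DECODE_TABLE = [
    Region(0x0000_0000, 0x8000_0000, "FM 151B", "NVRAM controller 332"),
    Region(0x8000_0000, 0xC000_0000, "NVRAM mass storage 152A", "NVRAM controller 332"),
    Region(0xC000_0000, 0xC100_0000, "BIOS NVRAM 172", "NVRAM controller 332"),
    Region(0xC100_0000, 0xC110_0000, "TPM NVRAM 173", "TPM 334"),
]

def route(addr: int) -> Region:
    """Return the decode table entry whose range contains addr."""
    for region in DECODE_TABLE:
        if region.base <= addr < region.limit:
            return region
    raise ValueError(f"address {addr:#x} is not mapped in the decode table")
```

Reprogramming such a table, whether at manufacturing time or in the field, would change the role of a region without changing the logic that consults it.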
In one embodiment, each request forwarded to NVRAM controller 332 is accompanied by an attribute (also known as a "transaction type") indicating the type of access. In one embodiment, NVRAM controller 332 may emulate the access protocol for the requested access type, such that the rest of the platform remains unaware of the multiple roles performed by NVRAM 142 in the memory and storage hierarchy. In alternative embodiments, NVRAM controller 332 may perform memory access to NVRAM 142 regardless of the transaction type. It is understood that the decode path can differ from what is described above. For example, IMC 331 may decode the target address of an access request and determine whether it is directed to NVRAM 142. If it is directed to NVRAM 142, IMC 331 generates an attribute according to decode table 333. Based on the attribute, IMC 331 then forwards the request to the appropriate downstream logic (e.g., NVRAM controller 332 and I/O subsystem 315) to perform the requested data access. In yet another embodiment, NVRAM controller 332 may decode the target address if the corresponding attribute is not passed on from the upstream logic (e.g., IMC 331 and I/O subsystem 315). Other decode paths may also be implemented.
The presence of a new memory architecture such as that described herein provides a wealth of new possibilities. Although discussed at greater length further below, some of these possibilities are briefly highlighted immediately below.
According to one possible implementation, NVRAM 142 acts as a total replacement or supplement for traditional DRAM technology in system memory. In one embodiment, NVRAM 142 represents the introduction of a second-level system memory (e.g., the system memory may be viewed as having a first-level system memory comprising near memory as cache 150B (part of DRAM device 340) and a second-level system memory comprising far memory (FM) 151B (part of NVRAM 142)).
According to some embodiments, NVRAM 142 acts as a total replacement or supplement for the flash/magnetic/optical mass storage 152B. As previously described, in some embodiments, even though NVRAM 152A is capable of byte-level addressability, NVRAM controller 332 may still access NVRAM mass storage 152A in blocks of multiple bytes, depending on the implementation (e.g., 64 Kbytes, 128 Kbytes, etc.). The specific manner in which NVRAM controller 332 accesses data from NVRAM mass storage 152A may be transparent to software executed by processor 310. For example, even though NVRAM mass storage 152A may be accessed differently from flash/magnetic/optical mass storage 152B, the operating system may still view NVRAM mass storage 152A as a standard mass storage device (e.g., a serial ATA hard drive or other standard form of mass storage device).
In an embodiment in which NVRAM mass storage 152A acts as a total replacement for the flash/magnetic/optical mass storage 152B, it is not necessary to use storage drivers for block-addressable storage access. Removing storage driver overhead from storage access can increase access speed and save power. In alternative embodiments in which it is desired that NVRAM mass storage 152A appear to the OS and/or applications as block-accessible and indistinguishable from flash/magnetic/optical mass storage 152B, emulated storage drivers can be used to expose block-accessible interfaces (e.g., Universal Serial Bus (USB) Bulk-Only Transfer (BOT) 1.0; Serial Advanced Technology Attachment (SATA) 3.0; and the like) to the software for accessing NVRAM mass storage 152A.
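As a hedged sketch of the emulated-driver alternative, a byte-addressable region can be presented to software as fixed-size logical blocks. The class name `EmulatedBlockDevice`, the 512-byte block size, and the use of a `bytearray` as a stand-in for NVRAM mass storage 152A are assumptions of this sketch and do not represent any particular driver interface.

```python
class EmulatedBlockDevice:
    """Presents a byte-addressable buffer as fixed-size logical blocks."""

    def __init__(self, backing: bytearray, block_size: int = 512):
        if len(backing) % block_size:
            raise ValueError("backing size must be a multiple of block_size")
        self.backing = backing          # stand-in for the NVRAM region
        self.block_size = block_size

    def _offset(self, lba: int) -> int:
        start = lba * self.block_size
        if lba < 0 or start >= len(self.backing):
            raise IndexError("logical block address out of range")
        return start

    def read_block(self, lba: int) -> bytes:
        start = self._offset(lba)
        return bytes(self.backing[start:start + self.block_size])

    def write_block(self, lba: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("writes must cover exactly one block")
        start = self._offset(lba)
        self.backing[start:start + self.block_size] = data
```

Software above this layer sees only logical block addresses, while the underlying medium remains byte-addressable and could be accessed directly in embodiments that dispense with the driver.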
In one embodiment, NVRAM 142 acts as a total replacement or supplement for firmware memory such as BIOS flash 362 and TPM flash 372 (illustrated with dotted lines in Fig. 3 to indicate that they are optional). For example, NVRAM 142 may include a BIOS NVRAM 172 portion to supplement or replace BIOS flash 362, and may include a TPM NVRAM 173 portion to supplement or replace TPM flash 372. Firmware memory can also store system persistent states used by TPM 334 to protect sensitive system information (e.g., encryption keys). In one embodiment, the use of NVRAM 142 for firmware memory removes the need for third-party flash parts to store code and data that are critical to system operation.
Continuing with the discussion of the system of Fig. 3, in some embodiments the architecture of computer system 100 may include multiple processors, although a single processor 310 is illustrated in Fig. 3 for simplicity. Processor 310 may be any type of data processor, including a general-purpose or special-purpose central processing unit (CPU), an application-specific integrated circuit (ASIC), or a digital signal processor (DSP). For example, processor 310 may be a general-purpose processor, such as a Core i3, i5, i7, 2 Duo and Quad, Xeon, or Itanium processor, all of which are available from Intel Corporation of Santa Clara, California. Alternatively, processor 310 may be from another company, such as ARM Holdings, Ltd. of Sunnyvale, California, or MIPS Technologies of Sunnyvale, California. Processor 310 may be a special-purpose processor, such as, for example, a network or communication processor, compression engine, graphics processor, co-processor, embedded processor, or the like. Processor 310 may be implemented on one or more chips included within one or more packages. Processor 310 may be a part of, and/or may be implemented on, one or more substrates using any of a number of process technologies, such as, for example, BiCMOS, CMOS, or NMOS. In the embodiment shown in Fig. 3, processor 310 has a system-on-a-chip (SOC) configuration.
In one embodiment, processor 310 includes an integrated graphics unit 311, which includes logic for executing graphics commands such as 3D or 2D graphics commands. While embodiments of the invention are not limited to any particular integrated graphics unit 311, in one embodiment graphics unit 311 is capable of executing industry-standard graphics commands, such as those specified by the OpenGL and/or DirectX application programming interfaces (APIs) (e.g., OpenGL 4.1 and DirectX 11).
Processor 310 may also include one or more cores 101-104, although, again for the sake of clarity, a single core is illustrated in Fig. 3. In many embodiments, the cores 101-104 include internal functional blocks such as one or more execution units, retirement units, a set of general-purpose and special-purpose registers, etc. If the cores are multi-threaded or hyper-threaded, each hardware thread may be considered a "logical" core as well. The cores 101-104 may be homogeneous or heterogeneous in terms of architecture and/or instruction set. For example, some of the cores may be in-order while others are out-of-order. As another example, two or more of the cores may be capable of executing the same instruction set, while others may be capable of executing only a subset of that instruction set or a different instruction set.
Processor 310 may also include one or more caches, such as cache 313, which may be implemented as SRAM and/or DRAM. In many embodiments that are not shown, additional caches other than cache 313 are implemented, so that multiple levels of cache exist between the execution units in the cores 101-104 and memory devices 150B and 151B. For example, the set of shared cache units may include an upper-level cache, such as a level 1 (L1) cache; mid-level caches, such as level 2 (L2), level 3 (L3), level 4 (L4), or other levels of cache; a last-level cache (LLC); and/or different combinations thereof. In different embodiments, cache 313 may be apportioned in different ways and may be one of many different sizes. For example, cache 313 may be an 8 megabyte (MB) cache, a 16 MB cache, etc. Additionally, in different embodiments the cache may be a direct-mapped cache, a fully associative cache, a multi-way set-associative cache, or a cache with another type of mapping. In other embodiments that include multiple cores, cache 313 may include one large portion shared among all cores or may be divided into several separately functional slices (e.g., one slice for each core). Cache 313 may also include one portion shared among all cores and several other portions that are separate functional slices per core.
Processor 310 may also include a home agent 314, which includes those components that coordinate and operate the cores 101-104. Home agent unit 314 may include, for example, a power control unit (PCU) and a display unit. The PCU may be, or may include, the logic and components needed to regulate the power state of the cores 101-104 and the integrated graphics unit 311. The display unit drives one or more externally connected displays.
As mentioned, in some embodiments processor 310 includes an integrated memory controller (IMC) 331, a near memory cache (MSC) controller, and an NVRAM controller 332, all of which can be on the same chip as processor 310 or on a separate chip and/or package connected to processor 310. DRAM device 144 may be on the same chip as, or a different chip from, IMC 331 and MSC controller 124. Thus, one chip may have processor 310 and DRAM device 144; one chip may have processor 310 and another the DRAM device 144 (and these chips may be in the same or different packages); one chip may have the cores 101-104 and another the IMC 331, MSC controller 124, and DRAM 144 (these chips may be in the same or different packages); one chip may have the cores 101-104, another the IMC 331 and MSC controller 124, and another the DRAM 144 (these chips may be in the same or different packages); and so on.
In some embodiments, processor 310 includes an I/O subsystem 115 coupled to IMC 331. I/O subsystem 115 enables communication between processor 310 and the following serial or parallel I/O devices: one or more networks 336 (such as a local area network, a wide area network, or the Internet), storage I/O devices (such as flash/magnetic/optical mass storage 152B, BIOS flash 362, and TPM flash 372), and one or more non-storage I/O devices 337 (such as a display, keyboard, speaker, and the like). I/O subsystem 115 may include a platform controller hub (PCH) (not shown) that further includes several I/O adapters 338 and other I/O circuitry to provide access to the storage and non-storage I/O devices and networks. To accomplish this, I/O subsystem 115 may have at least one integrated I/O adapter 338 for each I/O protocol utilized. I/O subsystem 115 can be on the same chip as processor 310, or on a separate chip and/or package connected to processor 310.
I/O adapters 338 translate a host communication protocol utilized within processor 310 into a protocol compatible with particular I/O devices. For flash/magnetic/optical mass storage 152B, some of the protocols that I/O adapters 338 may translate include Peripheral Component Interconnect (PCI)-Express (PCI-E) 3.0; USB 3.0; SATA 3.0; Small Computer System Interface (SCSI) Ultra-640; and Institute of Electrical and Electronics Engineers (IEEE) 1394 "FireWire", among others. For BIOS flash 362, some of the protocols that I/O adapters 338 may translate include Serial Peripheral Interface (SPI) and Microwire, among others. Additionally, there may be one or more wireless protocol I/O adapters. Examples of wireless protocols include, among others, those used in personal area networks, such as IEEE 802.15 and Bluetooth 4.0; those used in wireless local area networks, such as IEEE 802.11-based wireless protocols; and cellular protocols.
In some embodiments, I/O subsystem 115 is coupled to a TPM control 334 to control access to system persistent states, such as secure data, encryption keys, platform configuration information, and the like. In one embodiment, these system persistent states are stored in TPM NVRAM 173 and accessed via NVRAM controller 332.
In one embodiment, the TPM 334 is a secure micro-controller with cryptographic functionality. The TPM 334 has a number of trust-related capabilities; for example, a SEAL capability for ensuring that data protected by a TPM is available only to that same TPM. The TPM 334 can protect data and keys (e.g., secrets) using its encryption capabilities. In one embodiment, the TPM 334 has a unique and secret RSA key, which allows it to authenticate hardware devices and platforms. For example, the TPM 334 can verify that a system seeking access to data stored in computer system 300 is the expected system. The TPM 334 is also capable of reporting the integrity of the platform (e.g., computer system 300). This allows an external resource (e.g., a server on a network) to determine the trustworthiness of the platform but does not prevent access to the platform by the user.
In some embodiments, the I/O subsystem 315 also includes a Management Engine (ME) 335, which is a microprocessor that allows a system administrator to monitor, maintain, update, upgrade, and repair computer system 300. In one embodiment, a system administrator can remotely configure computer system 300 by editing the contents of the decode table 333 through the ME 335 via networks 336.
For convenience of explanation, the remainder of the application sometimes refers to NVRAM 142 as a PCMS device. A PCMS device includes multi-layered (vertically stacked) PCM cell arrays that are non-volatile, have low power consumption, and are modifiable at the bit level. As such, the terms NVRAM device and PCMS device are used interchangeably in the following discussion. However, it should be realized, as discussed above, that technologies other than PCMS may also be utilized for NVRAM 142.
It should be understood that a computer system can utilize NVRAM 142 for system memory, mass storage, firmware memory and/or other memory and storage purposes, even if the processor of that computer system does not have all of the above-described components of processor 310, or has more components than processor 310.
In the particular embodiment shown in Fig. 3, the MSC controller 124 and NVRAM controller 332 are located on the same die or package (referred to as the CPU package) as the processor 310. In other embodiments, the MSC controller 124 and/or NVRAM controller 332 may be located off-die or off the CPU package, coupled to the processor 310 or CPU package over a bus such as a memory bus (e.g., a DDR bus such as DDR3 or DDR4), a PCI Express bus, a Desktop Management Interface (DMI) bus, or any other type of bus.
Exemplary PCM Bus and Packaging Configurations
Figs. 4A-M illustrate a variety of different deployments in which the processor, near memory and far memory are configured and packaged in different ways. In particular, the series of platform memory configurations illustrated in Figs. 4A-M enable the use of new non-volatile system memory such as PCM technologies or, more specifically, PCMS technologies.
Although some of the same numerical designations are used across multiple figures in Figs. 4A-M, this does not necessarily mean that the structures identified by those designations are always identical. For example, while the same numbers are used to identify an integrated memory controller (IMC) 331 and CPU 401 in several figures, these components may be implemented differently in different figures. Some of these differences are not highlighted because they are not pertinent to understanding the underlying principles of the invention.
While several different system platform configuration approaches are described below, these approaches fall into two broad categories: split architecture and unified architecture. Briefly, in the split architecture scheme, a memory side cache (MSC) controller (e.g., located on the processor die or on a separate die in the CPU package) intercepts all system memory requests. Two separate interfaces "flow downstream" from that controller and exit the CPU package to couple to the near memory and the far memory. Each interface is tailored to its specific type of memory, and each memory can be scaled independently in terms of performance and capacity.
In the unified architecture scheme, a single memory interface exits the processor die or CPU package, and all memory requests are sent to this interface. The MSC controller, along with the near memory and far memory subsystems, is consolidated on this single interface. This memory interface must be tailored to meet the memory performance requirements of the processor and must support a transactional, out-of-order protocol, at least because PCMS devices may not process read requests in order. In accordance with the above general categories, the following specific platform configurations may be employed.
The embodiments described below include various types of buses/channels. The terms "bus" and "channel" are used synonymously herein. The number of memory channels per DIMM socket will depend on the particular CPU package used in the computer system (with some CPU packages supporting, for example, three memory channels per socket).
Additionally, in the embodiments described below which use DRAM, virtually any type of DRAM memory channel may be used including, by way of example and not limitation, DDR channels (e.g., DDR3, DDR4, DDR5, etc.). Thus, while DDR is advantageous because of its wide acceptance in the industry, resulting price point, etc., the underlying principles of the invention are not limited to any particular type of DRAM or volatile memory.
Fig. 4A illustrates one embodiment of a split architecture which includes, in the CPU package 401 (either on the processor die or on a separate die), one or more DRAM devices 403-406 operating as near memory acting as a cache for FM (i.e., an MSC), and one or more NVRAM devices, such as PCM memory residing on DIMMs 450-451, acting as far memory. High bandwidth links 407 on the CPU package 401 interconnect the single or multiple DRAM devices 403-406 to the processor 310, which hosts the integrated memory controller (IMC) 331 and MSC controller 124. Although illustrated as separate units in Fig. 4A and the other figures described below, the MSC controller 124 may be integrated within the memory controller 331 in one embodiment.
The DIMMs 450-451 use DDR slots and electrical connections defining DDR channels 440 with DDR address, data and control lines and voltages (e.g., the DDR3 or DDR4 standard as defined by the Joint Electron Devices Engineering Council (JEDEC)). The PCM devices on the DIMMs 450-451 provide the far memory capacity of this split architecture, with the DDR channels 440 to the CPU package 401 able to carry both DDR and transactional protocols. In contrast to DDR protocols, in which the processor 310 or other logic within the CPU package (e.g., the IMC 331 or MSC controller 124) transmits a command and receives an immediate response, the transactional protocol used to communicate with PCM devices allows the CPU 401 to issue a series of transactions, each identified by a unique transaction ID. The commands are serviced by a PCM controller on the recipient PCM DIMM, which sends responses back to the CPU package 401, potentially out of order. The processor 310 or other logic within the CPU package 401 identifies each transaction response by its transaction ID, which is sent with the response. The above configuration allows the system to support both standard DDR DRAM-based DIMMs (using DDR protocols over DDR electrical connections) and PCM-based DIMM configurations (using transactional protocols over the same DDR electrical connections).
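The transaction-ID matching described above can be sketched as follows. This is a minimal illustration, not the protocol defined by the patent: the class names, the method names, and the use of a seeded shuffle to model out-of-order completion are all assumptions made for the example.

```python
import random


class PcmDimmController:
    """Hypothetical PCM-side controller that may answer reads out of order."""

    def __init__(self, contents, seed=0):
        self.contents = contents          # address -> stored data
        self.pending = []                 # queued (transaction ID, address) pairs
        self.rng = random.Random(seed)

    def submit(self, txn_id, address):
        self.pending.append((txn_id, address))

    def drain(self):
        # Shuffle to model variable PCM latency: responses need not come
        # back in the order in which the requests were issued.
        self.rng.shuffle(self.pending)
        responses = [(txn_id, self.contents[addr]) for txn_id, addr in self.pending]
        self.pending = []
        return responses


class CpuSide:
    """Hypothetical requester that tags each read with a unique transaction ID."""

    def __init__(self, controller):
        self.controller = controller
        self.next_id = 0
        self.outstanding = {}             # transaction ID -> requested address

    def issue_read(self, address):
        txn_id = self.next_id
        self.next_id += 1
        self.outstanding[txn_id] = address
        self.controller.submit(txn_id, address)
        return txn_id

    def collect(self):
        # Match every response to its request purely by transaction ID.
        results = {}
        for txn_id, data in self.controller.drain():
            results[self.outstanding.pop(txn_id)] = data
        return results


contents = {0x10: "a", 0x20: "b", 0x30: "c"}
cpu = CpuSide(PcmDimmController(contents))
for addr in contents:
    cpu.issue_read(addr)
results = cpu.collect()
```

Because each response carries its transaction ID, the requester recovers the correct address-to-data pairing regardless of the order in which the responses arrive.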
Fig. 4B illustrates a split architecture which uses DRAM-based DDR DIMMs 452 coupled over DDR channels 440 to form the near memory acting as an MSC. The processor 310 hosts the memory controller 331 and MSC controller 124. NVRAM devices such as PCM memory devices reside on PCM-based DIMMs 453, which use DDR slots and electrical connections on additional DDR channels 442 off the CPU package 401. The PCM-based DIMMs 453 provide the far memory capacity of this split architecture, with the DDR channels 442 to the CPU package 401 being based on DDR electrical connections and able to carry both DDR and transactional protocols. This allows the system to be configured with varying numbers of DDR DRAM DIMMs 452 (e.g., DDR4 DIMMs) and PCM DIMMs 453 to achieve the desired capacity and/or performance points.
Fig. 4C illustrates a split architecture which hosts the near memory 403-406 acting as a memory side cache (MSC) on the CPU package 401 (either on the processor die or on a separate die). High bandwidth links 407 on the CPU package are used to interconnect the single or multiple DRAM devices 403-406 to the processor 310, which hosts the memory controller 331 and the MSC controller 124, as defined by the split architecture. NVRAM, such as PCM memory devices, resides on PCI Express cards or risers 455 that use PCI Express electrical connections and either the PCI Express protocol or a different transactional protocol over the PCI Express bus 454. The PCM devices on the PCI Express cards or risers 455 provide the far memory capacity of this split architecture.
Fig. 4D is a split architecture which uses DRAM-based DDR DIMMs 452 and DDR channels 440 to form the near memory acting as an MSC. The processor 310 hosts the memory controller 331 and MSC controller 124. NVRAM, such as PCM memory devices 455, resides on PCI Express cards or risers that use PCI Express electrical connections and either the PCI Express protocol or a different transactional protocol over the PCI Express link 454. The PCM devices on the PCI Express cards or risers 455 provide the far memory capacity of this split architecture, with the memory channel interfaces off the CPU package 401 providing multiple DDR channels 440 for the DDR DRAM DIMMs 452.
Fig. 4E illustrates a unified architecture which hosts both the near memory acting as an MSC and the far memory NVRAM (such as PCM) on PCI Express cards or risers 456 that use PCI Express electrical connections and either the PCI Express protocol or a different transactional protocol over the PCI Express bus 454. The processor 310 hosts the integrated memory controller 331 but, in this unified architecture case, the MSC controller 124 resides on the card or riser 456 along with the DRAM near memory and NVRAM far memory.
Fig. 4F illustrates a unified architecture which hosts both the near memory acting as an MSC and the far memory NVRAM (such as PCM) on DIMMs 458 using DDR channels 457. The near memory in this unified architecture comprises DRAM on each DIMM 458, acting as the memory side cache for the PCM devices on that same DIMM 458, which form the far memory of that particular DIMM. The MSC controller 124 resides on each DIMM 458 along with the near and far memory. In this embodiment, multiple memory channels of a DDR bus 457 are provided off the CPU package. The DDR bus 457 of this embodiment implements a transactional protocol over DDR electrical connections.
Fig. 4G illustrates a hybrid split architecture in which the MSC controller 124 resides on the processor 310 and both the near memory and far memory interfaces share the same DDR bus 410. This configuration uses DRAM-based DDR DIMMs 411a as the near memory acting as an MSC, with the PCM-based DIMMs 411b (i.e., far memory) residing on the same memory channel of the DDR bus 410, using DDR slots and populated with NVRAM (such as PCM memory devices). The memory channels of this embodiment carry both DDR and transactional protocols simultaneously to address the near memory DIMMs 411a and far memory DIMMs 411b, respectively.
Fig. 4H illustrates a unified architecture in which the near memory 461a acting as a memory side cache resides on a mezzanine or riser 461 in the form of DRAM-based DDR DIMMs. The memory side cache (MSC) controller 124 is located in the riser's DDR and PCM controller 460, which may have two or more memory channels connecting to DDR DIMM channels 470 on the mezzanine/riser 461 and interconnecting to the CPU over high performance interconnect(s) 462 such as a differential memory link. The associated far memory 461b sits on the same mezzanine/riser 461 and is formed by DIMMs that use DDR channels 470 and are populated with NVRAM (such as PCM devices).
Fig. 4I illustrates a unified architecture that can be used as a memory capacity expansion to a DDR memory subsystem and DIMMs 464 connected to the CPU package 401 on its DDR memory subsystem over a DDR bus 471. For the additional NVM-based capacity in this configuration, the near memory acting as an MSC resides on a mezzanine or riser 463 in the form of DRAM-based DDR DIMMs 463a. The MSC controller 124 is located in the riser's DDR and PCM controller 460, which may have two or more memory channels connecting to DDR DIMM channels 470 on the mezzanine/riser and interconnecting to the CPU over high performance interconnect(s) 462 such as a differential memory link. The associated far memory 463b sits on the same mezzanine/riser 463 and is formed by DIMMs 463b that use DDR channels 470 and are populated with NVRAM (such as PCM devices).
Fig. 4J is a unified architecture in which the near memory acting as a memory side cache (MSC) resides on each DIMM 465 in the form of DRAM. The DIMMs 465 are on high performance interconnect(s)/channel(s) 462, such as a differential memory link, coupling the CPU package 401 with the MSC controller 124 located on the DIMMs. The associated far memory sits on the same DIMMs 465 and is formed by NVRAM (such as PCM devices).
Fig. 4K illustrates a unified architecture in which the near memory acting as an MSC resides on each DIMM 466 in the form of DRAM. The DIMMs are on high performance interconnect(s)/channel(s) 470 connecting the CPU package 401 with the MSC controller 124 located on the DIMMs. The associated far memory sits on the same DIMMs 466 and is formed by NVRAM (such as PCM devices).
Fig. 4L illustrates a split architecture which uses DRAM-based DDR DIMMs 464 on a DDR bus 471 to form the necessary near memory acting as an MSC. The processor 310 hosts the integrated memory controller 331 and memory side cache controller 124. NVRAM such as PCM memory forms the far memory, which resides on cards or risers 467 that use high performance interconnects 468 communicating with the CPU package 401 using a transactional protocol. The cards or risers 467 hosting the far memory host a single buffer/controller that can control multiple PCM-based memories or multiple PCM-based DIMMs connected on that riser.
Fig. 4M illustrates a unified architecture which may use DRAM on a card or riser 469 to form the necessary near memory acting as an MSC. NVRAM, such as PCM memory devices, forms the far memory, which also resides on the cards or risers 469 that use high performance interconnects 468 to the CPU package 401. The cards or risers 469 hosting the far memory host a single buffer/controller that can control multiple PCM-based devices or multiple PCM-based DIMMs on that riser 469 and that also integrates the memory side cache controller 124.
In some of the embodiments described above, such as that illustrated in Fig. 4G, the DRAM DIMMs 411a and PCM-based DIMMs 411b reside on the same memory channel. Consequently, the same set of address/control and data lines is used to connect the CPU to both the DRAM and PCM memories. In order to reduce the amount of data traffic through the CPU mesh interconnect, in one embodiment, a DDR DIMM on a common memory channel with a PCM-based DIMM is configured to act as the sole MSC for data stored in the PCM-based DIMM. In such a configuration, the far memory data stored in the PCM-based DIMM is cached only in the DDR DIMM near memory within the same memory channel, thereby localizing memory transactions to that particular memory channel.
Additionally, to implement the above embodiment, the system address space may be logically subdivided between the different memory channels. For example, if there are four memory channels, then 1/4 of the system address space may be allocated to each memory channel. If each memory channel is provided with one PCMS-based DIMM and one DDR DIMM, the DDR DIMM may be configured to act as the MSC for that 1/4 portion of the system address space.
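The subdivision of the system address space between memory channels can be illustrated with a short sketch. The function name and the choice of contiguous, equally sized slices are assumptions for the example; the text only states that each of four channels may receive 1/4 of the address space and does not specify how the slices are laid out.

```python
def channel_for_address(address, space_size, num_channels=4):
    """Return the memory channel that owns a system address.

    Assumes (hypothetically) that the address space is split into
    contiguous, equally sized slices, one per channel. The DDR DIMM on
    the returned channel would then act as the MSC for that slice only.
    """
    if not 0 <= address < space_size:
        raise ValueError("address outside the system address space")
    slice_size = space_size // num_channels
    # Clamp so that any remainder addresses fall into the last channel.
    return min(address // slice_size, num_channels - 1)


SPACE = 1 << 32  # illustrative 4 GiB system address space
first = channel_for_address(0, SPACE)
second = channel_for_address(SPACE // 4, SPACE)
last = channel_for_address(SPACE - 1, SPACE)
```

With this mapping, a miss in a channel's near memory is always resolved by the PCM-based DIMM on the same channel, which is what keeps the traffic local.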
The choice of system memory and mass storage devices may depend on the type of electronic platform on which embodiments of the invention are employed. For example, in a personal computer, tablet computer, notebook computer, smartphone, mobile phone, feature phone, personal digital assistant (PDA), portable media player, portable gaming device, gaming console, digital camera, switch, hub, router, set-top box, digital video recorder, or other device that has relatively small mass storage requirements, the mass storage may be implemented using NVRAM mass storage 152A alone, or using NVRAM mass storage 152A in combination with flash/magnetic/optical mass storage 152B. In other electronic platforms that have relatively large mass storage requirements (e.g., large-scale servers), the mass storage may be implemented using magnetic storage (e.g., hard drives) or any combination of magnetic storage, optical storage, holographic storage, mass-storage flash memory, and NVRAM mass storage 152A. In such a case, the system hardware and/or software responsible for storage may implement various intelligent persistent storage allocation techniques to allocate blocks of persistent program code and data between the FM 151B/NVRAM storage 152A and flash/magnetic/optical mass storage 152B in an efficient or otherwise useful manner.
For example, in one embodiment a high-powered server is configured with a near memory (e.g., DRAM), a PCMS device, and a magnetic mass storage device for large amounts of persistent storage. In one embodiment, a notebook computer is configured with a near memory and a PCMS device which performs the role of both a far memory and a mass storage device (i.e., which is logically partitioned to perform these roles as shown in Fig. 3). One embodiment of a home or office desktop computer is configured similarly to a notebook computer, but may also include one or more magnetic storage devices to provide large amounts of persistent storage capability.
One embodiment of a tablet computer or cellular telephony device is configured with PCMS memory but potentially no near memory and no additional mass storage (for cost/power savings). However, the tablet/telephone may be configured with a removable mass storage device such as a flash or PCMS memory stick.
Various other types of devices may be configured as described above. For example, portable media players and/or personal digital assistants (PDAs) may be configured in a manner similar to the tablets/telephones described above, and gaming consoles may be configured in a manner similar to desktops or laptops. Other devices which may be similarly configured include digital cameras, routers, set-top boxes, digital video recorders, televisions, and automobiles.
Embodiments of an MSC Architecture
In one embodiment of the invention, the bulk of the DRAM in system memory is replaced with PCM. As previously discussed, PCM provides significant improvements in memory capacity at a significantly lower cost relative to DRAM and is non-volatile. However, certain PCM characteristics, such as asymmetrical read-versus-write performance, write cycling endurance limits, and its non-volatile nature, make it challenging to directly replace DRAM without incurring major software changes. The embodiments of the invention described below provide a software-transparent way to integrate PCM while also enabling newer usages through software enhancements. These embodiments promote a successful transition in memory subsystem architecture and provide a way to consolidate both memory and storage using a single PCM pool, thus mitigating the need for a separate non-volatile storage tier in the platform.
The particular embodiment illustrated in Fig. 5A includes one or more processor cores 501, each with an internal memory management unit (MMU) 502 for generating memory requests, and one or more internal CPU caches 503 for storing lines of program code and data according to a specified cache management policy. As previously mentioned, the cache management policy may comprise an exclusive cache management policy (in which any line present in one particular cache level of the hierarchy is not present in any other cache level) or an inclusive cache management policy (in which duplicate cache lines are stored at different levels of the cache hierarchy). The specific cache management policies which may be employed for managing the internal caches 503 are well understood by those of skill in the art and, as such, will not be described here in detail. The underlying principles of the invention are not limited to any particular cache management policy.
Also illustrated in Fig. 5A is a home agent 505, which provides access to the MSC 510 by generating memory channel addresses (MCAs) for memory requests. The home agent 505 is responsible for managing a specified memory address space and resolves memory access conflicts directed to that memory space. Thus, if any core needs to access a given address space, it will send requests to that home agent 505, which will then send the request to the particular MMU 502. In one embodiment, one home agent 505 is allocated per MMU 502; in some embodiments, however, a single home agent 505 may service more than one memory management unit 502.
As illustrated in Fig. 5A, an MSC 510 is configured in front of the PCM-based far memory 519. The MSC 510 manages access to a near memory 518 and forwards memory access requests (e.g., reads and writes) to a far memory controller 521 when appropriate (e.g., when the requests cannot be serviced from the near memory 518). The MSC 510 includes a cache control unit 512 which operates responsive to a tag cache 511 that stores tags identifying the cache lines contained within the near memory 518. In operation, when the cache control unit 512 determines that a memory access request can be serviced from the near memory 518 (e.g., in response to a cache hit), it generates a near memory address (NMA) to identify data stored within the near memory 518. A near memory control unit 515 interprets the NMA and responsively generates electrical signals to access the near memory 518. As previously mentioned, in one embodiment the near memory is a dynamic random access memory (DRAM). In such a case, the electrical signals may include row address strobe (RAS) and column address strobe (CAS) signals. It should be noted, however, that the underlying principles of the invention are not limited to the use of DRAM for near memory.
Another component that ensures software-transparent memory application is an optimized PCM far memory controller 521 that manages the characteristics of the PCM far memory 530 while still providing the required performance. In one embodiment, the PCM controller 521 includes an address indirection table 520 that translates the MCA generated by the cache control unit 515 into a PDA which is used to directly address the PCM far memory 530. These translations typically occur at the granularity of a "block" of 5KB. The translation is required because, in one embodiment, the far memory controller 521 continuously moves PCM blocks throughout the PCM device address space to ensure that no wear-out hot spots arise from a high frequency of writes to any specific block. As previously described, such a technique is sometimes referred to herein as "wear leveling".
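A block-granular address indirection table of the kind described above can be sketched as follows. The class, its methods, and the swap-based move are hypothetical choices for illustration; the block size is left as a constructor parameter, and the 4096-byte value used in the usage lines is just an example.

```python
class AddressIndirectionTable:
    """Hypothetical block-granular map from MCA to PCM device address (PDA)."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        # Start with an identity mapping: logical block i -> physical block i.
        self.logical_to_physical = list(range(num_blocks))

    def translate(self, mca):
        # Translate the block number; the offset within the block is kept.
        block, offset = divmod(mca, self.block_size)
        return self.logical_to_physical[block] * self.block_size + offset

    def move(self, logical_block, new_physical_block):
        # Wear-leveling move, modeled as a swap so the mapping stays
        # one-to-one. A real controller would also copy the block data.
        other = self.logical_to_physical.index(new_physical_block)
        old = self.logical_to_physical[logical_block]
        self.logical_to_physical[logical_block] = new_physical_block
        self.logical_to_physical[other] = old


ait = AddressIndirectionTable(num_blocks=4, block_size=4096)
before = ait.translate(4096 + 12)   # logical block 1, offset 12
ait.move(1, 3)                      # relocate logical block 1 to physical block 3
after = ait.translate(4096 + 12)
```

The address seen by the rest of the system (the MCA) never changes; only the table entry does, which is why the relocation is invisible to software.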
Thus, the MSC 510 is managed by the cache control unit 512, which allows the MSC 510 to absorb, coalesce and filter transactions (e.g., reads and writes) to the PCM far memory 530. The cache control unit 512 manages all data movement and consistency requirements between the near memory 518 and the PCM far memory 530. Additionally, in one embodiment, the MSC cache controller 512 interfaces to the CPU(s) and provides the standard synchronous load/store interface used in traditional DRAM-based memory subsystems.
Exemplary read and write operations will now be described within the context of the architecture shown in Fig. 5A. In one embodiment, a read operation first arrives at the MSC controller 512, which performs a look-up to determine whether the requested data is present (e.g., utilizing the tag cache 511). If present, it returns the data to the requesting CPU, core 501 or I/O device (not shown). If the data is not present, the MSC controller 512 sends the request along with the system memory address (also referred to herein as the memory channel address or MCA) to the PCM far memory controller 521. The PCM controller 521 uses the address indirection table 520 to translate the address into a PDA and directs the read operation to that region of the PCM. Upon receiving the requested data from the PCM far memory 530, the PCM controller 521 returns the requested data to the MSC controller 512, which stores the data in the MSC near memory 518 and also sends the data to the requesting CPU core 501 or I/O device. Subsequent requests for this data may be serviced directly from the near memory 518 until it is replaced by some other PCM data.
In one embodiment, a memory write operation also first goes to the MSC controller 512, which writes it into the MSC near memory 518. In this embodiment, the data may not be sent directly to the PCM far memory 530 when a write operation is received. For example, the data may be sent to the PCM far memory 530 only when the location in the MSC near memory 518 in which the data is stored must be re-used to store data for a different system memory address. When this happens, the MSC controller 512 notices that the data is not current in the PCM far memory 530 and will therefore retrieve it from the near memory 518 and send it to the PCM controller 521. The PCM controller 521 looks up the PDA for the system memory address and then writes the data to the PCM far memory 530.
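The read and write flows above amount to a write-back cache sitting in front of far memory. The following sketch models that behavior under simplifying assumptions: a direct-mapped organization, line-granular addresses, a Python dictionary standing in for PCM far memory, and hypothetical class and counter names.

```python
class MemorySideCache:
    """Hypothetical direct-mapped, write-back model of the MSC."""

    def __init__(self, num_lines, far_memory):
        self.num_lines = num_lines
        self.far = far_memory    # dict standing in for far memory
        self.lines = {}          # index -> (address, data, dirty)
        self.far_reads = 0
        self.far_writes = 0

    def _evict(self, index):
        entry = self.lines.pop(index, None)
        if entry is not None:
            address, data, dirty = entry
            if dirty:
                # Far memory is updated only when a dirty line's location
                # has to be re-used for a different address.
                self.far[address] = data
                self.far_writes += 1

    def read(self, address):
        index = address % self.num_lines
        entry = self.lines.get(index)
        if entry is not None and entry[0] == address:
            return entry[1]                    # hit: served from near memory
        self._evict(index)
        data = self.far.get(address, 0)        # miss: fetch from far memory
        self.far_reads += 1
        self.lines[index] = (address, data, False)
        return data

    def write(self, address, data):
        index = address % self.num_lines
        entry = self.lines.get(index)
        if entry is None or entry[0] != address:
            self._evict(index)
        self.lines[index] = (address, data, True)   # absorbed by near memory


far = {5: "old"}
msc = MemorySideCache(num_lines=4, far_memory=far)
first = msc.read(5)      # miss, filled from far memory
second = msc.read(5)     # hit
msc.write(1, "new")      # address 1 shares index 1 with address 5
third = msc.read(5)      # evicts dirty line 1, which is written back
```

In this run only one write ever reaches far memory, and only at the moment the dirty line is displaced, which mirrors the deferred write-back described in the text.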
In one embodiment, the size of the MSC near memory 518 is dictated by the workload memory requirements as well as the near and far memory performance. For a DRAM-based MSC, the size may be set to a tenth of the workload memory footprint or of the PCM far memory 530 size. Such an MSC is very large compared to the conventional caches found in current processor/system architectures. By way of example, and not limitation, for a PCM far memory size of 128GB, the size of the MSC near memory can be as large as 16GB.
Fig. 5B illustrates additional details associated with one embodiment of the MSC 510. This embodiment includes a set of logical units responsible for commands and addressing, including a command buffer tracking unit 542 for buffering commands/addresses and a cache access mode check unit 544 which selects an MSC operating mode in response to control signals from an MSC Range Register (RR) unit 545. Several exemplary modes of operation are described below. Briefly, these may include modes in which the near memory is used in a traditional caching role and modes in which the near memory 518 forms part of system memory. A tag checking/command scheduler 550 uses tags from the tag cache 511 to determine whether a particular cache line is stored in the near memory 518, and a near memory controller 515 generates channel address signals (e.g., CAS and RAS signals).
This embodiment also includes a set of logical units responsible for data routing and processing, including a set of data buffers 546 for storing data fetched from near memory or stored to near memory. In one embodiment, a prefetch data cache 547 is also included for storing data prefetched from near memory and/or far memory. However, the prefetch data cache 547 is optional and is not necessary for complying with the underlying principles of the invention.
An error correction code (ECC) generator/checker unit 552 generates and checks ECCs to ensure that data written to or read from near memory is free from errors. As discussed below, in one embodiment of the invention, the ECC generator/checker unit 552 is modified to store cache tags. Specific ECCs are well understood by those of ordinary skill in the art and will therefore not be described here in detail. The channel controllers 553 couple the data bus of the near memory 518 to the MSC 510 and generate the electrical signaling necessary for accessing the near memory 518 (e.g., RAS and CAS signaling for a DRAM near memory).
Also illustrated in Fig. 5B is a far memory control interface 548 for coupling the MSC 510 to far memory. In particular, the far memory control interface 548 generates the MCAs required to address the far memory and communicates data between the data buffers 546 and far memory.
As mentioned, the near memory 518 employed in one embodiment is very large compared to the conventional caches found in current processor/system architectures. Consequently, the tag cache 511 that maintains the translation from system memory addresses to near memory addresses may also be very large. The cost of storing and looking up the MSC tags can be a significant impediment to building large caches. As such, in one embodiment of the invention, this issue is resolved using an innovative scheme that stores the cache tags within the storage allocated in the MSC for ECC protection, thereby essentially removing the storage cost for the tags.
This embodiment is illustrated generally in Fig. 5C, which shows an integrated tag cache and ECC unit 554 for storing/managing cache tags, storing ECC data, and performing ECC operations. As illustrated, the stored tags are provided to the tag check/command scheduler 550 upon request when performing tag check operations (e.g., to determine whether a particular block of data is stored within the near memory cache 518).
Fig. 5D illustrates the organization of an exemplary set of data 524 and a corresponding ECC 523 and tag 522. As illustrated, the tag 522 is co-located with the ECC 523 in a memory of the tag cache/ECC unit 554 (e.g., DDR DRAM in one embodiment). In this example, several blocks of data totaling 64 bytes have been read into the tag cache/ECC unit 554. An ECC check/generator unit 554a generates an ECC using the data 525 and compares the generated ECC against the existing ECC 523 associated with the data. In this example, a 4-byte ECC is generated for the 64 bytes of data 525. However, the underlying principles of the invention are not limited to any particular type or size of ECC. Additionally, it should be noted that the term "data" is used broadly herein to refer to both executable program code and data, both of which may be stored in the data storage 525 shown in Fig. 5D.
In one embodiment, a 3-byte (24-bit) tag 522 is used with the bit assignments illustrated in Fig. 5D. Specifically, bits 00 to 16 are address bits which provide the upper address bits of the cache line. For a system address having 56 bits (e.g., SPA[55:00]), bits 00 to 16 map to bits 55-29 of the system address (allowing for a smallest cache size of 512 MB). Returning to the 3-byte tag, bits 17-19 are reserved; bits 20-21 are directory bits which provide information on remote CPU caching of the cache line (e.g., an indication of the other CPUs on which the line is cached); bits 21-22 indicate the current state of the cache line (e.g., 00 = clean; 01 = dirty; 10 and 11 = unused); and bit 23 indicates whether the cache line is valid (e.g., 1 = valid; 0 = invalid).
Utilizing a direct-mapped cache architecture as described above, which allows the near memory address to be extracted directly from the system memory address, reduces or eliminates the latency cost of looking up the tag store before the MSC 510 can be read, thereby significantly improving performance. Moreover, the time to check the cache tags to decide whether the MSC 510 has the required data is also eliminated, since the check is done in parallel with the ECC check of the data read from the MSC.
Under certain conditions, storing tags with the data may create an issue for writes. A write first has to read the data in order to ensure that it does not overwrite data for some other address. Such a read before every write could become costly. One embodiment of the invention employs a dirty line tag cache that maintains the tags of recently accessed near memory addresses (NMAs). Since many writes target recently accessed addresses, a reasonably small tag cache can achieve an effective hit rate and filter out most of the reads before a write.
In Fig. 5 E illustration with an additional detail that embodiment is associated of PCM DIMM 519, it comprises PCM Memory Controller 521 far away and one group of PCM memory module 530a-i far away.In one embodiment, dynamically share between system storage purposes and memory storage purposes in the single pond of PCM storer 530a-i far away.In this embodiment, whole PCM pond 530a-i can be subdivided into " piece " of 4KB size.PCM descriptor table (PDT) 565 each PCM piece of sign are as the use of storer or memory storage.For example, every row PDT can represent concrete piece, the wherein use of concrete each piece of row sign (1=storer for example; 0=memory storage).In this embodiment, starter system configuration can be divided the PCM piece (for example, by PDT 565 is programmed) in PCM 530a-i between memory storage use and storer use.In one embodiment, with identical table, get rid of bad piece, and be provided for consuming the stand-by block of homogenising operation.In addition, PDT 565 also can comprise each PCMS piece to the mapping of " logic " block address by software application.The in the situation that of system storage, LBA (Logical Block Addressing) is identical with MCA or SPA.No matter the mobile PC MS piece due to consume homogenising when, upgrading indirect addressing table (AIT) 563 all needs this association.When this occurs, by the LBA (Logical Block Addressing) of software application, must be mapped to different PCMS unit addresses (PDA).In one embodiment, this mapping is stored in AIT, and upgrades when each consume homogenising moves.
As illustrated, the PCM controller 521 includes a system physical address (SPA)-to-PCM mapper 556 that operates in response to a wear management unit 555 and an indirect addressing unit 563 to map SPAs to PCM blocks. In one embodiment, the wear management logic 555 implements a wear-leveling algorithm to account for the fact that the storage cells of the PCM 530a-530i begin to wear out after too many write and/or erase accesses. Wear leveling spreads writes and erases across the PCM device's memory cells by, for example, forcing data blocks with low cycle counts to occasionally move, thereby allowing high-cycled data blocks to be placed in the memory cells that stored the low-cycled data blocks. Typically, the majority of blocks do not cycle, but high cycle count blocks are the most likely to fail, and wear leveling swaps the addresses of high cycle count blocks with those of low cycle count blocks. The wear management logic 555 may track the cycle counts using one or more counters and registers (e.g., a counter may be incremented by one each time a cycle is detected, and the result may be stored in the set of registers).
In one embodiment, the indirect addressing logic 563 includes an indirect addressing table (AIT) containing an indication of the PCM blocks to which write operations should be directed. The AIT may be used to automatically move blocks between memory and storage usages. From the software perspective, accesses to all blocks use traditional memory load/store semantics (i.e., wear leveling and indirect addressing operations occur transparently to software). In one embodiment, the AIT is used to translate the SPA generated by software into a PDA. This translation is required because, to wear the PCMS devices uniformly, data must be moved around in PDA space to avoid any hotspots. When such a move occurs, the relationship between the SPA and the PDA changes, and the AIT is updated to reflect the new translation.
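The interplay between cycle counting and the AIT can be sketched in a few lines of Python. This is a minimal illustration, not the mechanism of the embodiment: the class name `WearLeveler`, the `swap_threshold` parameter, and the policy of swapping a hot block with the least-cycled block are all assumptions, since the description does not specify a particular wear-leveling algorithm.

```python
class WearLeveler:
    """Toy model: logical blocks are remapped to physical blocks through an AIT."""

    def __init__(self, num_blocks, swap_threshold=4):
        self.ait = list(range(num_blocks))   # logical block -> physical block (PDA)
        self.cycles = [0] * num_blocks       # write count per physical block
        self.data = [None] * num_blocks      # contents of each physical block
        self.swap_threshold = swap_threshold # assumed tuning knob

    def read(self, logical):
        return self.data[self.ait[logical]]

    def write(self, logical, value):
        phys = self.ait[logical]
        self.data[phys] = value
        self.cycles[phys] += 1
        self._level(logical)

    def _level(self, hot_logical):
        hot = self.ait[hot_logical]
        cold = min(range(len(self.cycles)), key=self.cycles.__getitem__)
        if self.cycles[hot] - self.cycles[cold] < self.swap_threshold:
            return
        cold_logical = self.ait.index(cold)
        # Exchange the contents; each physical block is rewritten once.
        self.data[hot], self.data[cold] = self.data[cold], self.data[hot]
        self.cycles[hot] += 1
        self.cycles[cold] += 1
        # Update the AIT so software keeps using the same logical addresses.
        self.ait[hot_logical], self.ait[cold_logical] = cold, hot
```

Software in this sketch only ever sees logical block numbers; repeated writes to one logical block are spread over several physical blocks while reads continue to return the latest value.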
Following the SPA-to-PCM mapping, a scheduler unit 557 schedules the underlying PCM operations (e.g., reads and/or writes) to the PCM devices 530a-i, and a PCM protocol engine 558 generates the electrical signaling required for performing the read/write operations. An ECC unit 562 performs error detection and correction operations, and data buffers 562 temporarily buffer data being read from or written to the PCM devices 530a-i. A persistent write buffer 559 is used to hold data that is guaranteed to be written back to PCMS even in the event of an unexpected power failure (e.g., it is implemented using non-volatile storage). Flush support logic 560 is included to flush the persistent write buffer to PCMS, either periodically and/or according to a specified data flushing algorithm (e.g., after the persistent write buffer reaches a specified threshold).
In one embodiment, the MSC 510 automatically routes storage accesses directly to the PCM far memory controller 521 and memory accesses to the MSC cache control unit 512. Storage accesses arriving at the PCM far memory controller 521 are treated as regular reads and writes, and the indirect addressing and wear-leveling mechanisms described herein are applied as usual. An additional optimization employed in one embodiment of the invention can be implemented when data needs to move between storage and memory. Since a common PCM pool 530a-i is used, the data movement can be eliminated or deferred by simply changing the pointers in the translation tables (e.g., the AIT). For example, when data is transferred from storage to memory, a pointer identifying the data in a particular physical PCM storage location may be updated to indicate that the same physical PCM storage location is now a memory location in system memory. In one embodiment, this is done by hardware in a software-transparent manner to provide both performance and power benefits.
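As a rough illustration of moving data from storage to memory by changing pointers rather than copying, consider the Python sketch below. The class `PcmPool`, its method names, and the table layouts are hypothetical; the description states only that a PDT marks each block as memory or storage and that a translation table such as the AIT maps logical addresses to PDAs.

```python
class PcmPool:
    """Toy model of a PCM pool shared between storage and memory uses."""

    def __init__(self, num_blocks):
        self.blocks = [None] * num_blocks  # physical block contents, indexed by PDA
        self.pdt = [0] * num_blocks        # PDT use flag: 1 = memory, 0 = storage
        self.ait = {}                      # (use flag, logical address) -> PDA

    def write_storage(self, lba, pda, payload):
        # Place a payload in a physical block and register it as storage.
        self.blocks[pda] = payload
        self.pdt[pda] = 0
        self.ait[(0, lba)] = pda

    def storage_to_memory(self, lba, spa):
        # Re-point the table entries; the payload stays in the same block.
        pda = self.ait.pop((0, lba))
        self.pdt[pda] = 1
        self.ait[(1, spa)] = pda
        return pda

    def read_memory(self, spa):
        return self.blocks[self.ait[(1, spa)]]
```

In this sketch, the "transfer" touches only the two table entries, which is the source of the performance and power benefit described above.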
In addition to the software-transparent mode of operation, one embodiment of the MSC controller 512 provides alternate modes of operation as indicated by the MSC range registers (RRs) 545. These modes of operation may include, but are not limited to, the following:
1) Direct access of PCM memory for storage-class applications. Such usage also requires the MSC controller 512 to ensure that writes submitted to the PCM 519 are actually committed to a persistent state.
2) Hybrid use of the near memory 518, exposing portions of it to software for direct use while maintaining the remainder as an MSC. When a portion of the near memory 518 is exposed to software for direct use, that portion is directly addressable within the system address space. This allows certain applications to explicitly split their memory allocation between a small high-performance region (the near memory 518) and a relatively lower-performance bulk region (the far memory 530). By contrast, the portion allocated as a cache inside the MSC does not form part of the system address space (but instead acts as a cache for the far memory 530, as described herein).
As previously discussed, the MSC architecture is defined such that several different system partitioning approaches are possible. These approaches fall into two broad buckets:
(1) Split architecture: In this scheme, the MSC controller 512 is located in the CPU and intercepts all system memory requests. There are two separate interfaces from the MSC that exit the CPU to connect to the near memory (e.g., DRAM) and the far memory (e.g., PCM). Each interface is tailored to the specific type of memory, and each memory can be scaled independently in terms of performance and capacity.
(2) Unified architecture: In this scheme, a single memory interface exits the CPU, and all memory requests are sent to this interface. The MSC controller 512, along with the near memory (e.g., DRAM) and far memory (e.g., PCM) subsystems, is consolidated outside the CPU on this single interface. In one embodiment, this memory interface is tailored to meet the memory performance requirements of the CPU and supports a transactional, out-of-order protocol. The near and far memory requirements are met in a "unified" manner on each of these interfaces.
Within the scope of the above buckets, several different partitioning options are feasible, some of which are described below.
(1) Split example:
Near memory: DDR5 DIMMs
Near memory interface: one or more DDR5 channels
Far memory: PCM controller/device on a PCI Express (PCIe) card
Far memory interface: x16 PCIe, Gen 3
(2) Unified example:
CPU memory interface: one or more KTMI (or QPMI) channels
Near/far memory with MSC/PCM controller on a riser card
Near memory interface off the MSC/PCM controller: DDR5 interface
Far memory interface off the MSC/PCM controller: PCM device interface
Embodiments Having Different Near Memory Modes of Operation
As discussed above, a two-level memory hierarchy may be used to introduce fast non-volatile memory such as PCM as system memory while using a very large DRAM-based near memory. The near memory may be used as a hardware-managed cache. However, some applications are not hardware-cache-friendly and, as such, would benefit from alternate ways of using such memory. Because several different applications may be running on a server at any given time, one embodiment of the invention allows multiple usage modes to be enabled concurrently. Additionally, one embodiment provides the ability to control the allocation of near memory for each of these usage modes.
In one embodiment, the MSC controller 512 provides the following modes for using near memory. As previously mentioned, in one embodiment, the current mode of operation may be specified by operation codes stored in the MSC range registers (RRs) 545.
(1) Write-back caching mode: In this mode, all or a portion of the near memory 518 is used as a cache for the PCM memory 530. While in write-back mode, every write operation is directed initially to the near memory 518 (assuming that the cache line to which the write is directed is present in the cache). A corresponding write operation is performed to update the PCM far memory 530 only when the cache line within the near memory 518 is to be replaced by another cache line (in contrast to the write-through mode described below, in which each write operation is immediately propagated to the far memory 530).
In one embodiment, a read operation first arrives at the MSC cache controller 512, which performs a lookup to determine whether the requested data is present in the near memory 518 acting as a cache for the PCM far memory (e.g., utilizing the tag cache 511). If present, it returns the data to the requesting CPU, core 501, or I/O device (not shown in Fig. 5A). If the data is not present, the MSC cache controller 512 sends the request along with the system memory address to the PCM far memory controller 521. The PCM far memory controller 521 translates the system memory address to a PCM physical device address (PDA) and directs the read operation to this region of the far memory 530. As previously mentioned, this translation may utilize the indirect addressing table (AIT) 563, which the PCM controller 521 uses to translate between system memory addresses and PCM PDAs. In one embodiment, the AIT is updated as part of the wear-leveling algorithm implemented to distribute memory access operations and thereby reduce wear on the PCM FM 530.
Upon receiving the requested data from the PCM FM 530, the PCM FM controller 521 returns the requested data to the MSC controller 512, which stores the data in the MSC near memory 518 and also sends the data to the requesting processor core 501 or I/O device (not shown in Fig. 5A). Subsequent requests for this data may be serviced directly from the near memory 518 until it is replaced by some other PCM FM data.
In one embodiment, a memory write operation also first goes to the MSC controller 512, which writes it into the MSC near memory acting as an FM cache 518. In this embodiment, the data may not be sent directly to the PCM FM 530 when the write operation is received. For example, the data may be sent to the PCM FM 530 only when the location in the MSC near memory acting as an FM cache 518 in which the data is stored must be reused to store data for a different system memory address. When this happens, the MSC controller 512 notices that the data is not current in the PCM FM 530 and thus retrieves it from the near memory acting as an FM cache 518 and sends it to the PCM FM controller 521. The PCM controller 521 looks up the PDA for the system memory address and then writes the data to the PCM FM 530.
(2) Near memory bypass mode: In this mode, all reads and writes bypass the NM acting as an FM cache 518 and go directly to the PCM far memory 530. Such a mode may be used, for example, when an application is not cache-friendly or requires data to be committed to persistence at the granularity of a cache line. In one embodiment, the caching performed by the processor caches 503 and the caching performed by the NM acting as an FM cache 518 operate independently of one another. Consequently, data may be cached in the NM acting as an FM cache 518 that is not cached in the processor caches 503 (and that, in some cases, may not be permitted to be cached in the processor caches 503), and vice versa. Thus, certain data that may be designated as "uncacheable" in the processor caches 503 may be cached within the NM acting as an FM cache 518.
(3) Near memory read-cache write-bypass mode: This is a variation of the above mode in which read caching of persistent data from the PCM 519 is allowed (i.e., the persistent data is cached in the MSC 510 for read-only operations). This is useful when most of the persistent data is "read-only" and the application usage is cache-friendly.
(4) Near memory read-cache write-through mode: This is a variation of the previous mode in which, in addition to read caching, write hits are also cached. Every write to the MSC near memory 518 causes a write to the PCM far memory 530. Thus, due to the write-through nature of the cache, cache-line persistence is still guaranteed.
(5) Near memory direct access mode: In this mode, all or portions of the near memory are directly visible to software and form part of the system memory address space. Such memory may be completely under software control. Any data movement from the PCM memory 519 to this region of near memory requires explicit software copies. Such a scheme may create a non-uniform memory address (NUMA) memory domain for software in which it gets higher performance from the near memory 518 relative to the PCM far memory 530. Such a usage may be employed for certain high-performance computing (HPC) and graphics applications that require very fast access to certain data structures. This near memory direct access mode is equivalent to "pinning" certain cache lines in near memory. Such pinning may be done effectively in larger, multi-way, set-associative caches.
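The difference between the caching modes (1) through (4) lies mainly in when a write reaches far memory and whether a line is kept in near memory. The Python sketch below contrasts them under stated assumptions: a direct-mapped cache indexed by `addr % num_lines`, write allocation in write-back mode, and the hypothetical names `FarMemory`, `MemorySideCache`, and the mode strings. It is an illustration of the behavior described above, not the controller design itself; direct access mode (5) is omitted because it involves no caching.

```python
MODES = {"write_back", "bypass", "read_cache_write_bypass", "write_through"}

class FarMemory:
    """Stand-in for PCM far memory; counts the writes that reach it."""
    def __init__(self):
        self.cells = {}
        self.writes = 0
    def read(self, addr):
        return self.cells.get(addr)
    def write(self, addr, value):
        self.cells[addr] = value
        self.writes += 1

class MemorySideCache:
    """Direct-mapped near-memory cache in front of far memory."""
    def __init__(self, far, num_lines, mode):
        if mode not in MODES:
            raise ValueError(mode)
        self.far = far
        self.mode = mode
        self.lines = [None] * num_lines      # each entry: [addr, value, dirty]

    def _slot(self, addr):
        return addr % len(self.lines)

    def _hit(self, addr):
        line = self.lines[self._slot(addr)]
        return line if line is not None and line[0] == addr else None

    def _fill(self, addr, value, dirty):
        victim = self.lines[self._slot(addr)]
        if victim is not None and victim[0] != addr and victim[2]:
            self.far.write(victim[0], victim[1])   # write back the dirty victim
        self.lines[self._slot(addr)] = [addr, value, dirty]

    def read(self, addr):
        if self.mode == "bypass":
            return self.far.read(addr)
        line = self._hit(addr)
        if line is not None:
            return line[1]
        value = self.far.read(addr)
        self._fill(addr, value, dirty=False)
        return value

    def write(self, addr, value):
        if self.mode == "write_back":
            self._fill(addr, value, dirty=True)    # far memory updated on eviction
            return
        line = self._hit(addr)
        if line is not None:
            if self.mode == "write_through":
                line[1] = value                      # cache the write hit
            elif self.mode == "read_cache_write_bypass":
                self.lines[self._slot(addr)] = None  # invalidate the stale line
        self.far.write(addr, value)                  # write reaches far memory now
```

In write-back mode the `writes` counter of `FarMemory` stays at zero until a dirty line is evicted, whereas in the other three modes it increments on every write, which is what preserves cache-line persistence in those modes.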
Table A below summarizes each of the modes of operation described above.
Table A
For realizing processor and the chipset component of above operator scheme, comprise following:
(1) in two-layer storer (2LM) level, manage the memory side director cache 512 of nearly storer.
(2) in memory side high-speed cache 510, be identified for a class range register 545 (seeing Fig. 5 B) of the system address scope of each aforesaid operations pattern.
(3) confirm the mechanism of having write from PCM memory sub-system 519 to MSC controllers 515.
(5) make the invalid mechanism of row in nearly storer 518.
(5) dirty row expelled to PCM and make the invalid engine that refreshes in the regulation region of nearly memory address space.
In one embodiment, the memory ranges for each of the usage modes are contiguous in system address space. However, multiple disjoint regions may use the same mode. In one embodiment, each mode range register within the set of MSC RRs 545 provides the following information:
(1) the mode of operation (e.g., write-back, near memory bypass mode, etc.);
(2) the range base in the system address space (e.g., at 2 MB granularity or greater); and
(3) a range mask field which identifies the size of the region.
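One plausible way to decode such a register is a base/mask comparison, sketched below in Python. The encoding is an assumption: the description says only that a mask field identifies the region size, so the power-of-two sizing, the 56-bit address width, the helper names `make_register` and `lookup_mode`, and the default mode are all illustrative choices.

```python
from dataclasses import dataclass

GRANULE = 2 * 1024 * 1024   # 2 MB minimum granularity, per the text above
ADDR_BITS = 56              # assumed system address width

@dataclass(frozen=True)
class RangeRegister:
    mode: str   # e.g. "write_back", "bypass", "direct_access"
    base: int   # region base in the system address space
    mask: int   # address bits that must equal the base

    def matches(self, spa):
        return (spa & self.mask) == self.base

def make_register(mode, base, size):
    """Build a register for a power-of-two region aligned to its own size."""
    if size < GRANULE or size & (size - 1):
        raise ValueError("size must be a power of two of at least 2 MB")
    if base % size:
        raise ValueError("base must be aligned to the region size")
    mask = ((1 << ADDR_BITS) - 1) & ~(size - 1)
    return RangeRegister(mode, base, mask)

def lookup_mode(registers, spa, default="write_back"):
    """Return the mode of the first register covering spa, else the default."""
    for reg in registers:
        if reg.matches(spa):
            return reg.mode
    return default
```

With this encoding, the region size is recoverable from the mask alone, and a lookup costs one AND and one compare per register.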
In one embodiment, the number of modes supported is implementation-specific, but it is assumed that only one contiguous system address range is available for each mode of operation. If a near memory direct access range register is specified, it is assumed that this will be mapped to a contiguous region starting at the bottom of the near memory address space. Such a contiguous region must be smaller than the size of the near memory. Additionally, if any of the caching modes are in use, the direct access region size must be smaller than the near memory size to allow for adequate cache size for the required performance. Such allocation of near memory to the various modes may be configurable by the user.
In summary, one embodiment of the invention is implemented in accordance with the following set of operations:
(1) When any read or write access reaches the memory-side cache controller 512, it checks the range registers 545 (Fig. 5B) to determine the current mode of operation.
(2) For any read-cache/write-bypass operation, the MSC controller 512 checks whether the address is currently cached. If it is, it must invalidate the line before sending the write completion back to the source.
(3) For any write-bypass direct PCM operation, the MSC controller 512 awaits a completion back from the PCM controller 521 to ensure that the write is committed to a globally visible buffer.
(4) Any read or write to the direct access mode space in near memory is directed to the appropriate region of near memory. No transactions are sent to the PCM memory.
(5) Any change in the range register configuration to increase or decrease any existing region or to add a new region requires flushing of the appropriate cached regions to PCM. For example, if software wishes to increase the size of the direct access mode region by reducing the write-back cache region, it may do so by first evicting and invalidating the appropriate portion of the near memory region and then changing the near memory direct access mode range register. The MSC controller 510 will then know that future caching is done to a smaller near memory address space.
In Fig. 6 A illustration a specific embodiment of the present invention, wherein system physical address (SPA) space is carved up between a plurality of MSC.In the embodiment of institute's illustration, MSC high-speed cache 654 is associated with SAP region 667a with controller 656; MSC high-speed cache 655 is associated with SAP region 667b with controller 657; MSC high-speed cache 661 is associated with SAP region 667c with controller 663; And MSC high-speed cache 660 is associated with SAP region 667d with controller 662.Illustration two CPU 670 and 671, each CPU has 4 cores, be respectively 650 and 651, and there is home agent, be respectively 652 and 653.Two CPU 670 and 671 are coupled to public Memory Controller far away 666 via memory interface 659 far away and 665 respectively.
Thereby in Fig. 6 A, whole SAP storage space is subdivided into a plurality of regions, wherein each region is associated with concrete MSC and controller.In this embodiment, given MSC can have discontinuous SPA allocation of space, but any two MSC will not have overlapping SPA space.And, these MSC and not overlapping SAP space correlation, and do not need consistance technology between MSC.
On the framework shown in Fig. 6 A, can adopt above-described any nearly memory mode.For example, each MSC controller 656-657,662-663 can be configured to read at write back cache pattern, nearly storer bypass mode, nearly storer that high-speed cache is write bypass mode, nearly storer is read high-speed cache write through pattern or nearly direct memory access (DMA) pattern operation.As previously discussed, in range registers (RR) 655, for each MSC 610, stipulated concrete pattern.
In one embodiment, different MS C can realize different operation modes simultaneously.For example, the range registers of MSC controller 656 can be stipulated nearly direct memory access (DMA) pattern, the range registers of MSC controller 657 can be stipulated write back cache pattern, the range registers of MSC controller 662 can stipulate to read high-speed cache/write bypass mode, and MSC controller 663 can stipulate to read high-speed cache/write through pattern.In addition, in certain embodiments, each MSC can realize different operation modes simultaneously.For example, MSC controller 656 can be configured to realize nearly direct memory access (DMA) pattern and realize nearly storer bypass mode for other system address scope for some system address scope.
Aforementioned combination is only the illustration of the mode that can independently be programmed of MSC controller certainly.Basic principle of the present invention is not limited to these combinations or any other combination.
As described with respect to some of the embodiments above (e.g., that described with respect to Fig. 4G), an MSC and its MSC controller are configured to operate on the same memory channel (e.g., the same physical DDR bus) as the PCM DIMM responsible for that particular SPA range. Consequently, in this embodiment, memory transactions that occur within the designated SPA range are localized within the same memory channel, thereby reducing data traffic through the CPU mesh interconnect.
Fig. 6 B provides according to the how diagrammatic representation of configuration-system memory imaging 620, nearly memory imaging 621 and PCM address mapping 622 of embodiments of the invention.As previously discussed, MSC controller 606 is in the pattern operation by range registers (RR) 605 signs.The 3rd region 605 that system memory map 620 has the first area 602 that minute is used in nearly direct memory access (DMA) pattern, minute be used in the second area 603 of nearly storer bypass mode and minute be used in write back cache pattern.MSC controller 606 provides the access to nearly storer by the indication of nearly memory imaging 621, and nearly memory imaging 621 comprises the first area 608 of distributing to write back cache pattern and the second area 609 of distributing to nearly direct memory access (DMA) pattern.As exemplified, nearly memory cache bypass operation is provided directly to according to the PCM controller 610 of PCM address mapping 622 operations, and PCM address mapping 622 comprises nearly storer by-pass area 611 (for nearly storer bypass mode) and write back cache region 612 (for write back cache pattern).Thereby, can shine upon 622 in AD HOC subdivision system memory mapped 620, nearly memory imaging 621 and PCM address based on being realized by MSC controller.
Fig. 6 C and 6D illustration the addressing technique adopting in one embodiment of the invention (some of them may have been described in general manner).Specifically, Fig. 6 C shows system physical address (SPA) 675 and how to be mapped to nearly storage address (NMA) or PCM unit address (PDA).Specifically, first SPA decodes to identify home agent 605 (for example this home agent is responsible for the address space of decoding) by the decode logic 676 in processor.The decode logic 677 associated with selected home agent 605 further decoded and with generating identification, distributed to the address, storage channel (MCA) of the suitable MSC director cache 612 in that concrete SAP space SAP 675 (or its part).Then selected director cache 612 is mapped to nearly storage address 678 by memory access request, afterwards optionally at 680 interlace operation (the following describes), or alternatively, at the optional interlace operation of 679 execution, be for example, by PCM Memory Controller mapping far away 681 to PCM unit address PDA (using indirect addressing described above and consume management) afterwards.
Fig. 6 D illustration an embodiment of optional interleaving process, it shows, and can how to use interweaves splits the software page on a plurality of MSC and PCM address space.In the example shown in Fig. 6 D, two page 682-683 in SPA space interweave in MCA space, to generate two groups of row 685-686 that interweave by the cache line logic 685 that interweaves.For example, all odd-numbered lines (such as row 1,3,5 etc.) from locked memory pages 682-683 can be sent to a MCA space 685, and can be sent to the 2nd MCA space 686 from all even number lines (such as row 2,5,6 etc.) of locked memory pages 682-683.In one embodiment, these pages are pages of 5K byte, but basic principle of the present invention is not limited to any page size.According to the PCM controller 687-688 of indirect addressing table (AIT) and the operation of consume management logic, then rearrange the cache line (as mentioned above) in PCM unit address (PDA) storage space.Can be used for the interweaving of this character operating load to be distributed on MSC 610 and/or PCM device 619 (for example as non-uniform memory address (NUMA) alternative).
Embodiments of the invention may include various steps, which have been described above. The steps may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor to perform the steps. Alternatively, these steps may be performed by specific hardware components that contain hardwired logic for performing the steps, or by any combination of programmed computer components and custom hardware components.
As described herein, instructions may refer to specific configurations of hardware, such as application-specific integrated circuits (ASICs) configured to perform certain operations or having a predetermined functionality, or to software instructions stored in memory embodied in a non-transitory computer-readable medium. Thus, the techniques shown in the figures can be implemented using code and data stored and executed on one or more electronic devices (e.g., an end station, a network element, etc.). Such electronic devices store and communicate (internally and/or with other electronic devices over a network) code and data using computer machine-readable media, such as non-transitory computer machine-readable storage media (e.g., magnetic disks, optical disks, random access memory, read-only memory, flash memory devices, phase-change memory) and transitory computer machine-readable communication media (e.g., electrical, optical, acoustical, or other forms of propagated signals, such as carrier waves, infrared signals, and digital signals). In addition, such electronic devices typically include a set of one or more processors coupled to one or more other components, such as one or more storage devices (non-transitory machine-readable storage media), user input/output devices (e.g., a keyboard, a touchscreen, and/or a display), and network connections. The coupling of the set of processors and the other components is typically through one or more buses and bridges (also termed bus controllers). The storage device and the signals carrying the network traffic respectively represent one or more machine-readable storage media and machine-readable communication media. Thus, the storage device of a given electronic device typically stores code and/or data for execution on the set of one or more processors of that electronic device. Of course, one or more parts of an embodiment of the invention may be implemented using different combinations of software, firmware, and/or hardware.
Throughout this detailed description, for the purposes of explanation, numerous specific details were set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the invention may be practiced without some of these specific details. In certain instances, well-known structures and functions were not described in elaborate detail in order to avoid obscuring the subject matter of the present invention. Accordingly, the scope and spirit of the invention should be judged in terms of the claims which follow.

Claims (35)

1. A computer system, comprising:
a processor having a plurality of cores for executing instructions and processing data and one or more processor caches for caching instructions and data according to a first cache management policy;
a first memory channel comprising a first set of address/control and data lines coupled to the processor;
a second memory channel comprising a second set of address/control and data lines coupled to the processor;
a first-level first memory and a first-level second memory, each having a first set of characteristics associated therewith, the first set of characteristics including a first read access speed and a first write access speed, the first-level first memory being coupled to the first memory channel and the first-level second memory being coupled to the second memory channel; and
a second-level first memory communicatively coupled to the first memory channel and a second-level second memory communicatively coupled to the second memory channel, the second-level first memory and the second-level second memory having a second set of characteristics associated therewith, the second set of characteristics including a second read access speed and a second write access speed, at least one of which is relatively lower than the first read access speed or the first write access speed, respectively, and non-volatility, such that the second-level first memory and the second-level second memory maintain their content when power is removed, wherein at least a portion of the first-level first memory is configured as a cache for instructions and data stored in the second-level first memory, and at least a portion of the first-level second memory is configured as a cache for instructions and data stored in the second-level second memory.
2. The system of claim 1, wherein the first set of characteristics includes a first power consumption level and the second set of characteristics includes a second power consumption level, the second power consumption level being relatively lower than the first power consumption level.
3. The system of claim 1, wherein the first set of characteristics includes a first density and the second set of characteristics includes a second density, the second density being relatively higher than the first density.
4. The system of claim 1, wherein the second set of characteristics includes the second-level first memory and the second-level second memory being directly writable, such that existing data need not be erased before writing.
5. The system of claim 1, wherein the first-level first memory and the first-level second memory comprise dynamic random access memory (DRAM), and wherein the one or more processor caches comprise static random access memory (SRAM).
6. The system of claim 5, wherein the second-level first memory and the second-level second memory comprise phase change memory (PCM).
7. The system of claim 6, wherein the PCM comprises phase change memory and switch (PCMS) memory.
8. The system of claim 1, further comprising:
a mass storage device for persistently storing instructions and data, the mass storage device being communicatively coupled through an interface to the first-level first memory and the first-level second memory and to the second-level first memory and the second-level second memory.
9. The system of claim 1, wherein the first-level first memory and the first-level second memory are each logically subdivided into a first portion and a second portion, the first portion being allocated as system memory and the second portion being allocated as a cache, operated according to a second cache management policy, for instructions and data stored in the second-level first memory and the second-level second memory, respectively.
10. The system of claim 1, wherein the first write access speed is relatively higher than the second write access speed, but the first read access speed is similar to the second read access speed.
11. The system of claim 10, wherein the first write access speed is at least an order of magnitude faster than the second write access speed.
12. The system of claim 1, wherein the first set of characteristics includes a first read access latency and a first write access latency, and the second set of characteristics includes a second read access latency and a second write access latency, at least one of the second read access latency and the second write access latency being relatively higher than the first read access latency or the second write access latency, respectively.
13. The system of claim 1, wherein the second-level first memory and the second-level second memory are cheaper to manufacture per unit of size than the first-level first memory and the first-level second memory.
14. The system of claim 1, wherein the first cache management policy operates independently of the second cache management policy.
15. The system of claim 1, wherein the first memory channel and the second memory channel comprise double data rate (DDR) memory channels.
16. The system of claim 15, wherein the first-level first memory and the first-level second memory comprise a first dual in-line memory module (DIMM) and a second DIMM, and the second-level first memory and the second-level second memory comprise a third DIMM and a fourth DIMM, the first DIMM and the second DIMM being coupled to respective slots on the first memory channel, and the second DIMM and the third DIMM being coupled to respective slots on the second memory channel.
17. A computer system, comprising:
a processor having a plurality of cores for executing instructions and processing data, and one or more processor caches for caching instructions and data according to a first cache management policy;
a first memory channel comprising a set of address/control and data lines coupled to the processor;
a first-level memory having a first set of characteristics associated therewith, the first set of characteristics including a first read access speed and a first write access speed, the first-level memory being communicatively coupled to the first memory channel; and
a second-level memory communicatively coupled to the first memory channel, the second-level memory having a second set of characteristics associated therewith, the second set of characteristics including: a second read access speed and a second write access speed, at least one of which is relatively lower than the first read access speed or the first write access speed, respectively; non-volatility, such that the second-level memory retains its contents when power is removed; and random accessibility with byte addressability, such that instructions or data stored therein can be randomly accessed at a granularity equivalent to that used by the memory subsystem of the system.
18. The system of claim 17, wherein at least a portion of the first-level memory is configured as a cache for instructions and data stored in the second-level memory.
19. The system of claim 17, wherein the first set of characteristics includes a first power consumption level and the second set of characteristics includes a second power consumption level, the second power consumption level being relatively lower than the first power consumption level.
20. The system of claim 17, wherein the first set of characteristics includes a first density and the second set of characteristics includes a second density, the second density being relatively higher than the first density.
21. The system of claim 17, wherein the second set of characteristics includes the second-level memory being directly writable, such that existing data need not be erased before writing.
22. The system of claim 17, wherein the first-level memory comprises dynamic random access memory (DRAM), and wherein the one or more processor caches comprise static random access memory (SRAM).
23. The system of claim 22, wherein the second-level memory comprises phase change memory (PCM).
24. The system of claim 23, wherein the PCM comprises phase change memory and switch (PCMS).
25. The system of claim 17, further comprising:
a mass storage device for persistently storing instructions and data, the mass storage device being communicatively coupled through an interface to the first-level memory and the second-level memory.
26. The system of claim 17, wherein the first-level memory is logically subdivided into a first portion and a second portion, the first portion being allocated as system memory and the second portion being allocated as a cache, operated according to a second cache management policy, for instructions and data stored in the second-level memory.
27. The system of claim 17, wherein the first write access speed is relatively higher than the second write access speed, but the first read access speed is similar to the second read access speed.
28. The system of claim 27, wherein the first write access speed is at least an order of magnitude faster than the second write access speed.
29. The system of claim 17, wherein the first set of characteristics includes a first read access latency and a first write access latency, and the second set of characteristics includes a second read access latency and a second write access latency, at least one of the second read access latency and the second write access latency being relatively higher than the first read access latency or the second write access latency, respectively.
30. The system of claim 17, wherein the second-level memory is cheaper to manufacture per unit of size than the first-level memory.
31. The system of claim 17, wherein the first cache management policy operates independently of the second cache management policy.
32. The system of claim 17, wherein the memory channel comprises a double data rate (DDR) memory channel.
33. The system of claim 32, wherein the first-level memory comprises a first dual in-line memory module (DIMM) and the second-level memory comprises a second DIMM, the first DIMM and the second DIMM being coupled to respective slots on the first memory channel.
34. The system of claim 17, further comprising:
a second memory channel comprising a second set of address/control and data lines coupled to the processor; and
a first-level second memory and a second-level second memory, the first-level second memory having the same first set of characteristics associated therewith and the second-level second memory having the same second set of characteristics associated therewith, the first-level memory and the second-level memory being communicatively coupled to the second memory channel.
35. The system of claim 34, wherein at least a portion of the first-level memory coupled to the first memory channel is configured as a cache for instructions and data stored in the second-level memory coupled to the first memory channel, and wherein at least a portion of the first-level second memory coupled to the second memory channel is configured as a cache for instructions and data stored in the second-level second memory coupled to the second memory channel.
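Illustrative sketch (not part of the claims): the caching relationship recited in claims 1, 18 and 35 can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation. The class names FarMemory and Channel, the direct-mapped write-back organization, and the 64-byte line size are all hypothetical choices made for illustration; the claims do not specify a cache organization. Each Channel object stands for one memory channel whose volatile first-level memory caches a non-volatile second-level memory on the same channel.

```python
from dataclasses import dataclass, field


@dataclass
class FarMemory:
    """Second-level memory: non-volatile and byte-addressable (hypothetical model)."""
    size: int  # assumed to be a multiple of the cache line size
    cells: bytearray = field(init=False)

    def __post_init__(self):
        self.cells = bytearray(self.size)


@dataclass
class Channel:
    """One memory channel: first-level memory acts as a cache for far memory."""
    far: FarMemory
    num_lines: int            # cache lines held in first-level memory
    line_size: int = 64       # bytes per line (assumed value)
    lines: dict = field(default_factory=dict)  # index -> [tag, dirty, data]

    def _locate(self, addr):
        # Split an address into (cache index, tag, byte offset).
        line_no = addr // self.line_size
        return line_no % self.num_lines, line_no // self.num_lines, addr % self.line_size

    def _write_back(self, index, entry):
        # Copy a dirty line from first-level memory to far memory.
        base = (entry[0] * self.num_lines + index) * self.line_size
        self.far.cells[base:base + self.line_size] = entry[2]
        entry[1] = False

    def _fill(self, index, tag):
        # Return the cached line, fetching it from far memory on a miss.
        entry = self.lines.get(index)
        if entry is not None and entry[0] == tag:
            return entry
        if entry is not None and entry[1]:
            self._write_back(index, entry)  # evict the dirty victim first
        base = (tag * self.num_lines + index) * self.line_size
        entry = [tag, False, bytearray(self.far.cells[base:base + self.line_size])]
        self.lines[index] = entry
        return entry

    def read(self, addr):
        index, tag, offset = self._locate(addr)
        return self._fill(index, tag)[2][offset]

    def write(self, addr, value):
        index, tag, offset = self._locate(addr)
        entry = self._fill(index, tag)
        entry[2][offset] = value
        entry[1] = True  # mark dirty; far memory is updated on eviction or flush

    def flush(self):
        for index, entry in self.lines.items():
            if entry[1]:
                self._write_back(index, entry)

    def power_cycle(self):
        # First-level memory is volatile: cached lines are lost.
        # Far memory keeps whatever was written back before power was removed.
        self.lines.clear()
```

A system with two memory channels, as in claim 1, would simply hold two independent Channel objects, each pairing its own first-level cache with its own far memory. In this sketch, data that is still dirty in first-level memory at power loss is not recovered; only lines already written back (by eviction or flush) survive in far memory.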
CN201180075093.9A 2011-09-30 2011-09-30 Apparatus and method for implementing a multi-level memory hierarchy over common memory channels Expired - Fee Related CN103946826B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2011/054436 WO2013048500A1 (en) 2011-09-30 2011-09-30 Apparatus and method for implementing a multi-level memory hierarchy over common memory channels

Publications (2)

Publication Number Publication Date
CN103946826A CN103946826A (en) 2014-07-23
CN103946826B CN103946826B (en) 2019-05-31

Family

ID=47996231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201180075093.9A Expired - Fee Related CN103946826B (en) 2011-09-30 2011-09-30 Apparatus and method for implementing a multi-level memory hierarchy over common memory channels

Country Status (5)

Country Link
US (1) US9317429B2 (en)
EP (1) EP2761480A4 (en)
CN (1) CN103946826B (en)
TW (1) TWI594182B (en)
WO (1) WO2013048500A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776358A (en) * 2015-10-07 2017-05-31 三星电子株式会社 DIMM SSD address performance technologies
CN107291392A (en) * 2017-06-21 2017-10-24 郑州云海信息技术有限公司 Solid state drive and read/write method thereof
CN108292262A (en) * 2015-12-03 2018-07-17 华为技术有限公司 Computer storage management method and system
WO2018141174A1 (en) * 2017-02-03 2018-08-09 Huawei Technologies Co., Ltd. Systems and methods for utilizing ddr4-dram chips in hybrid ddr5-dimms and for cascading ddr5-dimms
CN111831216A (en) * 2019-04-18 2020-10-27 三星电子株式会社 Memory module including mirror circuit and method of operating the same
CN116483288A (en) * 2023-06-21 2023-07-25 苏州浪潮智能科技有限公司 Memory control equipment, method and device and server memory module

Families Citing this family (117)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5520747B2 (en) * 2010-08-25 2014-06-11 株式会社日立製作所 Information device equipped with cache and computer-readable storage medium
WO2013048493A1 (en) 2011-09-30 2013-04-04 Intel Corporation Memory channel that supports near memory and far memory access
WO2013089685A1 (en) 2011-12-13 2013-06-20 Intel Corporation Enhanced system sleep state support in servers using non-volatile random access memory
US9448922B2 (en) 2011-12-21 2016-09-20 Intel Corporation High-performance storage structures and systems featuring multiple non-volatile memories
US20130166865A1 (en) * 2011-12-22 2013-06-27 Alex Lemberg Systems and Methods for Managing Parallel Access to Multiple Storage Systems
KR101572403B1 (en) 2011-12-22 2015-11-26 인텔 코포레이션 Power conservation by way of memory channel shutdown
US9396118B2 (en) 2011-12-28 2016-07-19 Intel Corporation Efficient dynamic randomizing address remapping for PCM caching to improve endurance and anti-attack
US20140229659A1 (en) * 2011-12-30 2014-08-14 Marc T. Jones Thin translation for system access of non volatile semicondcutor storage as random access memory
US9418700B2 (en) 2012-06-29 2016-08-16 Intel Corporation Bad block management mechanism
US20140101370A1 (en) 2012-10-08 2014-04-10 HGST Netherlands B.V. Apparatus and method for low power low latency high capacity storage class memory
CN104704569B (en) * 2012-12-19 2017-11-14 慧与发展有限责任合伙企业 NVRAM path selection
US9652376B2 (en) 2013-01-28 2017-05-16 Radian Memory Systems, Inc. Cooperative flash memory control
US10445229B1 (en) 2013-01-28 2019-10-15 Radian Memory Systems, Inc. Memory controller with at least one address segment defined for which data is striped across flash memory dies, with a common address offset being used to obtain physical addresses for the data in each of the dies
CN103970219B (en) * 2013-01-30 2018-03-20 鸿富锦精密电子(天津)有限公司 Storage device and motherboard supporting the storage device
US9032099B1 (en) * 2013-12-12 2015-05-12 Intel Corporation Writeback mechanisms for improving far memory utilization in multi-level memory architectures
US9135184B2 (en) * 2013-12-12 2015-09-15 International Business Machines Corporation Load-through fault mechanism
WO2015116133A2 (en) * 2014-01-31 2015-08-06 Hewlett-Packard Development Company, L.P. Remapping memory locations in a memory array
US9773547B2 (en) 2014-01-31 2017-09-26 Hewlett Packard Enterprise Development Lp Non-volatile memory with multiple latency tiers
GB2524063B (en) 2014-03-13 2020-07-01 Advanced Risc Mach Ltd Data processing apparatus for executing an access instruction for N threads
JP6093322B2 (en) * 2014-03-18 2017-03-08 株式会社東芝 Cache memory and processor system
US10002044B2 (en) 2014-08-19 2018-06-19 Samsung Electronics Co., Ltd. Memory devices and modules
US10002043B2 (en) 2014-08-19 2018-06-19 Samsung Electronics Co., Ltd. Memory devices and modules
US9542118B1 (en) 2014-09-09 2017-01-10 Radian Memory Systems, Inc. Expositive flash memory control
US11232855B2 (en) * 2014-09-23 2022-01-25 Airstrip Ip Holdings, Llc Near-real-time transmission of serial patient data to third-party systems
US10126950B2 (en) * 2014-12-22 2018-11-13 Intel Corporation Allocating and configuring persistent memory
JP2016167215A (en) 2015-03-10 2016-09-15 株式会社東芝 Memory device
US10204047B2 (en) * 2015-03-27 2019-02-12 Intel Corporation Memory controller for multi-level system memory with coherency unit
US10115446B1 (en) 2015-04-21 2018-10-30 Spin Transfer Technologies, Inc. Spin transfer torque MRAM device with error buffer
US10073659B2 (en) 2015-06-26 2018-09-11 Intel Corporation Power management circuit with per activity weighting and multiple throttle down thresholds
US10387259B2 (en) 2015-06-26 2019-08-20 Intel Corporation Instant restart in non volatile system memory computing systems with embedded programmable data checking
US9916091B2 (en) 2015-07-13 2018-03-13 Samsung Electronics Co., Ltd. Memory system architecture
US10163479B2 (en) 2015-08-14 2018-12-25 Spin Transfer Technologies, Inc. Method and apparatus for bipolar memory write-verify
TWI662413B (en) * 2015-09-17 2019-06-11 慧榮科技股份有限公司 Storage device and data access method thereof
CN106547701B (en) * 2015-09-17 2020-01-10 慧荣科技股份有限公司 Memory device and data reading method
US10108549B2 (en) 2015-09-23 2018-10-23 Intel Corporation Method and apparatus for pre-fetching data in a system having a multi-level system memory
US10185501B2 (en) 2015-09-25 2019-01-22 Intel Corporation Method and apparatus for pinning memory pages in a multi-level system memory
US10261901B2 (en) 2015-09-25 2019-04-16 Intel Corporation Method and apparatus for unneeded block prediction in a computing system having a last level cache and a multi-level system memory
US10445003B2 (en) 2015-10-15 2019-10-15 SK Hynix Inc. Memory system for dualizing first memory based on operation mode
US11138120B2 (en) 2015-10-16 2021-10-05 SK Hynix Inc. Memory system
US9990283B2 (en) 2015-10-16 2018-06-05 SK Hynix Inc. Memory system
US10191664B2 (en) 2015-10-16 2019-01-29 SK Hynix Inc. Memory system
US9977605B2 (en) 2015-10-16 2018-05-22 SK Hynix Inc. Memory system
US9990143B2 (en) 2015-10-16 2018-06-05 SK Hynix Inc. Memory system
US10180796B2 (en) 2015-10-16 2019-01-15 SK Hynix Inc. Memory system
US9786389B2 (en) 2015-10-16 2017-10-10 SK Hynix Inc. Memory system
US10169242B2 (en) 2015-10-16 2019-01-01 SK Hynix Inc. Heterogeneous package in DIMM
US9977604B2 (en) 2015-10-16 2018-05-22 SK Hynix Inc. Memory system
US9977606B2 (en) 2015-10-16 2018-05-22 SK Hynix Inc. Memory system
US10466909B2 (en) 2015-10-16 2019-11-05 SK Hynix Inc. Memory system
US9792224B2 (en) 2015-10-23 2017-10-17 Intel Corporation Reducing latency by persisting data relationships in relation to corresponding data in persistent memory
US9824419B2 (en) * 2015-11-20 2017-11-21 International Business Machines Corporation Automatically enabling a read-only cache in a language in which two arrays in two different variables may alias each other
US10033411B2 (en) 2015-11-20 2018-07-24 Intel Corporation Adjustable error protection for stored data
US10303372B2 (en) 2015-12-01 2019-05-28 Samsung Electronics Co., Ltd. Nonvolatile memory device and operation method thereof
US10019367B2 (en) 2015-12-14 2018-07-10 Samsung Electronics Co., Ltd. Memory module, computing system having the same, and method for testing tag error thereof
US9847105B2 (en) 2016-02-01 2017-12-19 Samsung Electric Co., Ltd. Memory package, memory module including the same, and operation method of memory package
US10558570B2 (en) * 2016-03-14 2020-02-11 Intel Corporation Concurrent accesses of asymmetrical memory sources
US20180088853A1 (en) * 2016-09-26 2018-03-29 Intel Corporation Multi-Level System Memory Having Near Memory Space Capable Of Behaving As Near Memory Cache or Fast Addressable System Memory Depending On System State
US10866897B2 (en) * 2016-09-26 2020-12-15 Samsung Electronics Co., Ltd. Byte-addressable flash-based memory module with prefetch mode that is adjusted based on feedback from prefetch accuracy that is calculated by comparing first decoded address and second decoded address, where the first decoded address is sent to memory controller, and the second decoded address is sent to prefetch buffer
US10192601B2 (en) 2016-09-27 2019-01-29 Spin Transfer Technologies, Inc. Memory instruction pipeline with an additional write stage in a memory device that uses dynamic redundancy registers
US10628316B2 (en) * 2016-09-27 2020-04-21 Spin Memory, Inc. Memory device with a plurality of memory banks where each memory bank is associated with a corresponding memory instruction pipeline and a dynamic redundancy register
US10192602B2 (en) 2016-09-27 2019-01-29 Spin Transfer Technologies, Inc. Smart cache design to prevent overflow for a memory device with a dynamic redundancy register
US10437491B2 (en) 2016-09-27 2019-10-08 Spin Memory, Inc. Method of processing incomplete memory operations in a memory device during a power up sequence and a power down sequence using a dynamic redundancy register
US10546625B2 (en) 2016-09-27 2020-01-28 Spin Memory, Inc. Method of optimizing write voltage based on error buffer occupancy
US10446210B2 (en) 2016-09-27 2019-10-15 Spin Memory, Inc. Memory instruction pipeline with a pre-read stage for a write operation for reducing power consumption in a memory device that uses dynamic redundancy registers
US10437723B2 (en) 2016-09-27 2019-10-08 Spin Memory, Inc. Method of flushing the contents of a dynamic redundancy register to a secure storage area during a power down in a memory device
US10360964B2 (en) 2016-09-27 2019-07-23 Spin Memory, Inc. Method of writing contents in memory during a power up sequence using a dynamic redundancy register in a memory device
US10818331B2 (en) 2016-09-27 2020-10-27 Spin Memory, Inc. Multi-chip module for MRAM devices with levels of dynamic redundancy registers
US10366774B2 (en) 2016-09-27 2019-07-30 Spin Memory, Inc. Device with dynamic redundancy registers
US10460781B2 (en) 2016-09-27 2019-10-29 Spin Memory, Inc. Memory device with a dual Y-multiplexer structure for performing two simultaneous operations on the same row of a memory bank
US10229065B2 (en) * 2016-12-31 2019-03-12 Intel Corporation Unified hardware and software two-level memory
CN108733311B (en) * 2017-04-17 2021-09-10 伊姆西Ip控股有限责任公司 Method and apparatus for managing storage system
US10437482B2 (en) 2017-07-25 2019-10-08 Samsung Electronics Co., Ltd. Coordinated near-far memory controller for process-in-HBM
US10747463B2 (en) 2017-08-04 2020-08-18 Micron Technology, Inc. Apparatuses and methods for accessing hybrid memory system
US11188467B2 (en) 2017-09-28 2021-11-30 Intel Corporation Multi-level system memory with near memory capable of storing compressed cache lines
US10481976B2 (en) 2017-10-24 2019-11-19 Spin Memory, Inc. Forcing bits as bad to widen the window between the distributions of acceptable high and low resistive bits thereby lowering the margin and increasing the speed of the sense amplifiers
US10489245B2 (en) 2017-10-24 2019-11-26 Spin Memory, Inc. Forcing stuck bits, waterfall bits, shunt bits and low TMR bits to short during testing and using on-the-fly bit failure detection and bit redundancy remapping techniques to correct them
US10656994B2 (en) 2017-10-24 2020-05-19 Spin Memory, Inc. Over-voltage write operation of tunnel magnet-resistance (“TMR”) memory device and correcting failure bits therefrom by using on-the-fly bit failure detection and bit redundancy remapping techniques
US10529439B2 (en) 2017-10-24 2020-01-07 Spin Memory, Inc. On-the-fly bit failure detection and bit redundancy remapping techniques to correct for fixed bit defects
KR102406669B1 (en) 2017-11-08 2022-06-08 삼성전자주식회사 Memory controller and storage device including the same
US10860244B2 (en) 2017-12-26 2020-12-08 Intel Corporation Method and apparatus for multi-level memory early page demotion
US10395712B2 (en) 2017-12-28 2019-08-27 Spin Memory, Inc. Memory array with horizontal source line and sacrificial bitline per virtual source
US10360962B1 (en) 2017-12-28 2019-07-23 Spin Memory, Inc. Memory array with individually trimmable sense amplifiers
US10424726B2 (en) 2017-12-28 2019-09-24 Spin Memory, Inc. Process for improving photoresist pillar adhesion during MRAM fabrication
US10891997B2 (en) 2017-12-28 2021-01-12 Spin Memory, Inc. Memory array with horizontal source line and a virtual source line
US10811594B2 (en) 2017-12-28 2020-10-20 Spin Memory, Inc. Process for hard mask development for MRAM pillar formation using photolithography
US10395711B2 (en) 2017-12-28 2019-08-27 Spin Memory, Inc. Perpendicular source and bit lines for an MRAM array
US10424723B2 (en) 2017-12-29 2019-09-24 Spin Memory, Inc. Magnetic tunnel junction devices including an optimization layer
US10840436B2 (en) 2017-12-29 2020-11-17 Spin Memory, Inc. Perpendicular magnetic anisotropy interface tunnel junction devices and methods of manufacture
US10886330B2 (en) 2017-12-29 2021-01-05 Spin Memory, Inc. Memory device having overlapping magnetic tunnel junctions in compliance with a reference pitch
US10784439B2 (en) 2017-12-29 2020-09-22 Spin Memory, Inc. Precessional spin current magnetic tunnel junction devices and methods of manufacture
US10546624B2 (en) 2017-12-29 2020-01-28 Spin Memory, Inc. Multi-port random access memory
US10840439B2 (en) 2017-12-29 2020-11-17 Spin Memory, Inc. Magnetic tunnel junction (MTJ) fabrication methods and systems
US10367139B2 (en) 2017-12-29 2019-07-30 Spin Memory, Inc. Methods of manufacturing magnetic tunnel junction devices
US10438995B2 (en) 2018-01-08 2019-10-08 Spin Memory, Inc. Devices including magnetic tunnel junctions integrated with selectors
US10438996B2 (en) 2018-01-08 2019-10-08 Spin Memory, Inc. Methods of fabricating magnetic tunnel junctions integrated with selectors
US10446744B2 (en) 2018-03-08 2019-10-15 Spin Memory, Inc. Magnetic tunnel junction wafer adaptor used in magnetic annealing furnace and method of using the same
US10784437B2 (en) 2018-03-23 2020-09-22 Spin Memory, Inc. Three-dimensional arrays with MTJ devices including a free magnetic trench layer and a planar reference magnetic layer
US11107974B2 (en) 2018-03-23 2021-08-31 Spin Memory, Inc. Magnetic tunnel junction devices including a free magnetic trench layer and a planar reference magnetic layer
US11107978B2 (en) 2018-03-23 2021-08-31 Spin Memory, Inc. Methods of manufacturing three-dimensional arrays with MTJ devices including a free magnetic trench layer and a planar reference magnetic layer
US10529915B2 (en) 2018-03-23 2020-01-07 Spin Memory, Inc. Bit line structures for three-dimensional arrays with magnetic tunnel junction devices including an annular free magnetic layer and a planar reference magnetic layer
US11099995B2 (en) 2018-03-28 2021-08-24 Intel Corporation Techniques for prefetching data to a first level of memory of a hierarchical arrangement of memory
US11163707B2 (en) 2018-04-23 2021-11-02 International Business Machines Corporation Virtualization in hierarchical cortical emulation frameworks
US10411185B1 (en) 2018-05-30 2019-09-10 Spin Memory, Inc. Process for creating a high density magnetic tunnel junction array test platform
US10600478B2 (en) 2018-07-06 2020-03-24 Spin Memory, Inc. Multi-bit cell read-out techniques for MRAM cells with mixed pinned magnetization orientations
US10559338B2 (en) 2018-07-06 2020-02-11 Spin Memory, Inc. Multi-bit cell read-out techniques
US10593396B2 (en) 2018-07-06 2020-03-17 Spin Memory, Inc. Multi-bit cell read-out techniques for MRAM cells with mixed pinned magnetization orientations
US10692569B2 (en) 2018-07-06 2020-06-23 Spin Memory, Inc. Read-out techniques for multi-bit cells
US10650875B2 (en) 2018-08-21 2020-05-12 Spin Memory, Inc. System for a wide temperature range nonvolatile memory
US10699761B2 (en) 2018-09-18 2020-06-30 Spin Memory, Inc. Word line decoder memory architecture
US11621293B2 (en) 2018-10-01 2023-04-04 Integrated Silicon Solution, (Cayman) Inc. Multi terminal device stack systems and methods
US10971680B2 (en) 2018-10-01 2021-04-06 Spin Memory, Inc. Multi terminal device stack formation methods
US11107979B2 (en) 2018-12-28 2021-08-31 Spin Memory, Inc. Patterned silicide structures and methods of manufacture
US11055228B2 (en) 2019-01-31 2021-07-06 Intel Corporation Caching bypass mechanism for a multi-level memory
US11893276B2 (en) 2020-05-21 2024-02-06 Micron Technology, Inc. Apparatuses and methods for data management in a memory device
US11687468B2 (en) 2020-07-02 2023-06-27 International Business Machines Corporation Method and apparatus for securing memory modules
US11899590B2 (en) 2021-06-18 2024-02-13 Seagate Technology Llc Intelligent cache with read destructive memory cells
US11860670B2 (en) 2021-12-16 2024-01-02 Intel Corporation Accessing a memory using index offset information

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101237546A (en) * 2007-11-13 2008-08-06 东南大学 High-speed audio and video magnitude storage method and device for vehicular environment
US20080282032A1 (en) * 2006-07-18 2008-11-13 Xiaowei Shen Adaptive mechanisms and methods for supplying volatile data copies in multiprocessor systems
CN101957726A (en) * 2009-07-16 2011-01-26 恒忆有限责任公司 Phase change memory in a dual in-line memory module
CN101989183A (en) * 2010-10-15 2011-03-23 浙江大学 Method for realizing energy-saving storing of hybrid main storage

Family Cites Families (84)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4713755A (en) 1985-06-28 1987-12-15 Hewlett-Packard Company Cache memory consistency control with explicit software instructions
US5974576A (en) 1996-05-10 1999-10-26 Sun Microsystems, Inc. On-line memory monitoring system and methods
JP3210590B2 (en) 1996-11-29 2001-09-17 株式会社日立製作所 Multiprocessor system and cache coherency control method
US5917743A (en) 1997-10-17 1999-06-29 Waferscale Integration, Inc. Content-addressable memory (CAM) for a FLASH memory array
JP3098486B2 (en) 1998-03-31 2000-10-16 山形日本電気株式会社 Nonvolatile semiconductor memory device
US6202129B1 (en) 1998-03-31 2001-03-13 Intel Corporation Shared cache structure for temporal and non-temporal information using indicative bits
US6038166A (en) 1998-04-01 2000-03-14 Invox Technology High resolution multi-bit-per-cell memory
US5912839A (en) 1998-06-23 1999-06-15 Energy Conversion Devices, Inc. Universal memory element and method of programming same
US6868472B1 (en) 1999-10-01 2005-03-15 Fujitsu Limited Method of Controlling and addressing a cache memory which acts as a random address memory to increase an access speed to a main memory
US8341332B2 (en) 2003-12-02 2012-12-25 Super Talent Electronics, Inc. Multi-level controller with smart storage transfer manager for interleaving multiple single-chip flash memory devices
US6259627B1 (en) 2000-01-27 2001-07-10 Multi Level Memory Technology Read and write operations using constant row line voltage and variable column line load
US6704840B2 (en) 2001-06-19 2004-03-09 Intel Corporation Computer system and method of computer initialization with caching of option BIOS
US6804799B2 (en) 2001-06-26 2004-10-12 Advanced Micro Devices, Inc. Using type bits to track storage of ECC and predecode bits in a level two cache
US7752423B2 (en) 2001-06-28 2010-07-06 Intel Corporation Avoiding execution of instructions in a second processor by committing results obtained from speculative execution of the instructions in a first processor
EP1387274A3 (en) 2002-07-31 2004-08-11 Texas Instruments Incorporated Memory management for local variables
DE60306488D1 (en) 2003-02-27 2006-08-10 St Microelectronics Srl Built-in test procedure in a flash memory
US7475174B2 (en) 2004-03-17 2009-01-06 Super Talent Electronics, Inc. Flash / phase-change memory in multi-ring topology using serial-link packet interface
US7269708B2 (en) 2004-04-20 2007-09-11 Rambus Inc. Memory controller for non-homogenous memory system
US7590918B2 (en) 2004-09-10 2009-09-15 Ovonyx, Inc. Using a phase change memory as a high volume memory
US20070005922A1 (en) 2005-06-30 2007-01-04 Swaminathan Muthukumar P Fully buffered DIMM variable read latency
US7600078B1 (en) 2006-03-29 2009-10-06 Intel Corporation Speculatively performing read transactions
US7913147B2 (en) 2006-05-08 2011-03-22 Intel Corporation Method and apparatus for scrubbing memory
CN101512661B (en) 2006-05-12 2013-04-24 苹果公司 Combined distortion estimation and error correction coding for memory devices
US7756053B2 (en) 2006-06-30 2010-07-13 Intel Corporation Memory agent with error hardware
US7493439B2 (en) 2006-08-01 2009-02-17 International Business Machines Corporation Systems and methods for providing performance monitoring in a memory system
WO2008040028A2 (en) * 2006-09-28 2008-04-03 Virident Systems, Inc. Systems, methods, and apparatus with programmable memory control for heterogeneous main memory
US7555605B2 (en) 2006-09-28 2009-06-30 Freescale Semiconductor, Inc. Data processing system having cache memory debugging support and method therefor
US8683139B2 (en) 2006-10-31 2014-03-25 Hewlett-Packard Development Company, L.P. Cache and method for cache bypass functionality
US7818489B2 (en) 2006-11-04 2010-10-19 Virident Systems Inc. Integrating data from symmetric and asymmetric memory
US7554855B2 (en) * 2006-12-20 2009-06-30 Mosaid Technologies Incorporated Hybrid solid-state memory system having volatile and non-volatile memory
TW200845014A (en) 2007-02-28 2008-11-16 Aplus Flash Technology Inc A bit line structure for a multilevel, dual-sided nonvolatile memory cell array
US20080270811A1 (en) 2007-04-26 2008-10-30 Super Talent Electronics Inc. Fast Suspend-Resume of Computer Motherboard Using Phase-Change Memory
WO2008139441A2 (en) 2007-05-12 2008-11-20 Anobit Technologies Ltd. Memory device with internal signal processing unit
WO2008150927A2 (en) 2007-05-30 2008-12-11 Schooner Information Technology System including a fine-grained memory and a less-fine-grained memory
US8296534B1 (en) * 2007-06-29 2012-10-23 Emc Corporation Techniques for using flash-based memory in recovery processing
TWI327319B (en) 2007-07-03 2010-07-11 Macronix Int Co Ltd Double programming methods of a multi-level-cell nonvolatile memory
KR101498673B1 (en) 2007-08-14 2015-03-09 삼성전자주식회사 Solid state drive, data storing method thereof, and computing system including the same
US8108609B2 (en) * 2007-12-04 2012-01-31 International Business Machines Corporation Structure for implementing dynamic refresh protocols for DRAM based cache
EP2077559B1 (en) 2007-12-27 2012-11-07 Hagiwara Solutions Co., Ltd. Refresh method of a flash memory
TWI373768B (en) 2008-02-05 2012-10-01 Phison Electronics Corp System, controller and method for data storage
EP2128195A1 (en) 2008-05-27 2009-12-02 Borealis AG Strippable semiconductive composition comprising low melt temperature polyolefin
US20090313416A1 (en) 2008-06-16 2009-12-17 George Wayne Nation Computer main memory incorporating volatile and non-volatile memory
CN102150147A (en) 2008-07-03 2011-08-10 惠普开发有限公司 Memory server
JP5581577B2 (en) 2008-08-29 2014-09-03 富士通株式会社 Data processing device
US9152569B2 (en) 2008-11-04 2015-10-06 International Business Machines Corporation Non-uniform cache architecture (NUCA)
KR101001147B1 (en) 2008-12-12 2010-12-17 주식회사 하이닉스반도체 Phase change memory device
US8375241B2 (en) 2009-04-02 2013-02-12 Intel Corporation Method and system to improve the operations of a registered memory module
US8331857B2 (en) 2009-05-13 2012-12-11 Micron Technology, Inc. Wireless interface to program phase-change memories
US8250282B2 (en) 2009-05-14 2012-08-21 Micron Technology, Inc. PCM memories for storage bus interfaces
US8180981B2 (en) 2009-05-15 2012-05-15 Oracle America, Inc. Cache coherent support for flash in a memory hierarchy
US8504759B2 (en) 2009-05-26 2013-08-06 Micron Technology, Inc. Method and devices for controlling power loss
US20100306453A1 (en) 2009-06-02 2010-12-02 Edward Doller Method for operating a portion of an executable program in an executable non-volatile memory
US8159881B2 (en) 2009-06-03 2012-04-17 Marvell World Trade Ltd. Reference voltage optimization for flash memory
US9123409B2 (en) 2009-06-11 2015-09-01 Micron Technology, Inc. Memory device for a hierarchical memory architecture
US9208084B2 (en) * 2009-06-29 2015-12-08 Oracle America, Inc. Extended main memory hierarchy having flash memory for page fault handling
JP2011022657A (en) 2009-07-13 2011-02-03 Fujitsu Ltd Memory system and information processor
WO2011007599A1 (en) 2009-07-17 2011-01-20 株式会社 東芝 Memory management device
US8077515B2 (en) 2009-08-25 2011-12-13 Micron Technology, Inc. Methods, devices, and systems for dealing with threshold voltage change in memory devices
WO2011042939A1 (en) 2009-10-09 2011-04-14 Hitachi, Ltd. Storage control device building a logical unit based on storage devices coupled with different switches
US8832415B2 (en) 2010-01-08 2014-09-09 International Business Machines Corporation Mapping virtual addresses to different physical addresses for value disambiguation for thread memory access requests
JP2011108306A (en) 2009-11-16 2011-06-02 Sony Corp Nonvolatile memory and memory system
US8230172B2 (en) 2009-12-03 2012-07-24 Intel Corporation Gather and scatter operations in multi-level memory hierarchy
US8489803B2 (en) 2009-12-14 2013-07-16 Smsc Holdings S.A.R.L. Efficient use of flash memory in flash drives
US8914568B2 (en) 2009-12-23 2014-12-16 Intel Corporation Hybrid memory architectures
US8612809B2 (en) 2009-12-31 2013-12-17 Intel Corporation Systems, methods, and apparatuses for stacked memory
US20110208900A1 (en) 2010-02-23 2011-08-25 Ocz Technology Group, Inc. Methods and systems utilizing nonvolatile memory in a computer system main memory
JP2011198091A (en) 2010-03-19 2011-10-06 Toshiba Corp Virtual address cache memory, processor, and multiprocessor system
US9189385B2 (en) 2010-03-22 2015-11-17 Seagate Technology Llc Scalable data structures for control and management of non-volatile storage
KR20110131781A (en) 2010-05-31 2011-12-07 삼성전자주식회사 Method for presuming accuracy of location information and apparatus for the same
US8649212B2 (en) 2010-09-24 2014-02-11 Intel Corporation Method, apparatus and system to determine access information for a phase change memory
US8838935B2 (en) 2010-09-24 2014-09-16 Intel Corporation Apparatus, method, and system for implementing micro page tables
US8612676B2 (en) 2010-12-22 2013-12-17 Intel Corporation Two-level system main memory
US20120221785A1 (en) 2011-02-28 2012-08-30 Jaewoong Chung Polymorphic Stacked DRAM Memory Architecture
US8462577B2 (en) 2011-03-18 2013-06-11 Intel Corporation Single transistor driver for address lines in a phase change memory and switch (PCMS) array
US8462537B2 (en) 2011-03-21 2013-06-11 Intel Corporation Method and apparatus to reset a phase change memory and switch (PCMS) memory cell
US8935484B2 (en) * 2011-03-31 2015-01-13 Hewlett-Packard Development Company, L.P. Write-absorbing buffer for non-volatile memory
US8607089B2 (en) 2011-05-19 2013-12-10 Intel Corporation Interface for storage device access over memory bus
CN102209262B (en) 2011-06-03 2017-03-22 中兴通讯股份有限公司 Method, device and system for scheduling contents
US8605531B2 (en) 2011-06-20 2013-12-10 Intel Corporation Fast verify for phase change memory with switch
US8463948B1 (en) 2011-07-01 2013-06-11 Intel Corporation Method, apparatus and system for determining an identifier of a volume of memory
US8767482B2 (en) 2011-08-18 2014-07-01 Micron Technology, Inc. Apparatuses, devices and methods for sensing a snapback event in a circuit
WO2013048470A1 (en) 2011-09-30 2013-04-04 Intel Corporation Statistical wear leveling for non-volatile system memory
US20130205065A1 (en) 2012-02-02 2013-08-08 Lsi Corporation Methods and structure for an improved solid-state drive for use in caching applications
US9311228B2 (en) 2012-04-04 2016-04-12 International Business Machines Corporation Power reduction in server memory system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080282032A1 (en) * 2006-07-18 2008-11-13 Xiaowei Shen Adaptive mechanisms and methods for supplying volatile data copies in multiprocessor systems
CN101237546A (en) * 2007-11-13 2008-08-06 东南大学 High-speed, high-capacity audio and video storage method and device for vehicular environments
CN101957726A (en) * 2009-07-16 2011-01-26 恒忆有限责任公司 Phase change memory in a dual in-line memory module
CN101989183A (en) * 2010-10-15 2011-03-23 浙江大学 Method for energy-efficient storage in hybrid main memory

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776358A (en) * 2015-10-07 2017-05-31 三星电子株式会社 DIMM SSD addressing performance techniques
CN106776358B (en) * 2015-10-07 2021-10-26 三星电子株式会社 DIMM SSD addressing performance techniques
CN108292262A (en) * 2015-12-03 2018-07-17 华为技术有限公司 Computer storage management method and system
WO2018141174A1 (en) * 2017-02-03 2018-08-09 Huawei Technologies Co., Ltd. Systems and methods for utilizing ddr4-dram chips in hybrid ddr5-dimms and for cascading ddr5-dimms
US10628343B2 (en) 2017-02-03 2020-04-21 Futurewei Technologies, Inc. Systems and methods for utilizing DDR4-DRAM chips in hybrid DDR5-DIMMs and for cascading DDR5-DIMMs
CN107291392A (en) * 2017-06-21 2017-10-24 郑州云海信息技术有限公司 Solid-state drive and read/write method thereof
CN111831216A (en) * 2019-04-18 2020-10-27 三星电子株式会社 Memory module including mirror circuit and method of operating the same
CN111831216B (en) * 2019-04-18 2024-04-05 三星电子株式会社 Memory module including mirror circuit and method of operating the same
CN116483288A (en) * 2023-06-21 2023-07-25 苏州浪潮智能科技有限公司 Memory control equipment, method and device and server memory module

Also Published As

Publication number Publication date
TWI594182B (en) 2017-08-01
CN103946826B (en) 2019-05-31
EP2761480A4 (en) 2015-06-24
US9317429B2 (en) 2016-04-19
US20130275682A1 (en) 2013-10-17
TW201329857A (en) 2013-07-16
WO2013048500A1 (en) 2013-04-04
EP2761480A1 (en) 2014-08-06

Similar Documents

Publication Publication Date Title
CN103946826A (en) Apparatus and method for implementing a multi-level memory hierarchy over common memory channels
CN103946812B (en) Apparatus and method for realizing multi-level memory hierarchy
CN103946811B (en) Apparatus and method for implementing a multi-level memory hierarchy with different operating modes
CN104115129A (en) System and method for intelligently flushing data from a processor into a memory subsystem
CN103999161B (en) Apparatus and method for phase change memory drift management
CN103946813B (en) Generation of far memory access signals based on usage statistic tracking
CN103988183B (en) Dynamic partial power-down of a memory-side cache in a 2-level memory hierarchy
CN104025060B (en) Memory channel that supports near memory and far memory access
CN104050112B (en) Instructions to mark the beginning and end of a non-transactional code region requiring write-back to persistent storage
CN104106057B (en) Method and system for providing instant responses to sleep state transitions with non-volatile random access memory
CN103946816B (en) Non-volatile random access memory (NVRAM) as a replacement for conventional mass storage
CN103946810B (en) Method and computer system for configuring partitions in a non-volatile random access memory device
CN104126181A (en) Thin translation for system access of non-volatile semiconductor storage as random access memory
CN104011691A (en) Non-volatile ram disk
CN104115230B (en) Computing device, method and system based on efficient PCMS flush mechanisms

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190531

Termination date: 20210930
