CN117331859A - System, apparatus, and method for memory controller architecture - Google Patents


Info

Publication number: CN117331859A
Application number: CN202310748981.7A
Authority: CN (China)
Prior art keywords: memory, access request, channels, controller, memory access
Legal status: Pending (assumed; not a legal conclusion)
Other languages: Chinese (zh)
Inventors: E. Confalonieri, S. S. Pawlowski, P. Estep
Current and original assignee: Micron Technology Inc
Application filed by Micron Technology Inc
Priority claimed from U.S. patent application Ser. No. 18/202,802 (published as US 2024/0004799 A1)
Classifications

    All classifications fall under G (Physics); G06 (Computing; calculating or counting); G06F (Electric digital data processing); G06F 12/00 (Accessing, addressing or allocating within memory systems or architectures); G06F 12/02 (Addressing or allocation; relocation); G06F 12/08 (in hierarchically structured memory systems, e.g. virtual memory systems):
    • G06F 12/0802 — Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0897 — Caches characterised by their organisation or structure, with two or more cache hierarchy levels
    • G06F 12/0853 — Multiple simultaneous or quasi-simultaneous cache accessing; cache with multiport tag or data arrays
    • G06F 12/126 — Replacement control using replacement algorithms with special data handling, e.g. priority of data or instructions, handling errors or pinning

Abstract

The present disclosure relates to systems, devices, and methods for memory controller architecture. An apparatus may include a plurality of memory devices and a memory controller coupled to the plurality of memory devices via a plurality of memory channels. The plurality of memory channels is organized into a plurality of channel groups. The memory controller includes a plurality of memory access request/response buffer groups, and each memory access request/response buffer group of the plurality of memory access request/response buffer groups corresponds to a different one of the plurality of channel groups.

Description

System, apparatus, and method for memory controller architecture
PRIORITY INFORMATION
The present application claims the benefit of U.S. Provisional Application No. 63/357,562, filed June 30, 2022, the contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates generally to semiconductor memories and methods, and more particularly, to apparatus, systems, and methods for memory controller architecture.
Background
Memory devices are typically provided as internal semiconductor integrated circuits in a computer or other electronic system. There are many different types of memory, including volatile and non-volatile memory. Volatile memory may require power to maintain its data (e.g., host data, error data, etc.) and includes Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Synchronous Dynamic Random Access Memory (SDRAM), and Thyristor Random Access Memory (TRAM), among others. Non-volatile memory may provide persistent data by retaining stored data when not powered and may include NAND flash memory, NOR flash memory, Ferroelectric Random Access Memory (FeRAM), and resistance-variable memory such as Phase Change Random Access Memory (PCRAM), Resistive Random Access Memory (RRAM), and Magnetoresistive Random Access Memory (MRAM), such as Spin Torque Transfer Random Access Memory (STT RAM), among others.
The memory device may be coupled to a host (e.g., a host computing device) to store data, commands, and/or instructions for use by the host when the computer or electronic system is operating. For example, data, commands, and/or instructions may be transferred between a host and a memory device during operation of a computing or other electronic system. The controller may be used to manage data, command, and/or instruction transfers between the host and the memory device.
Disclosure of Invention
An aspect of the present disclosure relates to an apparatus having a memory controller architecture, comprising: a plurality of memory devices; and a memory controller coupled to the plurality of memory devices via a plurality of memory channels; wherein the plurality of memory channels are organized into a plurality of channel groups; and wherein the memory controller comprises a plurality of memory access request/response buffer groups, and wherein each memory access request/response buffer group of the plurality of memory access request/response buffer groups corresponds to a different one of the plurality of channel groups.
Another aspect of the present disclosure relates to a memory controller, comprising: a front end portion configured to couple to a host via an interface; a back end portion configured to couple to a plurality of memory devices via a plurality of memory channels, wherein the plurality of memory channels are organized into a plurality of channel groups; and a central portion comprising a plurality of memory access request/response buffer groups, wherein each memory access request/response buffer group of the plurality of memory access request/response buffer groups corresponds to a different one of the plurality of channel groups.
Yet another aspect of the present disclosure relates to an apparatus having a memory controller architecture, comprising: a plurality of memory devices; and a memory controller coupled to the plurality of memory devices via a plurality of memory channels, wherein the plurality of memory channels are organized into a plurality of channel groups; wherein the memory controller comprises: a front end portion configured to: receive a memory access request from a host according to a Compute Express Link (CXL) protocol; and provide a memory access response to the host according to the CXL protocol; and a central portion comprising a plurality of memory access request/response buffer groups, wherein each memory access request/response buffer group of the plurality of memory access request/response buffer groups corresponds to a different one of the plurality of channel groups.
Drawings
FIG. 1 is a block diagram of a computing system including a memory controller according to several embodiments of the present disclosure.
FIG. 2 is a block diagram of a memory controller coupled to a plurality of memory devices.
FIG. 3 is a block diagram of a memory controller having a cache architecture that may operate in accordance with several embodiments of the present disclosure.
FIG. 4A is a block diagram of a portion of a memory controller having a memory access request/response buffer architecture in accordance with several embodiments of the present disclosure.
FIG. 4B is a block diagram of a portion of a memory controller having a memory access request/response buffer architecture in accordance with several embodiments of the present disclosure.
FIG. 5 is a block diagram of a portion of a memory controller having a memory access request/response buffer architecture in accordance with several embodiments of the present disclosure.
Detailed Description
Systems, devices, and methods related to memory controller architecture are described. The memory controller may be within a memory system, which may be a memory module, a storage device, or a hybrid of a memory module and a storage device. In various embodiments, a memory controller may include a memory access request/response buffer architecture that can reduce access latency compared to existing approaches. The memory controller may be coupled to a plurality of memory devices via a plurality of memory channels, which may be organized into a plurality of channel groups. The memory controller may include a plurality of memory access request/response buffer groups, wherein each memory access request/response buffer group of the plurality of memory access request/response buffer groups corresponds to a different one of the plurality of channel groups. In various embodiments, the memory controller is configured to operate the plurality of channel groups as independent, respective reliability, availability, and serviceability (RAS) channels. As further described herein, each channel group (e.g., each RAS channel) may (or may not) include an associated independent cache used in association with accessing the memory devices to which the memory controller is coupled.
In various previous approaches, the memory controller of a memory system includes memory access request/response buffers (e.g., read and/or write queues) in the portion of the memory controller that interfaces with the host (e.g., the front end portion). Memory access requests are then moved through the memory controller for execution at a back end portion that interfaces with the media (e.g., the memory devices). As the memory system approaches a "loaded" condition in which the various queues become more full, the front end queues may become congested, which may cause the front end memory access queues to become a bottleneck for the memory controller and/or the memory system, adversely affecting (e.g., increasing) latency. As an example, the latency caused by front end memory access queue congestion can increase significantly as the transfer rate from the host to the memory system increases.
Various embodiments of the present disclosure provide a controller architecture that can provide benefits over such existing approaches, such as improved (e.g., reduced) latency associated with memory accesses. Several embodiments include a memory controller having multiple memory access request/response buffer groups that are independently operable to service separate, non-overlapping physical address ranges. The request/response buffer architecture described herein can operate effectively and efficiently at multiple host interface speeds and transfer rates.
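To make the buffer-group idea concrete, the following is a minimal Python sketch, not the patented implementation: the even address-range partitioning, the request format, and the class and field names are all illustrative assumptions. It shows host requests being routed to independent per-group request buffers based on non-overlapping physical address ranges, so each group can be drained independently.

```python
# Hypothetical sketch: routing host memory access requests to independent
# per-channel-group request buffers, each serving a separate, non-overlapping
# physical address range. Names and the even partitioning are assumptions.
from collections import deque

class BufferGroupRouter:
    def __init__(self, num_groups, capacity_bytes):
        # Evenly partition the physical address space across the groups.
        self.range_size = capacity_bytes // num_groups
        self.request_buffers = [deque() for _ in range(num_groups)]

    def enqueue(self, request):
        # The buffer group is selected purely by which address range the
        # request falls in, so the groups can be serviced independently.
        group = request["addr"] // self.range_size
        self.request_buffers[group].append(request)
        return group

router = BufferGroupRouter(num_groups=4, capacity_bytes=1 << 32)
group = router.enqueue({"op": "read", "addr": 0xC000_0000})
```

Because the address ranges do not overlap, requests destined for different groups never contend for the same queue, which is the congestion-avoidance property the text describes.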
As used herein, the singular forms "a," "an," and "the" include singular and plural referents unless the context clearly dictates otherwise. Furthermore, the word "may" is used throughout this application in a permissive sense (i.e., having the potential to, being able to), not in a mandatory sense (i.e., must). The term "include," and derivations thereof, means "including, but not limited to." The term "coupled" means directly or indirectly connected. It is to be understood that data can be transmitted, received, or exchanged via electronic signals (e.g., current, voltage, etc.), and the phrase "a signal indicative of data" means the data itself being transmitted, received, or exchanged in a physical medium.
The figures herein follow a numbering convention in which the first or leading digit(s) correspond to the figure number and the remaining digits identify an element or component in the figure. Similar elements or components between different figures may be identified by the use of similar digits. For example, 110 may reference element "10" in FIG. 1, and a similar element may be referenced as 310 in FIG. 3. Analogous elements within a figure may be referenced with a hyphen and an additional number or letter; see, for example, elements 130-1, 130-2, …, 130-N in FIG. 1. Such analogous elements may be generally referenced without the hyphen and the additional number or letter. For example, elements 130-1, 130-2, …, 130-N may be collectively referenced as 130. As used herein, the designators "M," "N," and "X," particularly with respect to reference numerals in the figures, indicate that a number of the particular features so designated may be included. It is to be understood that elements shown in the various embodiments herein may be added, exchanged, and/or eliminated so as to provide a number of additional embodiments of the present disclosure. In addition, the proportion and relative scale of the elements provided in the figures are intended to illustrate certain embodiments of the present disclosure and should not be taken in a limiting sense.
Fig. 1 is a block diagram of a computing system 101 including a memory controller 100 according to several embodiments of the present disclosure. The memory controller 100 includes a front end portion 104, a central controller portion 110, and a back end portion 119. The computing system 101 includes a host 103 coupled to a memory controller 100 and memory devices 130-1, …, 130-N. Computing system 101 may be, for example, a High Performance Computing (HPC) data center among various other types of computing systems, such as servers, desktop computers, laptop computers, mobile devices, and the like.
Although not shown in FIG. 1, the front end portion 104 may include a physical layer (PHY) and a front end controller for interfacing with the host 103 via the bus 102, which may include a number of input/output (I/O) lanes. The bus 102 may include various combinations of data, address, and control buses, which may be separate buses or one or more combined buses. In at least one embodiment, the interface between the memory controller 100 and the host 103 may be a Peripheral Component Interconnect Express (PCIe) physical and electrical interface operated according to a Compute Express Link (CXL) protocol. As non-limiting examples, the bus 102 may be a PCIe 5.0 interface operated according to the CXL 2.0 specification or a PCIe 6.0 interface operated according to the CXL 3.0 specification.
CXL is a high-speed central processing unit (CPU)-to-device and CPU-to-memory interconnect designed to accelerate next-generation data center performance. CXL technology maintains memory coherency between the CPU memory space and memory on attached devices (e.g., accelerators, memory buffers, and smart I/O devices), which allows resource sharing for higher performance, reduced software stack complexity, and lower overall system cost. CXL is designed to be an industry open standard interface for high-speed communications, as accelerators are increasingly used to complement CPUs in support of emerging applications such as artificial intelligence and machine learning. CXL technology is built on the PCIe infrastructure, leveraging the PCIe physical and electrical interface to provide advanced protocols in areas such as an input/output (I/O) protocol, a memory protocol (e.g., initially allowing a host and an accelerator to share memory), and a coherency interface. CXL provides protocols with I/O semantics similar to PCIe (e.g., CXL.io), caching protocol semantics (e.g., CXL.cache), and memory access semantics (e.g., CXL.mem). CXL can support different CXL device types (e.g., Type 1, Type 2, and Type 3) that support various CXL protocols. Embodiments of the present disclosure are not limited to a particular CXL device type.
In the example shown in FIG. 1, the front end 104 includes a number of memory access request buffers 107 (REQ) and 108 (DATA_W) and a number of memory access response buffers 113 (RESP) and 114 (DATA_R). As an example, the request buffer 107 may be a read request buffer for queuing host read requests received from the host 103 to be executed by the controller 100 (e.g., by a memory channel controller of the back end 119) to read data from the memory devices 130. The request buffer 108 may be a write request buffer for queuing write requests, and corresponding data, received from the host to be executed by the controller 100 to write data to the memory devices 130. The response buffer 113 may be a write response buffer for queuing write responses to be provided from the controller 100 to the host 103. The response buffer 114 may be a read response buffer for queuing read responses, and corresponding data, to be provided from the controller 100 to the host 103. The buffers may be implemented as first-in first-out (FIFO) buffers; however, embodiments are not limited to a particular buffer type. In a number of embodiments, the buffers 107 and 108 may be referred to as master-to-slave (M2S) buffers since they involve transactions from the host 103 (e.g., master) to the memory controller 100 (e.g., slave), and the buffers 113 and 114 may be referred to as slave-to-master (S2M) buffers since they involve transactions from the controller 100 to the host 103.
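The four buffer roles described above can be modeled with a short sketch. The class and field names are illustrative assumptions, not structures defined by the CXL specification; each buffer is simply a FIFO, matching the example in the text.

```python
# Simplified model of the front end buffer pairs described above: M2S read
# and write request queues (REQ, DATA_W) and S2M write and read response
# queues (RESP, DATA_R), each implemented as a FIFO.
from collections import deque

class FrontEndBuffers:
    def __init__(self):
        self.read_requests = deque()    # REQ    (M2S: host read requests)
        self.write_requests = deque()   # DATA_W (M2S: write requests + data)
        self.write_responses = deque()  # RESP   (S2M: write completions)
        self.read_responses = deque()   # DATA_R (S2M: read responses + data)

    def host_read(self, addr):
        # Queue a host read request for execution by the back end.
        self.read_requests.append({"addr": addr})

    def host_write(self, addr, data):
        # Queue a host write request together with its data.
        self.write_requests.append({"addr": addr, "data": data})

fe = FrontEndBuffers()
fe.host_write(0x40, b"\xaa")
fe.host_read(0x40)
```

Requests are appended at one end and drained from the other, which is the FIFO discipline the text attributes to these buffers.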
The central controller 110 may be responsible for controlling various operations associated with executing memory access requests (e.g., read commands and write commands) from the host 103. For example, although not shown in FIG. 1, the central controller 110 may include a cache and various error circuitry (e.g., error detection and/or error correction circuitry) capable of generating error detection and/or error correction data for providing data reliability, among other RAS functionality, in association with writing data to and/or reading data from the memory devices 130. As further described herein, such error detection and/or correction circuitry may include Cyclic Redundancy Check (CRC) circuitry, Error Correction Code (ECC) circuitry, Redundant Array of Independent Disks (RAID) circuitry, and/or "chip kill" circuitry, for example. Also, as described further below, the cache may be implemented as multiple independent caches (e.g., a separate cache per channel group).
The back end portion 119 may include a number of memory channel controllers (e.g., media controllers) and a physical (PHY) layer that couples the memory controller 100 to the memory devices 130. As used herein, the term "PHY layer" generally refers to the physical layer in the Open Systems Interconnection (OSI) model of a computing system. The PHY layer may be the first (e.g., lowest) layer of the OSI model and may be used to transfer data over a physical data transmission medium. In various embodiments, the physical data transmission medium includes the memory channels 125-1, …, 125-N. The memory channels 125 may each be, for example, a 16-bit channel coupled to a single 16-bit (e.g., x16) device or to two 8-bit (e.g., x8) devices; embodiments are not limited to a particular back end interface. As another example, among various other possible bus configurations, the channels 125 may each also include a two-pin data mask inversion (DMI) bus. The back end portion 119 may exchange data (e.g., user data and error detection and/or correction data) with the memory devices 130 via the physical pins corresponding to the respective memory channels 125. As further described herein, in several embodiments, the memory channels 125 may be organized into a number of channel groups, with the memory channels of each group being accessed together in association with performing various memory access operations and/or error detection and/or correction operations.
Memory device 130 may be a Dynamic Random Access Memory (DRAM) device that operates according to a protocol such as low power double data rate (LPDDRx), which may be referred to herein as an LPDDRx DRAM device, an LPDDRx memory, or the like. "x" in LPDDRx refers to any of several protocols (e.g., LPDDR 5). However, embodiments are not limited to a particular type of memory device 130. For example, the memory device 130 may be a FeRAM device.
In some embodiments, the memory controller 100 may include a management unit 134 to initialize, configure, and/or monitor characteristics of the memory controller 100. The management unit 134 may include: an I/O bus to manage out-of-band data and/or commands; a management unit controller to execute instructions associated with initializing, configuring, and/or monitoring the characteristics of the memory controller; and a management unit memory to store data associated with initializing, configuring, and/or monitoring the characteristics of the memory controller 100. As used herein, the term "out-of-band" generally refers to a transmission medium that is different from a primary transmission medium of a network. For example, out-of-band data and/or commands may be data and/or commands transferred over a transmission medium different from the transmission medium used to transfer data within the network.
In various examples, memory access request/response buffers 107, 108, 113, and 114 may become congested, which may result in increased latency associated with host read and/or write access requests. As further described below in association with fig. 4A, 4B, and 5, various embodiments of the present disclosure may include implementing multiple sets of memory access request/response buffers, which may reduce or mitigate the latency associated with buffers 107, 108, 113, and 114. For example, the multiple sets may correspond to respective channel groups.
Fig. 2 is a block diagram of a memory controller 200 coupled to a plurality of memory devices 230. As shown in fig. 2, the controller 200 includes a front end portion 204, a central portion 210, and a back end portion 219. The controller 200 may be a controller, such as the controller 100 depicted in fig. 1.
The front end portion 204 includes a front end PHY 205 for interfacing with a host via the communication link 202 (which may be, for example, a CXL link). The front end 204 includes a front end controller 206 to manage the interface and communicate with a central controller 210. In embodiments in which link 202 is a CXL link, front-end controller 206 is configured to receive memory access requests for memory device 230 (e.g., from a host) according to the CXL protocol and to provide memory access responses corresponding to the memory access requests according to the CXL protocol (e.g., to the host).
Front-end controller 206 may include memory access request/response buffers 207, 208, 213, and 214, which may be similar to the respective buffers 107, 108, 113, and 114 described in fig. 1.
The controller 200 is coupled to the memory devices 230 via a number of memory channels 225. In this example, the memory channels 225 are organized into a number of channel groups 240-1, 240-2, …, 240-X. In this example, each channel group 240 includes "M" memory channels 225. For example, channel group 240-1 includes memory channels 225-1-1, 225-1-2, …, 225-1-M, channel group 240-2 includes memory channels 225-2-1, 225-2-2, …, 225-2-M, and channel group 240-X includes memory channels 225-X-1, 225-X-2, …, 225-X-M. Although each channel group is shown as including the same number of memory channels 225, embodiments are not so limited.
In this example, the back end portion 219 of the controller 200 includes a plurality of memory channel controllers (MCCs) 228 for interfacing with the memory devices 230 corresponding to the respective memory channels 225. As shown in FIG. 2, memory channel controllers 228-1-1, 228-1-2, …, 228-1-M corresponding to channel group 240-1 are coupled to the memory devices 230 via respective channels 225-1-1, 225-1-2, …, 225-1-M. As another example, the memory channel controllers of a channel group (e.g., 228-1-1, 228-1-2, …, 228-1-M) may be implemented as a single memory channel controller that drives the group's "M" memory channels. Although not shown in FIG. 2, the back end 219 also includes a PHY memory interface for coupling to the memory devices 230.
The respective channels 225 of each of the channel groups 240-1, 240-2, …, 240-X operate together for purposes of one or more RAS schemes. Accordingly, the channel groups 240 may be referred to as "RAS channels." In this example, the channel groups 240-1, 240-2, …, 240-X include respective error circuitry (RAS channel circuitry) 242-1, 242-2, …, 242-X. The error circuitry 242 may include various circuitry for error detection and/or error correction, which may include data recovery. The error circuitry 242 may include CRC circuitry, ECC circuitry, RAID circuitry, and/or chip kill circuitry, including various combinations thereof. The channel groups 240-1, 240-2, …, 240-X may be operated independently by the central controller 210, such that memory access requests and/or error operations may be performed separately (and concurrently) on the memory devices 230 corresponding to the respective channel groups 240.
The term "chip kill" generally refers to a form of error correction that protects a memory system (e.g., memory system 101 shown in fig. 1) from any single memory device 230 (chip) failure and multi-bit errors from any portion of a single memory chip. The chip kill circuitry may collectively correct errors in the data across a subset of memory devices 230 (e.g., corresponding to a subset of respective channel groups 240) with a desired chip kill protection.
An example chip kill implementation for a channel group 240 including eleven memory channels 225 (e.g., "M" = 11), corresponding to a bus width of 176 bits (16 bits/channel x 11 channels), may involve writing data to the memory devices 230 on eight of the eleven memory channels 225 and writing parity data to the memory devices 230 on three of the eleven memory channels 225. Four codewords may be written, each composed of eleven four-bit symbols, with each symbol belonging to a different memory channel/device. The first codeword may comprise the first four-bit symbol of each memory device 230, the second codeword may comprise the second four-bit symbol of each memory device 230, the third codeword may comprise the third four-bit symbol of each memory device 230, and the fourth codeword may comprise the fourth four-bit symbol of each memory device 230.
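The codeword construction just described can be sketched as follows. This is an illustrative sketch, not the patented implementation; the helper name, the input format (one 16-bit word per channel), and the example values are assumptions. Each of the four codewords takes exactly one 4-bit symbol from each of the eleven channels/devices, which is what lets a whole-device failure corrupt only one symbol per codeword.

```python
# Illustrative sketch: slicing eleven 16-bit channel words into four
# codewords of eleven 4-bit symbols, one symbol per memory channel/device.
def build_codewords(channel_words):
    """channel_words: eleven 16-bit values, one per memory channel
    (e.g., eight data channels followed by three parity channels)."""
    assert len(channel_words) == 11
    codewords = []
    for j in range(4):  # four codewords per 176-bit access
        # Symbol j of each channel occupies bit field [4*j, 4*j + 4).
        codewords.append([(w >> (4 * j)) & 0xF for w in channel_words])
    return codewords

# Eight data channels carrying 0x1234 and three parity channels carrying 0xABCD.
cws = build_codewords([0x1234] * 8 + [0xABCD] * 3)
```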
Three parity symbols may allow the chip kill circuitry (e.g., 242) to correct at most one symbol error in each codeword and to detect at most two symbol errors in each codeword. If only two parity symbols were added instead of three, the chip kill circuitry could still correct at most one symbol error, but could detect only one symbol error. In various embodiments, the data symbols and the parity symbols may be written to or read from the memory devices of the eleven channels (e.g., 225-1-1 through 225-1-11) concurrently. If a die fails, then only one symbol per codeword (the symbol from that memory device 230) will be corrupted, which allows the memory contents to be reconstructed despite the complete failure of one memory device 230. The aforementioned chip kill operation is considered "on-the-fly correction" because the data is corrected without initiating a repair process that would impact performance. In contrast to chip kill operations, which may not involve a repair operation, various RAID approaches are considered "check-and-recover correction" because a repair process is initiated to recover data subject to error. For example, if an error in a symbol of a RAID stripe is determined to be uncorrectable, the corresponding data may be recovered/reconstructed by reading the remaining user data of the stripe and XORing it with the stripe's corresponding parity data.
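The XOR-based stripe recovery mentioned at the end of the paragraph above can be sketched as a minimal illustration, assuming single-parity RAID over byte-sized stripe members; real RAID circuitry operates on much wider data, and the function names here are invented for the example.

```python
# Minimal illustration of "check-and-recover correction": a stripe member
# found uncorrectable is rebuilt by XORing the surviving user data of the
# stripe with the stripe's parity.
from functools import reduce

def make_parity(members):
    # Parity written alongside the user data of the stripe.
    return reduce(lambda a, b: a ^ b, members)

def recover(surviving_members, parity):
    # XORing the parity with all surviving members reproduces the lost one.
    return reduce(lambda a, b: a ^ b, surviving_members, parity)

stripe = [0x11, 0x22, 0x33, 0x44]
parity = make_parity(stripe)
# Suppose the member at index 2 (0x33) is determined to be uncorrectable:
rebuilt = recover(stripe[:2] + stripe[3:], parity)
```

Unlike the on-the-fly chip kill correction, this path requires explicitly reading the remaining stripe members, which is why the text characterizes it as a repair process.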
As shown in FIG. 2, each of the channel groups 240 may include memory channel data path circuitry (MEM_CH) 226 associated with the corresponding memory channels 225 of that channel group 240. For example, channel group 240-1 includes memory channel data path circuitry 226-1-1, 226-1-2, …, 226-1-M corresponding to the respective channels 225-1-1, 225-1-2, …, 225-1-M. Similarly, channel group 240-2 includes memory channel data path circuitry 226-2-1, 226-2-2, …, 226-2-M corresponding to the respective channels 225-2-1, 225-2-2, …, 225-2-M, and channel group 240-X includes memory channel data path circuitry 226-X-1, 226-X-2, …, 226-X-M corresponding to the respective channels 225-X-1, 225-X-2, …, 225-X-M. The data path circuitry 226 may include error circuitry for performing error detection and/or error correction on a particular memory channel 225; for example, the data path circuitry 226 may include CRC circuitry and/or ECC circuitry. That is, in contrast to the error circuitry 242, which may be associated with multiple channels 225 within a channel group 240, the error circuitry of the data path circuitry 226 may be associated with, or dedicated to, a particular memory channel 225.
As shown in fig. 2, the central controller 210 may include a Media Management Layer (MML) 212 that may be used to translate memory access requests according to a particular protocol, such as CXL compliant requests, into a protocol that is compliant with a particular central controller architecture and/or a particular type of memory media, such as memory device 230. Unlike the controller 100 shown in FIG. 1, which does not illustrate a cache in the central controller 110, the central controller 210 includes a cache 211 and associated cache controllers. The cache 211 may be used, for example, to temporarily store frequently accessed data (e.g., by a host).
The cache 211 may add latency to memory operations depending on various factors (e.g., transaction load, hit rate, etc.). For example, the cache 211 may operate efficiently at a particular transfer rate from the host (e.g., 32 GT/s); however, if the transfer rate from the host increases (e.g., to 64 GT/s) such that a clock speed corresponding to the cache 211 cannot keep pace with the increased transfer rate, the cache 211 may become a bottleneck. As another example, if the transfer rate between the front end 204 and the host (e.g., the host transfer rate) increases relative to the transfer rate between the front end 204 and the central controller 210, a memory access request queue (not shown) in the front end 204 of the controller 200 and/or a cache lookup request queue (not shown) in the central controller 210 may become full or overloaded.
As described further below, various embodiments of the present disclosure may provide a cache architecture that may reduce adverse effects (e.g., on latency) that may be caused, for example, by an increase in host transfer rate. For example, as shown in fig. 3, 4A, and 4B, various embodiments may include providing multiple separate caches (e.g., per channel group) that may be operated independently (e.g., by a central controller) to service more memory access requests per unit time than a single cache (e.g., multiple cache lookup operations may be performed in parallel on caches of respective channel groups).
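As an illustration of why separate per-channel-group caches can service more requests per unit time than a single shared cache, the following sketch shows lookups for different groups hitting independent cache structures. The address-to-group interleaving and the dictionary-backed cache are simplifying assumptions, not the disclosed design.

```python
# Hedged sketch: independent caches per channel group let lookups for
# requests targeting different groups proceed in parallel rather than
# serializing through one shared cache.
class ChannelGroupCaches:
    def __init__(self, num_groups):
        self.num_groups = num_groups
        self.caches = [{} for _ in range(num_groups)]  # addr -> cached data

    def group_of(self, addr):
        # Interleave addresses across the channel groups (an assumption).
        return addr % self.num_groups

    def lookup(self, addr):
        # Only the target group's cache is consulted; in hardware, lookups
        # in the other groups' caches could run concurrently.
        return self.caches[self.group_of(addr)].get(addr)

    def fill(self, addr, data):
        self.caches[self.group_of(addr)][addr] = data

caches = ChannelGroupCaches(num_groups=4)
caches.fill(0x100, b"hot line")
hit = caches.lookup(0x100)   # maps to group 0
miss = caches.lookup(0x101)  # maps to group 1, not cached
```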
Fig. 3 is a block diagram of a memory controller 300 having a cache architecture that may operate in accordance with several embodiments of the present disclosure. Memory controller 300 is similar to memory controller 200 shown in FIG. 2, except that cache 211 in FIG. 2 is replaced with a plurality of separate and independently operated caches 311-1, 311-2, …, 311-X corresponding to respective channel groups (e.g., RAS channels) 340-1, 340-2, …, 340-X. In embodiments in which the channel group includes a separate and independently operating cache, the channel group may be referred to as a "cached RAS channel".
Thus, as shown in fig. 3, the controller 300 includes a front end portion 304, a central portion 310, and a back end portion 319. The front end portion 304 includes a front end PHY 305 for interfacing with a host via the communication link 302 (which may be, for example, a CXL link). The front end 304 includes a front end controller 306 to manage the interface and communicate with the central controller 310. In an embodiment in which link 302 is a CXL link, front-end controller 306 is configured to receive a memory access request for memory device 330 (e.g., from a host) in accordance with the CXL protocol and to provide a memory access response corresponding to the memory access request (e.g., to the host) in accordance with the CXL protocol. Front-end controller 306 may include memory access request/response buffers 307, 308, 313, and 314, which may be similar to respective buffers 107, 108, 113, and 114 described in FIG. 1.
The controller 300 is coupled to a memory device 330 via a number of memory channels 325. In this example, memory channels 325 are organized into a number of channel groups 340-1, 340-2, …, 340-X. In this example, each channel group 340 includes "M" memory channels 325. For example, channel group 340-1 includes memory channels 325-1-1, 325-1-2, …, 325-1-M, channel group 340-2 includes memory channels 325-2-1, 325-2-2, …, 325-2-M, and channel group 340-X includes memory channels 325-X-1, 325-X-2, …, 325-X-M.
The back end portion 319 of the controller 300 includes a plurality of Memory Channel Controllers (MCCs) 328 for interfacing with memory devices 330 corresponding to respective memory channels 325. As shown in FIG. 3, memory channel controllers 328-1-1, 328-1-2, …, 328-1-M corresponding to channel group 340-1 are coupled to memory device 330 via respective channels 325-1-1, 325-1-2, …, 325-1-M. Although not shown in fig. 3, the back end 319 includes a PHY memory interface for coupling to the memory device 330.
The respective channels 325 of the channel groups 340-1, 340-2, …, 340-X operate together for the purpose of one or more RAS schemes. Thus, the channel groups 340 may be referred to as "RAS channels". In this example, the channel groups 340-1, 340-2, …, 340-X include respective error circuitry (RAS channel circuitry) 342-1, 342-2, …, 342-X. Error circuitry 342 may include various circuitry for error detection and/or error correction (which may include data recovery). Error circuitry 342 may also include CRC circuitry, ECC circuitry, RAID circuitry, and/or chip kill circuitry, including various combinations thereof. The channel groups 340-1, 340-2, …, 340-X may be independently operated by the central controller 310 such that memory access requests and/or error operations may be performed separately (and concurrently) on the memory devices 330 corresponding to the respective channel groups 340.
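As one concrete illustration of group-level data recovery of the kind error circuitry 342 may implement (the disclosure covers several schemes; XOR parity is only the simplest RAID-style example, and the helper names here are hypothetical): a parity stripe computed across the M channels of a group allows one failed channel's data to be reconstructed from the surviving channels.

```python
# Sketch (illustrative): XOR parity across the M channels of one channel
# group. If one channel's data is lost, it can be rebuilt from the
# surviving channels plus the parity.
from functools import reduce

def make_parity(chunks):
    """XOR equal-length byte strings (one per memory channel) column-wise."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

def recover(chunks, parity, failed):
    """Rebuild the data of channel `failed` from the survivors plus parity."""
    survivors = [d for i, d in enumerate(chunks) if i != failed]
    return make_parity(survivors + [parity])
```

Chip kill schemes are typically symbol-based rather than simple XOR, but the group-level structure is the same: redundancy spans the channels of one group, so each group can recover data independently of the others.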
As shown in fig. 3, each of the channel groups 340 may include memory channel data path circuitry (mem_ch) 326 associated with a corresponding memory channel 325 of the particular channel group 340. For example, channel group 340-1 includes memory channel data path circuitry 326-1-1, 326-1-2, …, 326-1-M corresponding to respective channels 325-1-1, 325-1-2, …, 325-1-M. Similarly, channel group 340-2 includes memory channel data path circuitry 326-2-1, 326-2-2, …, 326-2-M corresponding to respective channels 325-2-1, 325-2-2, …, 325-2-M, and channel group 340-X includes memory channel data path circuitry 326-X-1, 326-X-2, …, 326-X-M corresponding to respective channels 325-X-1, 325-X-2, …, 325-X-M. Data path circuitry 326 may include error circuitry corresponding to error detection or error correction on a particular memory channel 325. For example, the data path circuitry 326 may include CRC circuitry or ECC circuitry. That is, the error circuitry of the data path circuitry 326 may be associated with a particular memory channel 325 or dedicated to a particular memory channel 325 as compared to the error circuitry 342 that may be associated with multiple channels 325 within a channel group 340.
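The split between per-channel and group-level protection can be sketched as follows (hypothetical helper names; CRC-32 stands in for whatever CRC the per-channel data path circuitry actually uses):

```python
# Sketch (illustrative): per-memory-channel CRC, the kind of check the
# per-channel data path circuitry could apply to each transfer on a
# single channel, as distinct from group-level error circuitry that
# spans all M channels of a channel group.
import zlib

def attach_crc(payload):
    """Return (payload, crc) as it might travel on one memory channel."""
    return payload, zlib.crc32(payload)

def check_crc(payload, crc):
    """Detect corruption on a single channel's transfer."""
    return zlib.crc32(payload) == crc
```

A per-channel check of this kind localizes a detected error to one channel, which in turn tells the group-level recovery logic which channel's data to rebuild.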
As shown in fig. 3, the central controller 310 may include a Media Management Layer (MML) 312 that may be used to translate memory access requests according to a particular protocol, such as CXL compliant requests, into a protocol that is compliant with a particular central controller architecture and/or a particular type of memory media, such as memory device 330.
The central controller 310 includes a plurality of caches 311-1, 311-2, …, 311-X corresponding to respective channel groups 340-1, 340-2, …, 340-X. The caches 311 include associated cache controllers for independently operating the respective caches. Caches 311-1, 311-2, …, 311-X may be set-associative caches, for example. In various embodiments, the physical address regions associated with the caches 311 (e.g., assigned to the caches 311) do not overlap, which may ensure that all "X" caches 311 may access the memory device 330 concurrently.
A number of embodiments may include receiving a memory access request (e.g., a read or write request) from a host (e.g., host 103 shown in fig. 1) at the memory controller 300. The controller 300 may execute the memory access request by determining which of the caches 311 corresponds to the address of the access request. The controller may then execute the access request using the corresponding cache (e.g., 311-1), RAS channel circuitry (e.g., 342-1), memory channel data path circuitry (e.g., 326-1-1, 326-1-2, …, 326-1-M), and back-end memory channel controllers (e.g., 328-1-1, 328-1-2, …, 328-1-M) to access the corresponding memory device 330 via the corresponding memory channels (e.g., 325-1-1, 325-1-2, …, 325-1-M).
Fig. 4A is a block diagram of a portion of a memory controller having a memory access request/response buffer architecture in accordance with several embodiments of the present disclosure. Fig. 4A illustrates a front end portion 404 and a central controller portion 410 of a memory controller, such as the memory controller 300 shown in fig. 3. The back end portion (e.g., 319) is omitted for clarity.
The front end 404 includes a PHY 405 and a front end controller 406 for interfacing with a host via the link 402. In contrast to the examples shown in fig. 1, 2, and 3, the front end 404 of the embodiment shown in fig. 4A does not include a memory access request/response buffer. Instead, the memory access request/response buffers are implemented as multiple sets of memory access request/response buffers in the central controller 410.
In this example, the central controller 410 includes a first set of memory access request/response buffers 407-1, 408-1, 413-1 and 414-1, a second set of memory access request/response buffers 407-2, 408-2, 413-2 and 414-2, a third set of memory access request/response buffers 407-3, 408-3, 413-3 and 414-3 and a fourth set of memory access request/response buffers 407-4, 408-4, 413-4 and 414-4. The sets of memory access request/response buffers may correspond to respective channel groups (e.g., RAS channels) 340-1, 340-2, …, 340-X described in FIG. 3. As depicted in FIG. 3, each of the channel groups in FIG. 4A includes a respective corresponding independent cache 411-1, 411-2, 411-3, and 411-4 (e.g., four cached RAS channels). Each channel group may also include a corresponding error manager 450-1, 450-2, 450-3, and 450-4 for performing error detection/correction operations associated with reading data from and writing data to the memory devices corresponding to the respective channel group. Error manager 450 may include various error circuitry such as, for example, the CRC circuitry, ECC circuitry, RAID recovery circuitry, and chip kill circuitry described above, as well as other circuitry associated with providing data protection, reliability, integrity, authenticity, and the like.
In the example shown in fig. 4A, the central controller 410 includes a media management layer (MML 1) 412-1, which may include circuitry configured to translate requests from a host protocol that may not conform to the central controller 410 to a different protocol that conforms to the central controller 410. The central controller 410 includes additional media management layers 412-2-1, 412-2-2, 412-2-3, and 412-2-4 corresponding to respective channel groups. As shown in fig. 4A, media management layers 412-2-1, 412-2-2, 412-2-3, and 412-2-4 may include respective sets of memory access request/response buffers 407, 408, 413, and 414 and may include circuitry configured to provide additional functionality associated with, for example, translating requests between the buffer sets and corresponding caches 411-1, 411-2, 411-3, and 411-4.
Although the central controller 410 is illustrated with cached channel groups (e.g., cached RAS channels), embodiments are not so limited. For example, the central controller 410 may not include caches 411-1, 411-2, 411-3, and 411-4 (e.g., the central controller 410 may be uncached).
The embodiment described in fig. 4A may provide various benefits over existing memory controller architectures. For example, removing the request/response buffer from the front end 404 may reduce or eliminate queue congestion associated with the front end 404 that previously caused the front end 404 to become a bottleneck for controller and/or memory system latency. Moreover, providing multiple sets of request/response buffers instead of a single set of buffers may reduce latency by providing increased parallelism because multiple sets of buffers may operate concurrently in association with executing memory access requests.
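The concurrency benefit of multiple buffer sets can be sketched as follows (hypothetical structure; only a request and a response queue per group are modeled, for brevity): each channel group's buffer set drains independently, so several requests can make progress in the same scheduling round.

```python
# Sketch (hypothetical): one request/response buffer set per channel
# group in the central controller, drained independently, versus a
# single shared front-end queue that serializes everything.
from collections import deque

class BufferSet:
    """One channel group's buffers (request + response only, for brevity)."""
    def __init__(self):
        self.req = deque()
        self.resp = deque()

buffer_sets = [BufferSet() for _ in range(4)]

def enqueue(group, request):
    """Place a request in the buffer set of its channel group."""
    buffer_sets[group].req.append(request)

def service_round():
    """One scheduling round: every group with pending work makes progress."""
    serviced = 0
    for bs in buffer_sets:
        if bs.req:
            bs.resp.append(("done", bs.req.popleft()))
            serviced += 1
    return serviced
```

With a single queue, four pending requests would need four rounds; here, requests spread across three groups complete in one round, which is the parallelism the multiple buffer sets provide.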
Fig. 4B is a block diagram of a portion of a memory controller having a memory access request/response buffer architecture in accordance with several embodiments of the present disclosure. The embodiment depicted in fig. 4B is the same as that of fig. 4A, except that, in addition to the multiple sets of buffers of the central controller 410, the front-end controller 406 also includes memory access request/response buffers 407 (REQ), 408 (DATA_W), 413 (RESP), and 414 (DATA_R).
In several embodiments, the front-end buffers 407, 408, 413, and 414 may have a reduced depth as compared to the central controller buffers, and memory access requests may be passed through the front-end buffers to the central controller buffers to avoid front-end congestion. As an example, the front-end buffers may have a queue depth of 1. The embodiment shown in fig. 4B may be beneficial, for example, where the front end 404 is manufactured separately from the central controller 410. In such examples, the front-end buffers may be included within the front end, but it may be beneficial to operate them in a manner that facilitates using the multiple buffer sets provided in the central controller 410 in order to provide improved system latency.
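A depth-1 pass-through buffer of the kind described can be sketched as follows (hypothetical class and callback names): the front end holds each request only long enough to hand it to the central controller, so no backlog can form in the front end.

```python
# Sketch (hypothetical): a depth-1 front-end buffer that forwards each
# request straight through to the central controller's per-group buffer
# sets, so the front end itself never accumulates a backlog.
class FrontEndBuffer:
    def __init__(self, forward):
        self.slot = None        # queue depth of 1
        self.forward = forward  # callable into the central controller

    def push(self, request):
        if self.slot is not None:
            raise RuntimeError("depth-1 buffer is occupied")
        self.slot = request
        self.forward(self.slot)  # hand off immediately
        self.slot = None         # slot frees as soon as the hand-off completes
```

Queueing then happens only in the central controller's per-group buffer sets, which can be drained concurrently, rather than in a single front-end queue.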
Fig. 5 is a block diagram of a portion of a memory controller having a memory access request/response buffer architecture in accordance with several embodiments of the present disclosure. Fig. 5 illustrates a front end portion 504 and a central controller portion 510 of a memory controller, such as the memory controller 300 shown in fig. 3. The back end portion (e.g., 319) is omitted for clarity.
The front end 504 includes a PHY 505 and a front end controller 506 for interfacing with a host via the link 502. In contrast to the examples described in fig. 1, 2, 3, 4A, and 4B, the front end 504 of the embodiment shown in fig. 5 includes multiple sets of memory access request/response buffers.
In this example, the front end 504 includes a first set of memory access request/response buffers 507-1, 508-1, 513-1, and 514-1, a second set of memory access request/response buffers 507-2, 508-2, 513-2, and 514-2, a third set of memory access request/response buffers 507-3, 508-3, 513-3, and 514-3, and a fourth set of memory access request/response buffers 507-4, 508-4, 513-4, and 514-4. The sets of memory access request/response buffers may correspond to respective channel groups (e.g., RAS channels) 340-1, 340-2, …, 340-X described in FIG. 3.
In FIG. 5, the channel groups include respective corresponding error managers 550-1, 550-2, 550-3, and 550-4 for performing error detection/correction operations associated with reading data from and writing data to memory devices corresponding to the respective channel groups. Error manager 550 may include various error circuitry such as, for example, CRC circuitry, ECC circuitry, RAID recovery circuitry, and chip kill circuitry described above, as well as other circuitry associated with providing data protection, reliability, integrity, authenticity, and the like. Although the group of channels shown in the example of fig. 5 does not include a respective corresponding cache (e.g., per RAS channel), embodiments are not so limited. For example, central controller 510 may include separate and independently operated caches per channel group, such as shown in fig. 4A and 4B.
In the example shown in fig. 5, front end 504 includes a media management layer (MML 1) 512-1, which may include a processor configured to translate requests from a host protocol (e.g., CXL protocol) that may not be compliant with operating multiple sets of memory access request/response buffers to a different protocol that is compliant with operating multiple sets of memory access request/response buffers. In this example, front end 504 includes additional media management layers 512-2-1, 512-2-2, 512-2-3, and 512-2-4 corresponding to respective channel groups. As shown in fig. 5, media management layers 512-2-1, 512-2-2, 512-2-3, and 512-2-4 may include respective sets of memory access request/response buffers 507, 508, 513, and 514 and may include circuitry configured to provide additional functionality associated with translating requests, for example, between buffer sets and central controller 510.
The embodiment described in FIG. 5 may provide various benefits over existing memory controller architectures. For example, providing multiple request/response buffers in the front end 504 may reduce or eliminate queue congestion associated with a single set of memory access request/response buffers that previously caused the front end 504 to become a bottleneck for controller and/or memory system latency.
The various methods described herein may be performed by processing logic that may comprise hardware (e.g., a processing device, circuitry, dedicated logic, programmable logic, microcode, hardware of the device, integrated circuits, etc.), software (e.g., instructions run or executed on a processing device), or a combination thereof. The order of the processes may be modified unless otherwise specified. Thus, the illustrated embodiments should be understood as examples only, and the described processes may be performed in a different order, and some processes may be performed in parallel. Additionally, in various embodiments, one or more processes may be omitted. Thus, not all processes are required in every embodiment. Other process flows are possible.
Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that an arrangement calculated to achieve the same results may be substituted for the specific embodiments shown. This disclosure is intended to cover adaptations or variations of one or more embodiments of the present disclosure. It is to be understood that the above description has been made in an illustrative manner, and not a restrictive one. Combinations of the above embodiments, and other embodiments not explicitly described herein, will be apparent to those of skill in the art upon reviewing the above description. The scope of one or more embodiments of the present disclosure includes other applications in which the above structures and processes are used. The scope of one or more embodiments of the disclosure should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
In the foregoing detailed description, certain features have been grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the disclosed embodiments of the disclosure have to use more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the detailed description, with each claim standing on its own as a separate embodiment.

Claims (16)

1. An apparatus having a memory controller architecture, comprising:
a plurality of memory devices (130-1, …, 130-N;230; 330); and
A memory controller (100; 200; 300) coupled to the plurality of memory devices via a plurality of memory channels (125-1, …, 125-N);
wherein the plurality of memory channels are organized into a plurality of channel groups (340-1, 340-2, …, 340-X); and is also provided with
wherein the memory controller comprises a plurality of memory access request/response buffer sets (307, 308, 313, 314), and wherein each memory access request/response buffer set of the plurality of memory access request/response buffer sets corresponds to a different one of the plurality of channel groups.
2. The apparatus of claim 1, wherein the memory controller is configured to access the plurality of memory access request/response buffer sets concurrently in association with performing memory access requests on memory devices corresponding to respective channel groups.
3. The apparatus of claim 1, wherein each memory access request/response buffer set of the plurality of memory access request/response buffer sets comprises:
a read request buffer (407-1, 407-2, 407-3, 407-4);
a write request buffer (408-1, 408-2, 408-3, 408-4);
a read response buffer (414-1, 414-2, 414-3, 414-4); and
a write response buffer (413-1, 413-2, 413-3, 413-4).
4. The apparatus of claim 3, wherein:
the read request buffer is a master-to-slave M2S request buffer according to the compute express link CXL protocol;
the write request buffer is an M2S request buffer according to the CXL protocol;
the read response buffer is a slave-to-master S2M response buffer according to the CXL protocol; and
the write response buffer is an S2M response buffer according to the CXL protocol.
5. The apparatus of any one of claims 1-4, wherein the plurality of channel groups comprise:
a first channel group comprising a first number of the plurality of memory channels; and
a second channel group comprising a second number of the plurality of memory channels;
wherein the first group of channels includes first error correction circuitry that is operated by the memory controller in association with accessing a memory device corresponding to the first number of the plurality of memory channels; and
wherein the second group of channels includes second error correction circuitry that is operated by the memory controller, independent of the first error correction circuitry, in association with accessing memory devices corresponding to the second number of the plurality of memory channels.
6. The apparatus of claim 5, wherein the plurality of channel groups comprise:
a third channel group comprising a third number of the plurality of memory channels;
wherein the third group of channels includes third error correction circuitry that is operated by the memory controller, in association with accessing memory devices corresponding to the third number of the plurality of memory channels, independent of the first and second error correction circuitry.
7. The apparatus of any of claims 1-4, wherein the memory controller comprises a plurality of independent caches (311-1, 311-2, …, 311-X;411-1, 411-2, 411-3, 411-4), wherein each cache of the plurality of independent caches corresponds to a different one of the plurality of channel groups, and wherein each respective channel group comprises at least two memory channels of the plurality of memory channels.
8. The apparatus of any one of claims 1 to 4, wherein the memory controller comprises a front end portion (104; 304; 404; 504) that includes a compute express link CXL controller (206; 306; 406) coupled to a signal path via a CXL link (102; 202; 302; 402; 502).
9. The apparatus of claim 8, wherein the front end portion does not include master-to-slave M2S and slave-to-master S2M buffers, and wherein the plurality of memory access request/response buffer sets are located in a central controller portion (410) of the memory controller.
10. The apparatus of claim 8, wherein the plurality of memory access request/response buffer sets are located in the CXL controller of the front-end portion of the memory controller.
11. The apparatus of any one of claims 1-4, wherein the memory controller is configured to:
operate the plurality of channel groups as independent respective reliability, availability, and serviceability RAS channels; and
implement one of a chip kill error correction scheme and a RAID error recovery scheme per RAS channel.
12. A memory controller (100; 200; 300), comprising:
a front end portion (104; 304;404; 504) configured to be coupled to a host (103) via an interface (102; 202;302;402; 502);
a back end portion (119; 219) configured to be coupled to a plurality of memory devices (130-1, …, 130-N;230; 330) via a plurality of memory channels (125-1, …, 125-N), wherein the plurality of memory channels are organized into a plurality of channel groups (340-1, 340-2, …, 340-X); a kind of electronic device with high-pressure air-conditioning system
A central portion (410) comprising a plurality of memory access request/response buffer sets, wherein each memory access request/response buffer set of the plurality of memory access request/response buffer sets corresponds to a different one of the plurality of channel groups.
13. The memory controller of claim 12, wherein the front-end portion includes a single separate memory access request/response buffer set having a queue depth that is less than a queue depth corresponding to the plurality of memory access request/response buffer sets.
14. The memory controller of claim 12, wherein the plurality of memory access request/response buffer sets are configured to operate in parallel in association with performing memory access requests in parallel on memory devices corresponding to the respective channel groups.
15. The memory controller of claim 12, wherein the central portion includes, for each respective channel group:
first error circuitry configured to implement error detection/correction across the plurality of memory channels corresponding to the respective channel group; and
second error circuitry configured to implement error detection/correction per memory channel.
16. An apparatus having a memory controller architecture, comprising:
a plurality of memory devices (130-1, …, 130-N;230; 330); and
A memory controller (100; 200; 300) coupled to the plurality of memory devices via a plurality of memory channels (125-1, …, 125-N), wherein the plurality of memory channels are organized into a plurality of channel groups (340-1, 340-2, …, 340-X);
wherein the memory controller comprises:
a front end portion (104; 304;404; 504) configured to:
receive a memory access request from a host (103) according to a compute express link CXL protocol; and
provide a memory access response to the host according to the CXL protocol; and
A central portion (410) comprising a plurality of memory access request/response buffer sets, wherein each memory access request/response buffer set of the plurality of memory access request/response buffer sets corresponds to a different one of the plurality of channel groups.
CN202310748981.7A 2022-06-30 2023-06-21 System, apparatus, and method for memory controller architecture Pending CN117331859A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US63/357,562 2022-06-30
US18/202,802 2023-05-26
US18/202,802 US20240004799A1 (en) 2022-06-30 2023-05-26 Memory controller architecture

Publications (1)

Publication Number Publication Date
CN117331859A true CN117331859A (en) 2024-01-02

Family

ID=89289118

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310748981.7A Pending CN117331859A (en) 2022-06-30 2023-06-21 System, apparatus, and method for memory controller architecture

Country Status (1)

Country Link
CN (1) CN117331859A (en)


Legal Events

Date Code Title Description
PB01 Publication