WO2025106273A1 - Dynamically configurable multi-ported dram - Google Patents

Info

Publication number
WO2025106273A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
memory
port
command
width
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/054013
Other languages
French (fr)
Inventor
Wendy Elsasser
Michael Raymond MILLER
Brent Steven Haukness
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rambus Inc
Original Assignee
Rambus Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rambus Inc
Publication of WO2025106273A1

Classifications

    • G11C7/1075 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers for multiport memories each having random access ports and serial ports, e.g. video RAM
    • G06F13/1605 Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/1668 Details of memory controller
    • G06F13/1689 Synchronisation and timing concerns
    • G11C11/4096 Input/output [I/O] data management or control circuits, e.g. reading or writing circuits, I/O drivers or bit-line switches
    • G11C7/1051 Data output circuits, e.g. read-out amplifiers, data output buffers, data output registers, data output level conversion circuits
    • G11C7/1078 Data input circuits, e.g. write amplifiers, data input buffers, data input registers, data input level conversion circuits
    • G11C7/22 Read-write [R-W] timing or clocking circuits; Read-write [R-W] control signal generators or management

Definitions

  • DRAM Dynamic Random-Access Memory
  • DRAM utilizes a shared data port for both reading and writing operations, a process managed by an external host.
  • the data port cannot handle read and write operations simultaneously, so read transactions delay write transactions, and vice versa. Switching between read and write transactions also produces a transition period, a so-called "turnaround time," during which the port is idle. Bidirectional data ports and their associated turnaround times impose undesirable latencies on memory transactions.
  • Figure 1 depicts a memory device 100 with a sixteen-pad data port 105 (signals DQ[15:0]) for sending and receiving read and write data from an array of bank groups BG[3:0], each of which includes four memory banks Bank[3:0] to store data.
  • Figure 2 is a timing diagram 200 illustrating a sequence of two read transactions for memory device 100 in the full-width mode illustrated in Figure 1.
  • Figure 3 depicts memory device 100, introduced in Figure 1, configured in a dual-port mode such that port 105 is fractionalized into two eight-bit ports DQ[15:8] and DQ[7:0] that communicate data signals in thirty-two-bit bursts.
  • Figure 4 is a timing diagram 400 illustrating interleaved read and write transactions in the dual-port mode illustrated in Figure 3.
  • Figure 5 depicts a memory system 500 in accordance with an embodiment in which a host (e.g. a memory controller) 505 dynamically adjusts the width fraction of a data channel 510 to a memory 100 of the type introduced in Figure 1.
  • a host e.g. a memory controller
  • Figure 6 is a timing diagram 600 illustrating how memory system 500 of Figure 5 can dynamically switch from a full-width mode to one of two fractionalized modes.
  • a memory device has a multi-pad data port for sending and receiving bursts of read and write data, respectively, to and from a requesting host (e.g. a memory controller).
  • the host can fractionalize the multi-pad data port into narrower ports to simultaneously communicate read and write data.
  • the memory device increases burst lengths in proportion to reductions in port width to maintain access granularity.
  • the width fractions of the narrower data ports can be controlled by the host responsive to e.g. a ratio of read to write transactions.
  • Figure 1 depicts a memory device 100 with a sixteen-pad data port 105 (signals DQ[15:0]) for sending and receiving read and write data from an array of bank groups BG[3:0], each of which includes four memory banks Bank[3:0] to store data.
  • memory device 100 can fractionalize data port 105 into two eight-pad data ports, fractions of port 105 labeled DQ[15:8] and DQ[7:0], that can separately and simultaneously communicate data signals in the same or opposite directions.
  • Each of bank groups BG[3:0] includes a bank-group multiplexer/demultiplexer (mux) 110.
  • Each of banks Bank[3:0] includes a respective 256-bit input/output sense amplifier IOSA[3:0] with a data latch 107 and a memory-bank port to communicate read and write data in parallel as data symbols.
  • the memory banks include rows and columns of dynamic, random-access memory cells (DRAM) storing one-bit binary symbols in this example.
  • IO sense amplifiers IOSA[3:0] communicate a column of 256 bits for each read or write access.
  • Mux 110 couples a 256-bit-wide data port, physically partitioned into two 128-bit-wide data ports connected to respective data buses IO-Peri[255:128] and IO-Peri[127:0], to one bank at a time.
  • memory device 100 can include error-correction-code (ECC) circuitry that encodes write data with ECC bits. During read operations, the ECC circuitry can identify and fix bit errors in the encoded data, enhancing the reliability of the memory.
  • ECC error-correction-code
  • Data port 105 is part of a memory interface 115 with a command decoder 120 that decodes read and write commands on a command port CMD and responsively controls read and write transactions.
  • Timing circuitry 125 employs an external clock signal CLK to calibrate internal clock signals ICLK for command decoder 120 and ICLK_DLL for a pair of serializers 130 that convert 128-bit-wide read data on buses IO-Peri[255:128] and IO-Peri[127:0] to sixteen-bit bursts of eight-bit-wide read data on respective ports DQ[15:8] and DQ[7:0].
  • timing circuitry 125 issues a complementary strobe signal DQS[1:0] that accompanies read data to provide the requesting host with timing information for sampling the read data.
  • timing circuitry 125 derives an internal timing reference IDQS for a pair of deserializers 135 that convert sixteen-bit bursts of eight-bit-wide write data on ports DQ[15:8] and DQ[7:0] to 128-bit-wide write data on respective internal data buses IO-Peri[255:128] and IO-Peri[127:0].
  • Interface 115 is equipped with equalizing receivers 140 and transmitters 145 that compensate for signal degradation, primarily caused by the channel (the physical medium through which the signal travels, like PCB traces or cables) between memory device 100 and the requesting host.
  • Muxes 110, illustrated as within bank groups BG[3:0], are configurable components of the data paths within interface 115.
  • a mode register 150 connected to command decoder 120 can be loaded with a value that places memory device 100 in one of two modes, a wide mode in which memory device 100 manages read and write transactions of sixteen-bit-wide data in sixteen-bit bursts on nodes DQ[15:0], and a fractionalized mode that divides data port 105 into a pair of eight-bit-wide ports that support separate and simultaneous eight-bit-wide read and write transactions in thirty-two-bit bursts on nodes DQ[15:8] and DQ[7:0].
  • Mode register 150 can be loaded once, such as during manufacturing or at start-up, or can be loaded responsive to host commands in support of dynamic reconfigurability based on e.g. the ratio of read to write transactions.
  • One-time programming can be accomplished using e.g. fuses or antifuses, while various forms of rewritable memory can be used in support of reconfigurability.
  • Memory device 100 is illustrated as supporting a read transaction in the wide mode by highlighting data paths in bold.
  • a read command on command port CMD specifies a column address for an open row in bank Bank0 in bank group BG0, causing sense amplifiers IOSA0 in bank group BG0 to sense 256 bits of read data and convey them to the neighboring mux 110.
  • Mux 110, part of the data interface for bank group BG0, outputs these 256 bits in two parts, half to each of data ports DQ[15:8] and DQ[7:0] via intervening data buses IO-Peri[255:128] and IO-Peri[127:0], serializers 130, and transmitters 145.
  • Figure 2 is a timing diagram 200 illustrating a sequence of two read transactions for memory device 100 in the full-width mode illustrated in Figure 1.
  • Signal names along the vertical axis correspond to ports or nodes in Figure 1.
  • a port is one or more circuit nodes where a signal occurs between communicating circuits.
  • data port 105 includes sixteen nodes that allow the transfer of sixteen bits of data simultaneously between memory device 100 and an external host.
  • Figure 3 depicts memory device 100, introduced in Figure 1, configured in a dual-port mode such that port 105 is fractionalized into two eight-bit ports DQ[15:8] and DQ[7:0] that communicate data signals in thirty-two-bit bursts. Each port can both read and write data as directed by the same command port CMD.
  • This illustration highlights separate read and write data paths using bold lines, a read path from bank Bank0 of bank group BG0 and a write path to bank Bank2 of bank group BG3.
  • mux 110 in bank group BG0 shifts out 256-bit data onto peripheral IO bus IO-Peri[255:128] as two successive 128-bit blocks to account for the halving of the number of traces.
  • Mux 110 in bank group BG3, operating as a demultiplexer, performs the opposite task of combining two successive 128-bit blocks into a single 256-bit block for the selected sense amplifier IOSA2.
  • the latch 107 associated with sense amplifier IOSA2 can latch write data to relieve write timing constraints.
  • Figure 4 is a timing diagram 400 illustrating interleaved read and write transactions in the dual-port mode illustrated in Figure 3.
  • Data port 105 is split into two independent half-width ports, a read port DQ[15:8] and a write port DQ[7:0].
  • the same command bus CMD services both of these ports. Commands directed to different ports can be issued back-to-back, as illustrated by the leftmost read and write commands RD and WR to banks Bank0 and Bank2. Commands directed to the same port are separated by a minimum command-to-command interval of tCCD_F, which is twice as long as interval tCCD_S in the full-width mode.
  • the burst length of data on ports DQ[15:8] and DQ[7:0] is doubled to thirty-two bits so the access granularity is unaffected by the halving of the data pads relative to the full-width mode of Figure 1.
  • the first transaction is initiated by a read command RD addressed to a bank Bank0 (of bank group BG0 in the highlighted read path of Figure 3).
  • Command decoder 120 decodes these command and address signals to provide the appropriate control signals on peripheral command bus CMD-Peri to enable Bank0 of bank group BG0 to prefetch a 256-bit column of data, which appears as a shaded block 405 on 256-bit IOSA[255:0](BG0).
  • the multiplexer 110 in bank group BG0 presents these data as consecutive sets of 128-bit-wide data 410A and 410B, shaded to match block 405, on peripheral data bus IO-Peri[255:128].
  • Latch 107 (Figure 1) holds at least the latter of the early and late portions 410A and 410B of the data burst on bus IO-Peri[255:128] so the respective IOSA can begin servicing a subsequent transaction.
  • Blocks 410A and 410B can be any exclusive 128-bit subsets of block 405 (e.g., every other bit or the most- and least-significant 128 bits).
  • serializer 130 connected to IO-Peri[255:128] then converts data 410A and 410B into an eight-bit-wide, thirty-two-bit burst 415, shaded to match blocks 405 and 410A/B, for transmission over data port DQ[15:8].
  • Two other read commands RD are illustrated using like-shaded collections of data blocks on the respective signal nodes.
  • the second transaction is initiated by a write command WR addressed to bank Bank2 of bank group BG3, again as illustrated in Figure 3.
  • a thirty-two-bit burst of write data 420 follows the write command, this by-eight port conveying 32x8=256 bits of write data.
  • the deserializer 135 associated with data port DQ[7:0] converts these data into consecutive sets of 128-bit-wide data 425A and 425B, shaded to match block 415, on peripheral data bus IO-Peri[127:0].
  • Command decoder 120 decodes the write command to provide the appropriate control signals on peripheral command bus CMD-Peri to enable the mux 110 in bank group BG3 to convert data 425A and 425B into 256-bit-wide data 430 for storage in bank Bank2.
  • Two other write commands WR are illustrated using like-shaded collections of data blocks on the respective signal nodes.
  • Figure 5 depicts a memory system 500 in accordance with an embodiment in which a host (e.g. a memory controller) 505 dynamically adjusts the width fraction of a data channel 510 to a memory 100 of the type introduced in Figure 1.
  • Host 505 includes command, timing, and data circuitry connected to command port CMD, strobe port DQS[1:0], and data ports DQ[15:8] and DQ[7:0], which connect to the like-identified ports on memory die 100 via a memory channel 510.
  • the data circuitry includes two data queues 515 and 520 that connect via a pair of multiplexer/demultiplexers 525 and 530 to ports DQ[15:8] and DQ[7:0].
  • data queue 515 conveys internal data signals DQiA to all sixteen data lanes DQ[15:0] via both muxes 525 and 530 under control of a selection signal DMux.
  • data queue 515 communicates internal read and write data signals DQiA with data lanes DQ[15:8] via mux 525 or lanes DQ[7:0] via mux 530.
  • Data queue 520 likewise communicates a second set of internal data signals DQiB with the remaining eight data lanes via the other of muxes 525 and 530.
  • Data-signal timing is referenced with respect to strobe signal DQS[1:0] in both the read and write directions and under control of timing circuitry 535.
  • Other circuitry is timed to an external clock signal CLK, the function of which is well understood by those of skill in the art.
  • the control circuitry conveys internal command and address signals CMDi to command port CMD using command paths directed to dual-port control logic 540.
  • a mux 545 directs commands CMDi accumulated in a command queue 550 before a mux 555 under control of dual-port control logic 540 allows scheduling logic 560 to issue commands CMD.
  • commands CMDi accumulate in two command queues, queue 550 as before plus a second command queue 565.
  • a second instance of scheduling logic 570 is also included.
  • Control logic 540 uses e.g. a command traffic profile, such as the ratio of read to write commands, queue depths, and latency requirements, to determine when to switch between dual- and single-port modes.
  • Host 505 then issues access commands that dynamically configure memory die 100 to manage each read and write transaction with a selected full-width or fractionalized port.
  • host 505 can issue separate commands to load mode register 150 on memory die 100 as necessary to accommodate mode changes via a separate command on port CMD or using a separate communication channel (not shown).
  • memory device 100 might be a DRAM die that sits in a memory hierarchy between a processor’s internal caches and larger, slower devices such as hard drives or solid-state drives (SSDs) used for longer-term storage. Every cache entry is read from device 100, but only changed cache entries are written back. Memory device 100 thus tends to perform more read transactions than write transactions. Moreover, many applications spend more time reading data than modifying or writing data. For instance, a user browsing the Internet may download and read much more content than they upload. Applications like streaming services, multimedia players, or document readers are more often retrieving and presenting data rather than writing new data. Memory system 500 can optimize overall performance by adjusting the ratio of write bandwidth to read bandwidth responsive to relative measures of read and write traffic.
  • SSDs solid-state drives
  • host 505 can maintain memory die 100 in the full-width mode for maximum read bandwidth, only occasionally making half of the data links available for write traffic. Read transactions can thus proceed, albeit at a reduced bandwidth, when memory system 500 services the occasional write transaction. A low read latency can thus be maintained.
  • Write transactions need not be rare, or even rare relative to read transactions, for host 505 to optimize channel 510 for whatever the ratio of read and write transactions; the two halfwidth channels can be used in support of independent reads or writes; and the fractionalized channels are not limited to two or to the same data widths in other embodiments.
  • the eight-bit data paths that service ports DQ[15:8] and DQ[7:0] in memory device 100 could further be divided into four-bit data paths.
  • Figure 6 is a timing diagram 600 illustrating how memory system 500 of Figure 5 can dynamically switch from a full-width mode to one of two fractionalized modes.
  • the full-width mode conveys read and write data over all sixteen lanes DQ[15:0].
  • in the fractional mode, one group of eight lanes is used for read data and the other for write data.
  • Each read and write command includes two bits (not shown) that designate the mode to memory device 100.
  • memory device 100 issues eight-bit-wide, sixteen-bit bursts for transmission over respective data ports DQ[15:8] and DQ[7:0].
  • the next read command is labeled RL (read, lower channel), which instructs memory device 100 to deliver the requested 256 bits of read data 605 on low-order data lanes DQ[7:0].
  • the multiplexer 110 in bank group BG1 (Figure 1) presents these data as consecutive sets of 128-bit-wide data 610A and 610B, shaded to match block 605, on peripheral data bus IO-Peri[127:0].
  • the data latch 107 in the selected bank holds the prefetched data in the IOSA while that data is transmitted on IO-Peri[127:0] to extend the burst length.
  • This extension allows the IOSA to begin prefetching data responsive to the next command before the extended burst is read out, and thus allows the IOSA to make the next read data 620 from the same bank available in IO-Peri[127:0] (data 625) after time tCCD_L.
  • the inclusion of latches 107 thus enables back-to-back, same-bank bursts.
  • the serializer 130 connected to IO-Peri[127:0] then converts data 610A and 610B into an eight-bit-wide, thirty-two-bit burst, shaded to match blocks 605 and 610A/B, for transmission over data port DQ[7:0]. Latches 107 are omitted in other embodiments, or a single latch can service a bank group.
  • Read command RL, by electing to use only eight lanes DQ[7:0], leaves upper (high-order) lanes DQ[15:8] available for a subsequent write command WH (write, high-order), which instructs memory device 100 to accept 256 bits of write data as an eight-bit-wide, thirty-two-bit burst 615 on data lanes DQ[15:8].
  • a turn-around time 625 separates the last read data from the incoming write data.
  • a subsequent read command RL produces read data 620 on lanes DQ[7:0].
  • Still more commands can switch memory device 100 back to the full width mode, in service of read or write transactions, or can reverse the roles of the high- and low-order DQ lanes so that write transactions can occur on lanes DQ[7:0].
  • data ports DQ[15:8] and DQ[7:0] can simultaneously communicate fractionalized data between host controller 505 and memory die 100 in the same or opposite directions across associated fractions of channel 510.
  • these subchannels can simultaneously service different hosts, conveying data in the same or opposite directions, while satisfying the same or different host requirements for e.g. quality of service.
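The host-side decision described above (control logic 540 switching between single- and dual-port modes based on, for example, the ratio of read to write commands) can be illustrated with a minimal sketch. The function name, the threshold value, and the policy itself are hypothetical; the source names the inputs to the decision but does not specify how they are combined.

```python
def choose_mode(pending_reads: int, pending_writes: int,
                minority_share: float = 0.1) -> str:
    """Pick a port mode from queue occupancy (hypothetical policy).

    Returns "full" when one direction dominates the pending traffic and
    "fractional" when both directions have a meaningful share, so that
    reads and writes can proceed at the same time on separate half ports.
    """
    total = pending_reads + pending_writes
    if total == 0:
        return "full"
    write_share = pending_writes / total
    # minority_share is an assumed tuning knob, not a value from the source.
    if minority_share <= write_share <= 1.0 - minority_share:
        return "fractional"
    return "full"


print(choose_mode(pending_reads=95, pending_writes=5))
print(choose_mode(pending_reads=60, pending_writes=40))
```

A real controller would presumably also weigh queue depths and latency requirements, which the text lists alongside the read-to-write ratio.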

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Computer Hardware Design (AREA)
  • Multimedia (AREA)
  • Dram (AREA)

Abstract

A memory device has a relatively wide, multi-pad data port for sending and receiving bursts of read and write data, respectively, to and from a requesting host. The host can fractionalize the multi-pad data port into narrower bidirectional ports to simultaneously communicate read and write data. The memory device increases burst lengths in proportion to reductions in port width to maintain access granularity. The width fractions of the narrower data ports can be controlled by the host responsive to e.g. a ratio of read to write transactions.

Description

Dynamically Configurable Multi-ported DRAM
FIELD OF THE INVENTION
[0001] The subject matter presented herein relates generally to computer memory systems, controllers, and devices.
BACKGROUND
[0002] DRAM, or Dynamic Random-Access Memory, utilizes a shared data port for both reading and writing operations, a process managed by an external host. The data port cannot handle read and write operations simultaneously, so read transactions delay write transactions, and vice versa. Switching between read and write transactions also produces a transition period, a so-called "turnaround time," during which the port is idle. Bidirectional data ports and their associated turnaround times impose undesirable latencies on memory transactions.
BRIEF DESCRIPTION OF THE DRAWINGS
[0003] Figure 1 depicts a memory device 100 with a sixteen-pad data port 105 (signals DQ[15:0]) for sending and receiving read and write data from an array of bank groups BG[3:0], each of which includes four memory banks Bank[3:0] to store data.
[0004] Figure 2 is a timing diagram 200 illustrating a sequence of two read transactions for memory device 100 in the full-width mode illustrated in Figure 1.
[0005] Figure 3 depicts memory device 100, introduced in Figure 1, configured in a dual-port mode such that port 105 is fractionalized into two eight-bit ports DQ[15:8] and DQ[7:0] that communicate data signals in thirty-two-bit bursts.
[0006] Figure 4 is a timing diagram 400 illustrating interleaved read and write transactions in the dual-port mode illustrated in Figure 3.
[0007] Figure 5 depicts a memory system 500 in accordance with an embodiment in which a host (e.g. a memory controller) 505 dynamically adjusts the width fraction of a data channel 510 to a memory 100 of the type introduced in Figure 1.
[0008] Figure 6 is a timing diagram 600 illustrating how memory system 500 of Figure 5 can dynamically switch from a full-width mode to one of two fractionalized modes.
DETAILED DESCRIPTION
[0009] A memory device has a multi-pad data port for sending and receiving bursts of read and write data, respectively, to and from a requesting host (e.g. a memory controller). The host can fractionalize the multi-pad data port into narrower ports to simultaneously communicate read and write data. The memory device increases burst lengths in proportion to reductions in port width to maintain access granularity. The width fractions of the narrower data ports can be controlled by the host responsive to e.g. a ratio of read to write transactions.
[0010] Figure 1 depicts a memory device 100 with a sixteen-pad data port 105 (signals DQ[15:0]) for sending and receiving read and write data from an array of bank groups BG[3:0], each of which includes four memory banks Bank[3:0] to store data. At start-up, or at the direction of a host, memory device 100 can fractionalize data port 105 into two eight-pad data ports, fractions of port 105 labeled DQ[15:8] and DQ[7:0], that can separately and simultaneously communicate data signals in the same or opposite directions. In the full-width mode, memory device 100 communicates read or write data signals DQ[15:0] of width sixteen in sixteen-bit bursts, and thus provides an access granularity of 256 bits (the product of data width and burst length, or 16x16=256). In the fractionalized mode, each fractionalized port DQ[15:8] and DQ[7:0] has access to all four bank groups BG[3:0] and communicates read or write data signals of width eight in thirty-two-bit bursts, and thus again provides an access granularity of 256 bits (8x32=256).
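The granularity arithmetic in the preceding paragraph can be checked with a short sketch. The function names are illustrative and do not come from the application; only the port widths (16 and 8) and burst lengths (16 and 32) are taken from the text.

```python
def access_granularity(port_width_bits: int, burst_length: int) -> int:
    """Bits moved per access: data width times burst length."""
    return port_width_bits * burst_length


def fractional_burst_length(full_width: int, full_burst: int,
                            fraction_width: int) -> int:
    """Burst length a narrower port needs to keep the full-width granularity."""
    granularity = access_granularity(full_width, full_burst)
    if granularity % fraction_width:
        raise ValueError("fraction width must divide the access granularity")
    return granularity // fraction_width


full = access_granularity(16, 16)                 # full-width mode
half_burst = fractional_burst_length(16, 16, 8)   # each eight-pad port
print(full, half_burst, access_granularity(8, half_burst))
```

The same relation suggests that a four-bit fraction would need a burst of sixty-four bits to keep the 256-bit granularity, although the application does not state that figure.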
[0011] Each of bank groups BG[3:0] includes a bank-group multiplexer/demultiplexer (mux) 110. Each of banks Bank[3:0] includes a respective 256-bit input/output sense amplifier IOSA[3:0] with a data latch 107 and a memory-bank port to communicate read and write data in parallel as data symbols. The memory banks include rows and columns of dynamic, random-access memory cells (DRAM) storing one-bit binary symbols in this example. IO sense amplifiers IOSA[3:0] communicate a column of 256 bits for each read or write access. Mux 110 couples a 256-bit-wide data port, physically partitioned into two 128-bit-wide data ports connected to respective data buses IO-Peri[255:128] and IO-Peri[127:0], to one bank at a time. Though not shown, memory device 100 can include error-correction-code (ECC) circuitry that encodes write data with ECC bits. During read operations, the ECC circuitry can identify and fix bit errors in the encoded data, enhancing the reliability of the memory.
[0012] Data port 105 is part of a memory interface 115 with a command decoder 120 that decodes read and write commands on a command port CMD and responsively controls read and write transactions. Timing circuitry 125 employs an external clock signal CLK to calibrate internal clock signals ICLK for command decoder 120 and ICLK_DLL for a pair of serializers 130 that convert 128-bit-wide read data on buses IO-Peri[255:128] and IO-Peri[127:0] to sixteen-bit bursts of eight-bit-wide read data on respective ports DQ[15:8] and DQ[7:0]. Also during read transactions, timing circuitry 125 issues a complementary strobe signal DQS[1:0] that accompanies read data to provide the requesting host with timing information for sampling the read data. In the write direction, timing circuitry 125 derives an internal timing reference IDQS for a pair of deserializers 135 that convert sixteen-bit bursts of eight-bit-wide write data on ports DQ[15:8] and DQ[7:0] to 128-bit-wide write data on respective internal data buses IO-Peri[255:128] and IO-Peri[127:0]. A detailed discussion of timing circuitry 125 is omitted because the generation and management of timing references for memory devices is well known. Interface 115 is equipped with equalizing receivers 140 and transmitters 145 that compensate for signal degradation, primarily caused by the channel (the physical medium through which the signal travels, like PCB traces or cables) between memory device 100 and the requesting host. Muxes 110, illustrated as within bank groups BG[3:0], are configurable components of the data paths within interface 115.
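As a behavioral illustration of serializers 130 and deserializers 135, the sketch below converts a 128-bit word into a sixteen-beat burst of eight-bit-wide data and back. The function names and the bit ordering (least-significant bits on the first beat) are assumptions; the application does not specify how bits map onto beats.

```python
def serialize(word: int, lanes: int = 8, burst_length: int = 16) -> list:
    """Split a (lanes * burst_length)-bit word into burst_length beats.

    Assumption: the least-significant `lanes` bits go out on the first beat.
    """
    if word < 0 or word >> (lanes * burst_length):
        raise ValueError("word does not fit in the burst")
    mask = (1 << lanes) - 1
    return [(word >> (beat * lanes)) & mask for beat in range(burst_length)]


def deserialize(beats: list, lanes: int = 8) -> int:
    """Reassemble beats (first beat = least-significant bits) into one word."""
    mask = (1 << lanes) - 1
    word = 0
    for beat, value in enumerate(beats):
        word |= (value & mask) << (beat * lanes)
    return word


word = (1 << 127) | 0xAB          # arbitrary 128-bit test pattern
beats = serialize(word)
assert deserialize(beats) == word
```

The same helpers model the thirty-two-bit bursts of the fractionalized mode by passing `burst_length=32` and a 256-bit word.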
[0013] A mode register 150 connected to command decoder 120 can be loaded with a value that places memory device 100 in one of two modes, a wide mode in which memory device 100 manages read and write transactions of sixteen-bit-wide data in sixteen-bit bursts on nodes DQ[15:0], and a fractionalized mode that divides data port 105 into a pair of eight-bit-wide ports that support separate and simultaneous eight-bit-wide read and write transactions in thirty-two-bit bursts on nodes DQ[15:8] and DQ[7:0]. Mode register 150 can be loaded once, such as during manufacturing or at start-up, or can be loaded responsive to host commands in support of dynamic reconfigurability based on e.g. the ratio of read to write transactions. One-time programming can be accomplished using e.g. fuses or antifuses, while various forms of rewritable memory can be used in support of reconfigurability.
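The two modes selected by mode register 150 can be modeled as a small lookup, sketched below. The class names, the numeric encodings, and the `one_time` flag are hypothetical; only the port names, widths, and burst lengths come from the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class PortMode(Enum):
    WIDE = 0             # hypothetical encoding, not from the source
    FRACTIONALIZED = 1   # hypothetical encoding, not from the source


@dataclass(frozen=True)
class PortConfig:
    ports: tuple
    width_bits: int
    burst_length: int


CONFIGS = {
    PortMode.WIDE: PortConfig(("DQ[15:0]",), 16, 16),
    PortMode.FRACTIONALIZED: PortConfig(("DQ[15:8]", "DQ[7:0]"), 8, 32),
}


class ModeRegister:
    """Behavioral stand-in for mode register 150."""

    def __init__(self, mode=PortMode.WIDE, one_time=False):
        self.mode = mode
        self.one_time = one_time   # models fuse or antifuse programming
        self.loaded = False

    def load(self, mode):
        if self.one_time and self.loaded:
            raise RuntimeError("one-time-programmable register already loaded")
        self.mode = mode
        self.loaded = True

    def config(self):
        return CONFIGS[self.mode]


reg = ModeRegister()
reg.load(PortMode.FRACTIONALIZED)
print(reg.config())
```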
[0014] Memory device 100 is illustrated as supporting a read transaction in the wide mode by highlighting data paths in bold. A read command on command port CMD specifies a column address for an open row in bank Bank0 in bank group BG0, causing sense amplifiers IOSA0 in bank group BG0 to sense 256 bits of read data and convey them to the neighboring mux 110. Mux 110, part of the data interface for bank group BG0, outputs these 256 bits in two parts, half to each of data ports DQ[15:8] and DQ[7:0] via intervening data buses IO-Peri[255:128] and IO-Peri[127:0], serializers 130, and transmitters 145.
[0015] Figure 2 is a timing diagram 200 illustrating a sequence of two read transactions for memory device 100 in the full-width mode illustrated in Figure 1. Signal names along the vertical axis correspond to ports or nodes in Figure 1. In this context, a port is one or more circuit nodes where a signal occurs between communicating circuits. For example, data port 105 includes sixteen nodes that allow the transfer of sixteen bits of data simultaneously between memory device 100 and an external host.
[0016] Considering signals CMD and Bank, the latter for addresses specifying a bank within a bank group, the first transaction is initiated by a read command RD addressed to an open row of bank Bank0 in bank group BG0. A column address is also included but is omitted from this example. Command decoder 120 decodes these command and address signals to provide the appropriate control signals on peripheral command bus CMD-Peri to enable Bank0 of bank group BG0 to prefetch a 256-bit column of data symbols, which appears as a shaded block 205 on 256-bit IOSA[255:0]BG0. The multiplexer 110 in bank group BG0 presents these data as two halves, each 128-bit-wide data on a respective one of peripheral data buses IO-Peri[255:128] and IO-Peri[127:0]. Serializers 130 then convert these halves into eight-bit-wide, sixteen-bit bursts for transmission over respective data ports DQ[15:8] and DQ[7:0]. A second read command begins after a time tCCD_S to initiate a read to Bank1 of a different bank group BG1. Time tCCD_S, for "Command to Command Delay, Short," is a timing parameter that defines the minimum interval required between two successive commands (read or write) to different bank groups. The terminal "S", for "short," distinguishes this interval from a longer column-to-column delay (tCCD_F of Figure 4) that memory device 100 employs for fractionalized data ports.
[0017] Figure 3 depicts memory device 100, introduced in Figure 1, configured in a dual-port mode such that port 105 is fractionalized into two eight-bit ports DQ[15:8] and DQ[7:0] that communicate data signals in thirty-two-bit bursts. Each port can both read and write data as directed by the same command port CMD. This illustration highlights separate read and write data paths using bold lines, a read path from bank Bank0 of bank group BG0 and a write path to bank Bank2 of bank group BG3. In this dual-port mode, mux 110 in bank group BG0 shifts out 256-bit data onto peripheral IO bus IO-Peri[255:128] as two successive 128-bit blocks to account for the halving of the number of traces. The successive blocks double the data burst length to account for the width reduction, and therefore maintain the same access granularity as in the full-width mode. Mux 110 in bank group BG3, operating as a demultiplexer, performs the opposite task of combining two successive 128-bit blocks into a single 256-bit block for the selected sense amplifier IOSA2. The latch 107 associated with sense amplifier IOSA2 can latch write data to relieve write timing constraints.
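The complementary roles of muxes 110 in the dual-port mode, shifting a 256-bit column out as two 128-bit blocks on the read path and recombining two blocks on the write path, can be sketched as follows. The function names are hypothetical, and the choice of sending the least-significant half first is an assumption.

```python
BLOCK_BITS = 128
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def mux_shift_out(column: int) -> list[int]:
    """Read direction: present a 256-bit column as two successive 128-bit blocks."""
    if column < 0 or column >> (2 * BLOCK_BITS):
        raise ValueError("column must fit in 256 bits")
    # Assumed split: least-significant half first, most-significant half second.
    return [column & BLOCK_MASK, column >> BLOCK_BITS]


def demux_combine(blocks: list[int]) -> int:
    """Write direction: merge two successive 128-bit blocks into one 256-bit column."""
    first, second = blocks
    if not (0 <= first <= BLOCK_MASK and 0 <= second <= BLOCK_MASK):
        raise ValueError("each block must fit in 128 bits")
    return (second << BLOCK_BITS) | first


column = (0xA5 << 248) | 0x3C   # arbitrary 256-bit test pattern
blocks = mux_shift_out(column)
```

Because `demux_combine(mux_shift_out(x)) == x` for any 256-bit value, the same 256 bits are moved per transaction whether one sixteen-bit port or one eight-bit port carries them.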
[0018] Figure 4 is a timing diagram 400 illustrating interleaved read and write transactions in the dual-port mode illustrated in Figure 3. Data port 105 is split into two independent half-width ports, a read port DQ[15:8] and a write port DQ[7:0]. The same command bus CMD services both of these ports. Commands directed to different ports can be issued back-to-back, as illustrated by the leftmost read and write commands RD and WR to banks Bank0 and Bank2. Commands directed to the same port are separated by a minimum command-to-command interval of tCCD_F, which is twice as long as interval tCCD_S in the full-width mode. The burst length of data on ports DQ[15:8] and DQ[7:0] is doubled to thirty-two bits so the access granularity is unaffected by the halving of the data pads relative to the full-width mode of Figure 1. Considering signals CMD and Bank, the first transaction is initiated by a read command RD addressed to a bank Bank0 (of bank group BG0 in the highlighted read path of Figure 3). Command decoder 120 decodes these command and address signals to provide the appropriate control signals on peripheral command bus CMD-Peri to enable Bank0 of bank group BG0 to prefetch a 256-bit column of data, which appears as a shaded block 405 on 256-bit IOSA[255:0](BG0). The multiplexer 110 in bank group BG0 presents these data as consecutive sets of 128-bit-wide data 410A and 410B, shaded to match block 405, on peripheral data bus IO-Peri[255:128]. Latch 107 (Figure 1) holds at least the latter of the early and late portions 410A and 410B of the data burst on bus IO-Peri[255:128] so the respective IOSA can begin servicing a subsequent transaction. Blocks 410A and 410B can be any exclusive 128-bit subsets of block 405 (e.g., every other bit or the most- and least-significant 128 bits). The serializer 130 connected to IO-Peri[255:128] then converts data 410A and 410B into an eight-bit-wide, thirty-two-bit burst 415, shaded to match blocks 405 and 410A/B, for transmission over data port DQ[15:8]. Two other read commands RD are illustrated using like-shaded collections of data blocks on the respective signal nodes.
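The command-spacing rule for the dual-port mode (back-to-back issue to different ports, at least tCCD_F between commands to the same port) can be illustrated with a simple in-order scheduler. This is a sketch under stated assumptions: the cycle counts for `TCCD_S` and `CMD_SLOT` are hypothetical placeholders, and `schedule` is not part of the described device or host.

```python
TCCD_S = 8           # hypothetical value, in clock cycles
TCCD_F = 2 * TCCD_S  # per the description, twice tCCD_S
CMD_SLOT = 1         # hypothetical: cycles one command occupies on the CMD bus


def schedule(commands: list[tuple[str, str]]) -> list[tuple[int, str, str]]:
    """Assign the earliest legal issue cycle to each (name, port) command, in order."""
    last_issue: dict[str, int] = {}
    bus_free = 0
    timeline = []
    for name, port in commands:
        earliest = bus_free
        if port in last_issue:
            # Same-port commands must be at least tCCD_F apart.
            earliest = max(earliest, last_issue[port] + TCCD_F)
        timeline.append((earliest, name, port))
        last_issue[port] = earliest
        bus_free = earliest + CMD_SLOT
    return timeline


timeline = schedule([
    ("RD", "DQ[15:8]"),
    ("WR", "DQ[7:0]"),
    ("RD", "DQ[15:8]"),
    ("WR", "DQ[7:0]"),
])
```

With these placeholder values the first read and write issue in adjacent command slots, while the second read waits a full tCCD_F after the first, which mirrors the interleaving shown for commands RD and WR in Figure 4.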
[0019] The second transaction is initiated by a write command WR addressed to bank Bank2 of bank group BG3, again as illustrated in Figure 3. A thirty-two-bit burst of write data 420 follows the write command, this by-eight port conveying 32 × 8 = 256 bits of write data. The deserializer 135 associated with data port DQ[7:0] converts these data into consecutive sets of 128-bit-wide data 425A and 425B, shaded to match block 415, on peripheral data bus IO-Peri[127:0]. Command decoder 120 decodes the write command to provide the appropriate control signals on peripheral command bus CMD-Peri to enable the mux 110 in bank group BG3 to convert data 425A and 425B into 256-bit-wide data 430 for storage in bank Bank2. Two other write commands WR are illustrated using like-shaded collections of data blocks on the respective signal nodes.
[0020] Figure 5 depicts a memory system 500 in accordance with an embodiment in which a host (e.g. a memory controller) 505 dynamically adjusts the width fraction of a data channel 510 to a memory 100 of the type introduced in Figure 1. Host 505 includes command, timing, and data circuitry connected to command port CMD, strobe port DQS[1:0], and data ports DQ[15:8] and DQ[7:0], which connect to the like-identified ports on memory die 100 via a memory channel 510.
[0021] The data circuitry, at bottom, includes two data queues 515 and 520 that connect via a pair of multiplexer/demultiplexers 525 and 530 to ports DQ[15:8] and DQ[7:0]. In the full-width, single-port mode, data queue 515 conveys internal data signals DQiA to all sixteen data lanes DQ[15:0] via both muxes 525 and 530 under control of a selection signal DMux. In the dual-port mode, data queue 515 communicates internal read and write data signals DQiA with data lanes DQ[15:8] via mux 525 or lanes DQ[7:0] via mux 530. Data queue 520 likewise communicates a second set of internal data signals DQiB with the remaining eight data lanes via the other of muxes 525 and 530. Data-signal timing is referenced with respect to strobe signal DQS[1:0] in both the read and write directions and under control of timing circuitry 535. Other circuitry is timed to an external clock signal CLK, the function of which is well understood by those of skill in the art.
[0022] The control circuitry, at top, conveys internal command and address signals CMDi to command port CMD using command paths directed to dual-port control logic 540. In the full-width mode, a mux 545 directs commands CMDi accumulated in a command queue 550 before a mux 555 under control of dual-port control logic 540 allows scheduling logic 560 to issue commands CMD. In the dual-port mode, commands CMDi accumulate in two command queues, queue 550 as before plus a second command queue 565. A second instance of scheduling logic 570 is also included. This configuration allows for commands to be scheduled separately for the two half-width ports DQ[15:8] and DQ[7:0]. In an alternative embodiment, a single command queue is used and commands are flagged to distinguish between the ports. Control logic 540 uses e.g. a command traffic profile, such as the ratio of read to write commands, queue depths, and latency requirements, to determine when to switch between dual- and single-port modes. Host 505 then issues access commands that dynamically configure memory die 100 to manage each read and write transaction with a selected full-width or fractionalized port. Alternatively, host 505 can issue separate commands to load mode register 150 on memory die 100 as necessary to accommodate mode changes via a separate command on port CMD or using a separate communication channel (not shown).
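One way control logic 540 might act on a command traffic profile is sketched below. The description names the inputs (read/write ratio, queue depths, latency requirements) but not a decision rule, so the function `select_mode`, its threshold, and the use of command counts alone are all assumptions made for illustration.

```python
def select_mode(reads: int, writes: int, write_share_threshold: float = 0.25) -> str:
    """Pick a port mode from recent command counts.

    Hypothetical policy: stay full-width while writes are rare, and switch to
    dual-port once writes make up at least `write_share_threshold` of traffic.
    """
    if reads < 0 or writes < 0:
        raise ValueError("command counts must be non-negative")
    total = reads + writes
    if total == 0:
        return "full-width"  # no traffic observed: keep the default mode
    return "dual-port" if writes / total >= write_share_threshold else "full-width"
```

For example, `select_mode(100, 5)` keeps the full-width mode, while `select_mode(60, 40)` selects the dual-port mode. A fuller policy would also weigh queue depths and latency targets, which this sketch leaves out.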
[0023] In a typical computer memory hierarchy, memory device 100 might be a DRAM die that sits in a memory hierarchy between a processor’s internal caches and larger, slower devices such as hard drives or solid-state drives (SSDs) used for longer-term storage. Every cache entry is read from device 100, but only changed cache entries are written back. Memory device 100 thus tends to perform more read transactions than write transactions. Moreover, many applications spend more time reading data than modifying or writing data. For instance, a user browsing the Internet may download and read much more content than they upload. Applications like streaming services, multimedia players, or document readers are more often retrieving and presenting data rather than writing new data. Memory system 500 can optimize overall performance by adjusting the ratio of write bandwidth to read bandwidth responsive to relative measures of read and write traffic. If write transactions are relatively rare, for example, then host 505 can maintain memory die 100 in the full-width mode for maximum read bandwidth, only occasionally making half of the data links available for write traffic. Read transactions can thus proceed, albeit at a reduced bandwidth, when memory system 500 services the occasional write transaction. A low read latency can thus be maintained.
[0024] Write transactions need not be rare, or even rare relative to read transactions, for host 505 to optimize channel 510 for whatever the ratio of read and write transactions; the two half-width channels can be used in support of independent reads or writes; and the fractionalized channels are not limited to two or to the same data widths in other embodiments. For example, the eight-bit data paths that service ports DQ[15:8] and DQ[7:0] in memory device 100 could further be divided into four-bit data paths. Muxes 110 would be modified to support a four-wide mode in which data is communicated in 64-bit bursts (four times the burst length used in the full-width mode) to match the 256-bit access granularity offered by the eight-bit and sixteen-bit widths. Interval tCCD_S would likewise be extended by a factor of four.
[0025] Figure 6 is a timing diagram 600 illustrating how memory system 500 of Figure 5 can dynamically switch from a full-width mode to one of two fractionalized modes. The full-width mode conveys read and write data over all sixteen lanes DQ[15:0]. In the fractional mode, one group of eight lanes is used for read data and the other for write data. Each read and write command includes two bits (not shown) that designate the mode to memory device 100.
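The scaling described in paragraph [0024] for narrower ports follows directly from holding the access granularity at 256 bits. The sketch below works through that arithmetic for sixteen-, eight-, and four-bit ports; the function name and the returned field names are hypothetical.

```python
ACCESS_GRANULARITY_BITS = 256
FULL_WIDTH = 16  # data lanes in the undivided port


def fractional_parameters(port_width: int) -> dict[str, int]:
    """Derive burst length and command-spacing multiple for a given port width."""
    if port_width <= 0 or FULL_WIDTH % port_width:
        raise ValueError("port width must evenly divide the full width")
    scale = FULL_WIDTH // port_width
    return {
        "ports": scale,                                        # independent ports
        "burst_length": ACCESS_GRANULARITY_BITS // port_width,  # beats per burst
        "tccd_multiple": scale,                                # multiple of tCCD_S
    }


table = {width: fractional_parameters(width) for width in (16, 8, 4)}
```

For a four-bit port this yields four ports, a burst length of 64, and a same-port command interval four times tCCD_S, matching the values stated for the four-wide mode.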
[0026] The first command RF is a read command at full width (RF=read, full width) directed to a bank Bank0 of bank group BG0. In the manner described in Figure 2, memory device 100 issues eight-bit-wide, sixteen-bit bursts for transmission over respective data ports DQ[15:8] and DQ[7:0]. The next read command is labeled RL (read, lower channel), which instructs memory device 100 to deliver the requested 256 bits of read data 605 on low-order data lanes DQ[7:0]. The multiplexer 110 in bank group BG1 (Figure 1) presents these data as consecutive sets of 128-bit-wide data 610A and 610B, shaded to match block 605, on peripheral data bus IO-Peri[127:0]. The data latch 107 in the selected bank holds the prefetched data in the IOSA while that data is transmitted on IO-Peri[127:0] to extend the burst length. This extension allows the IOSA to begin prefetching data responsive to the next command before the extended burst is read out, and thus allows the IOSA to make the next read data 620 from the same bank available in IO-Peri[127:0] (data 625) after time tCCD_L. The inclusion of latches 107 thus enables back-to-back, same-bank bursts. The serializer 130 connected to IO-Peri[127:0] then converts data 610A and 610B into an eight-bit-wide, thirty-two-bit burst, shaded to match blocks 605 and 610A/B, for transmission over data port DQ[7:0]. Latches 107 are omitted in other embodiments, or a single latch can service a bank group.
[0027] Read command RL, by electing to use only eight lanes DQ[7:0], leaves upper (high-order) lanes DQ[15:8] available for a subsequent write command WH (write, high-order), which instructs memory device 100 to accept 256 bits of write data as an eight-bit-wide, thirty-two-bit burst 615 on data lanes DQ[15:8]. A turn-around time 625 separates the last read data from the incoming write data. A subsequent read command RL produces read data 620 on lanes DQ[7:0]. Still more commands can switch memory device 100 back to the full width mode, in service of read or write transactions, or can reverse the roles of the high- and low-order DQ lanes so that write transactions can occur on lanes DQ[7:0].
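The per-command lane selection of Figure 6 can be sketched as a small decoder. The mnemonics RF, RL, and WH come from the description; the two-bit values assigned to each lane selection are not shown in the source and are purely assumed here, as is the `decode` helper itself.

```python
from enum import Enum


class Lanes(Enum):
    # Assumed two-bit encodings; the description does not show the mode bits.
    FULL = 0b00
    LOW = 0b01
    HIGH = 0b10


_SUFFIX = {"F": Lanes.FULL, "L": Lanes.LOW, "H": Lanes.HIGH}
_DQ = {Lanes.FULL: "DQ[15:0]", Lanes.LOW: "DQ[7:0]", Lanes.HIGH: "DQ[15:8]"}
_OPS = {"R": "read", "W": "write"}


def decode(mnemonic: str) -> dict[str, object]:
    """Decode a two-letter command such as 'RF', 'RL', or 'WH'."""
    if len(mnemonic) != 2 or mnemonic[0] not in _OPS or mnemonic[1] not in _SUFFIX:
        raise ValueError(f"unknown command: {mnemonic!r}")
    lanes = _SUFFIX[mnemonic[1]]
    width = 16 if lanes is Lanes.FULL else 8
    return {
        "op": _OPS[mnemonic[0]],
        "lanes": _DQ[lanes],
        "mode_bits": lanes.value,
        "burst_length": 256 // width,  # keeps the access at 256 bits
    }
```

Decoding the Figure 6 sequence RF, RL, WH with this sketch gives a sixteen-beat read on DQ[15:0], a thirty-two-beat read on DQ[7:0], and a thirty-two-beat write on DQ[15:8].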
[0028] In the embodiment of Figures 5 and 6, data ports DQ[15:8] and DQ[7:0] can simultaneously communicate fractionalized data between host controller 505 and memory die 100 in the same or opposite directions across associated fractions of channel 510. In other embodiments, these subchannels can simultaneously service different hosts, conveying data in the same or opposite directions, while satisfying the same or different host requirements for e.g. quality of service.
[0029] While the invention has been described with reference to specific embodiments thereof, it will be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. For example, features or aspects of any of the embodiments may be applied, at least where practicable, in combination with any other of the embodiments or in place of counterpart features or aspects thereof. Moreover, some components are shown directly connected to one another while others are shown connected via intermediate components. In each instance the method of interconnection, or “coupling,” establishes some desired electrical communication between two or more circuit nodes, or terminals. Such coupling may often be accomplished using a number of circuit configurations, as will be understood by those of skill in the art. Therefore, the spirit and scope of the appended claims should not be limited to the foregoing description. Only those claims specifically reciting “means for” or “step for” should be construed in the manner required under the sixth paragraph of 35 U.S.C. § 112.

Claims

What is claimed is:
1. A random-access memory comprising: a memory bank to store data, the memory bank including a memory-bank port to communicate the data in parallel as data symbols; and a memory interface coupled to the memory-bank port, the memory interface including: a multiplexer coupled to the memory-bank port to communicate the data symbols in parallel; a first data path extending from the multiplexer to a first data port; and a second data path extending from the multiplexer to a second data port; the multiplexer to communicate the data symbols at a first data width and a first burst length over the first and second data paths in parallel in a first mode and at a second data width and a second burst length over one of the first and second data paths in a second mode, wherein the product of the first data width and the first burst length equals the product of the second data width and the second burst length.
2. The memory of claim 1, further comprising a command decoder communicatively coupled to the multiplexer, the command decoder to decode commands and to switch the multiplexer between the first mode and the second mode responsive to the commands.
3. The memory of claim 2, further comprising a mode register coupled to the command decoder, the mode register to store a value indicative of one of the first mode and the second mode.
4. The memory of claim 3, the command decoder to load the mode register responsive to host commands.
5. The memory of claim 1, each memory bank including a latch to store at least a portion of the data over the second burst length.
6. The memory of claim 1, further comprising: a second memory bank to store second data, the second memory bank including a second memory-bank port to communicate the second data in parallel as second data symbols; the memory interface further including: a second multiplexer coupled to the second memory-bank port to communicate the second data symbols; the second multiplexer to communicate the second data symbols at the first data width and the first burst length over the first and second data paths in parallel in the first mode and at the second data width and the second burst length over one of the first and second data paths in the second mode.
7. The memory of claim 1, the first data path to communicate read data and the second data path to communicate write data in the second mode.
8. The memory of claim 7, wherein the first data path communicates the read data while the second data path communicates the write data.
9. The memory of claim 1, wherein each of the first data path and the second data path includes a serializer and a deserializer.
10. The memory of claim 1, wherein the first data width is twice the second data width.
11. A method of operation in a random-access memory, the method comprising: receiving a first command; responsive to the first command, communicating first data of a first data width and a first burst length via a first data port; receiving a second command; responsive to the second command, fractionalizing the first data port into first and second narrower data ports and communicating second data of a second data width and a second burst length via the first narrower data port; receiving a third command; and responsive to the third command, communicating third data of a third data width and a third burst length via the second narrower data port while communicating the second data via the first narrower data port; wherein the product of the first data width and the first burst length equals the product of the second data width and the second burst length and the product of the third data width and the third burst length.
12. The method of claim 11, wherein the second burst length includes an early portion and a late portion, the method further comprising latching a fraction of the second data and communicating the fraction of the second data as the late portion.
13. The method of claim 12, wherein the fraction is half.
14. The method of claim 11, further comprising reading the second data from the memory and writing the third data to the memory.
15. The method of claim 11, wherein the second and third data widths are equal.
16. The method of claim 11, further comprising loading a mode register responsive to the second command, the mode register storing a value to fractionalize the first data port.
17. The method of claim 16, further comprising loading the mode register responsive to the first and third commands.
18. A method of controlling a random-access memory by a memory controller, the method comprising: sending a first command; receiving, responsive to the first command, first read data of a first data width and a first burst length via a wide data port; sending a second command; receiving, responsive to the second command, second read data of a second data width and a second burst length via a first narrower data port; sending a third command; and sending, in association with the third command, write data of a third data width and a third burst length via a second narrower data port while receiving the second read data via the first narrower data port; wherein the product of the first data width and the first burst length equals the product of the second data width and the second burst length and the product of the third data width and the third burst length.
19. The method of claim 18, wherein the second data width equals the third data width.
20. The method of claim 18, further comprising monitoring numbers of read and write commands and selecting between the wide data port and the first and second narrower data ports responsive to the numbers.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363598632P 2023-11-14 2023-11-14
US63/598,632 2023-11-14

Publications (1)

Publication Number Publication Date
WO2025106273A1 true WO2025106273A1 (en) 2025-05-22

Family

ID=95743339

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/054013 Pending WO2025106273A1 (en) 2023-11-14 2024-10-31 Dynamically configurable multi-ported dram

Country Status (1)

Country Link
WO (1) WO2025106273A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN121237143A (en) * 2025-12-03 2025-12-30 西安芯存半导体有限公司 Pseudo-static random access memory and its access methods, I/O interfaces, electronic devices

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220382691A1 (en) * 2016-11-16 2022-12-01 Rambus Inc. Multi-Mode Memory Module and Memory Component

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24892016

Country of ref document: EP

Kind code of ref document: A1