US11494308B2 - Methods and devices for bypassing the internal cache of an advanced DRAM memory controller - Google Patents


Info

Publication number
US11494308B2
US11494308B2 (application US16/331,429)
Authority
US
United States
Prior art keywords
memory
address
addresses
cache
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/331,429
Other languages
English (en)
Other versions
US20210349826A1 (en)
Inventor
Jean-François Roy
Fabrice Devaux
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Upmem SAS
Original Assignee
Upmem SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Upmem SAS
Assigned to UPMEM. Assignment of assignors' interest (see document for details). Assignors: DEVAUX, FABRICE; ROY, Jean-François
Publication of US20210349826A1
Application granted
Publication of US11494308B2
Legal status: Active (adjusted expiration)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0888 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, using selective caching, e.g. bypass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F 13/14 Handling requests for interconnection or transfer
    • G06F 13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F 13/1668 Details of memory controller
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, for peripheral storage systems, e.g. disk cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F 2212/10 Providing a specific technical effect
    • G06F 2212/1016 Performance improvement

Definitions

  • This application concerns a computing system including a processor, a memory, and a control interface between the processor and the memory.
  • A PIM (Processor In Memory) processor is a processor integrated directly into a memory circuit, for example a DRAM circuit. In this document, such a memory circuit is called a PIM circuit.
  • A PIM processor is controlled by a main processor, typically an Intel, ARM or Power processor.
  • In this document, this main processor is called the HCPU (Host CPU).
  • Both the PIM processor and the HCPU have access to the memory in which the PIM processor is integrated.
  • In this document, this memory is called the PIM memory.
  • Each PIM processor has registers that allow the HCPU to control it. These registers, accessible by the HCPU, are visible in the physical address space of the PIM circuit. In this document, they are called interface registers; the set of interface registers of a PIM processor is called the interface of that PIM processor, and the software running on the HCPU that controls this interface is called the interface software.
  • the HCPU typically performs the following actions in order to use a PIM processor:
  • An HCPU has a cache system, which may delay a write to the PIM circuit for an indeterminate time.
  • HCPU processors have cache management instructions that force a data item, specified by its address and so far written only into the processor's cache, to be updated in main memory.
  • However, such update instructions only ensure that written data is “pushed” to the DRAM controller, not that it is “pushed” all the way to the memory circuit.
  • The cache affected by the cache management instructions is named in this document the “CPU cache”; the cache not affected by these instructions is named the “DRAM cache”.
  • Delayed writes are also a problem for non-volatile memories that do not have a PIM processor.
  • This is the case, for example, of MRAM (magnetic RAM) memories, such as those of the EVERSPIN company, which have a DRAM-compatible interface allowing their use with a DRAM controller.
  • Another problem is that if a data value C1 to be written at a certain address is held in the DRAM cache, it can be replaced by the arrival of a new value C2 for the same address, the DRAM cache considering it optimal never to write C1 to the memory circuit, C1 being supplanted by C2.
  • The DRAM cache can also change the order in which writes are performed, which is problematic even when these writes target different addresses.
  • When the HCPU reads a value V2 at an address AV, this value having been generated by a PIM processor, it is important that the value actually read by the HCPU is this recent V2 value, and not a previous V1 value that was copied into a cache during a read performed, at this same address AV, before V2 was generated.
  • HCPU processors have cache management instructions that invalidate a data item, specified by its address, in the cache.
  • After such an invalidation, the value V1 is no longer present in the CPU cache; if the HCPU reads again at address AV, it will obtain the V2 value from the memory circuit, and the cache will then cache this V2 value.
  • However, an invalidation instruction targeting address AV guarantees only that the V1 value is no longer present in the CPU cache; it does not guarantee that V1 is not still present in the DRAM cache.
  • In that case, the DRAM controller will simply return the stale V1 value from its DRAM cache.
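The stale-read problem above can be illustrated with a toy two-level model (a sketch only; the class names, addresses and values are illustrative, not from the patent): invalidating the CPU cache alone is not enough, because the DRAM cache inside the memory controller still answers with the old value.

```python
class CacheLevel:
    """Toy read cache with an explicit next level (illustrative)."""
    def __init__(self, next_level):
        self.next_level = next_level
        self.lines = {}                      # address -> cached value

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch from the next level
            self.lines[addr] = self.next_level.read(addr)
        return self.lines[addr]

    def invalidate(self, addr):
        self.lines.pop(addr, None)

class Memory:
    def __init__(self):
        self.words = {}
    def read(self, addr):
        return self.words.get(addr, 0)
    def write(self, addr, value):
        self.words[addr] = value

memory = Memory()
dram_cache = CacheLevel(memory)     # inside the controller: no HCPU instruction touches it
cpu_cache = CacheLevel(dram_cache)  # affected by cache management instructions

AV = 0x1000
memory.write(AV, "V1")
cpu_cache.read(AV)                  # V1 is now cached at both levels

memory.write(AV, "V2")              # the PIM processor updates the memory array
cpu_cache.invalidate(AV)            # the HCPU invalidates only its CPU cache
stale = cpu_cache.read(AV)          # the DRAM cache still serves the old V1
```

Here the re-read after invalidation still yields V1, exactly the failure the address-plurality mechanism described below is designed to avoid.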
  • In a conventional system, a data item in the CPU cache is always the most recent, because only the HCPU modifies the memory; in a system with a PIM processor, the PIM processor can also modify the memory.
  • An exemplary embodiment provides a calculation system including: a calculation device having one or several instruction-controlled processing cores and a memory controller, the memory controller comprising a cache memory; and a memory circuit coupled to the memory controller via a data bus and an address bus, the memory circuit being adapted to have a first m-bit memory location accessible by a plurality of first addresses provided on the address bus, the calculation device being configured to select, for each memory operation accessing the first m-bit memory location, an address from the plurality of first addresses.
  • the first m-bit memory location is accessible by a plurality of P first addresses, the calculation device being configured to use one of the first addresses to access the first memory location during both the Nth and the (N+P)th operation accessing the first memory location.
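This selection rule can be sketched as follows (all constants are illustrative assumptions): if a location has P alias addresses, cycling through them round-robin means the address used for access N is reused exactly at access N+P, so the same bus address never repeats within any window of P consecutive accesses.

```python
class AliasedLocation:
    """Round-robin selection among the P alias addresses of one memory
    location (a sketch; base address and stride are assumptions)."""
    def __init__(self, base_addr, stride, p):
        self.aliases = [base_addr + k * stride for k in range(p)]
        self.n = 0                            # access counter N

    def next_address(self):
        addr = self.aliases[self.n % len(self.aliases)]
        self.n += 1
        return addr

loc = AliasedLocation(base_addr=0x4000_0000, stride=0x1000, p=4)
round1 = [loc.next_address() for _ in range(4)]   # accesses N = 0..3
round2 = [loc.next_address() for _ in range(4)]   # accesses N+P = 4..7
```

Because every bus address in a window of P accesses is distinct, the DRAM cache sees each access as new and cannot serve it from a previously cached line.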
  • each address of the plurality of first addresses includes a first value of n bits and a second value of p bits,
  • the calculation device being configured to carry out a data write operation to the m bits of the first memory location by performing a read operation on the first memory location, using one of the first addresses with a selected first n-bit value and a second p-bit value generated according to the data to be written.
  • the memory circuit is adapted, in response to receiving a read operation on its first memory location using one of the first addresses, to write the second p-bit value of that address into its first memory location.
  • p and n are integers, and n is equal to or greater than p.
  • the memory circuit is adapted to have a second memory location accessible through a plurality of second addresses provided on the address bus.
  • the first and second memory locations are part of a first memory location range of the memory circuit, the first memory location range being selected by a sliding address window, in which the memory locations of the first memory location range are addressable:
  • the memory circuit including an address conversion circuit adapted to convert addresses in the first and second range of addresses to corresponding addresses in the sliding address window.
  • the address conversion circuit includes at least one programmable register to define the location of the sliding address window.
  • said at least one address conversion circuit register is programmable to define the location and size of the sliding address window.
  • the memory controller is adapted to perform a cache eviction operation, the cache eviction operation including one or several memory access instruction sequences performed by the memory controller with the following result:
  • the read data stored in the cache memory of the memory controller is removed from the cache, the read data comprising the data read from the memory circuit before the cache eviction operation;
  • the memory circuit further includes an auxiliary processor, and the sequence of memory access instructions includes only register access instructions to access one or more control registers of the memory circuit to control the auxiliary processor.
  • the memory circuit includes a monitoring circuit, accessible by the calculation device, and adapted to record memory access transactions performed in the memory circuit, the calculation device being configured to generate said one or more memory access instruction sequences based on transactions recorded by the monitoring circuit.
  • the memory circuit further includes an auxiliary processor, the first and second memory locations being auxiliary processor control registers.
  • the calculation device is configured to generate commands of a first type and of a second type
  • the memory circuit being adapted to modify the order of the commands received from the calculation device in such a way that, for a group of commands of the second type generated by the calculation device between a first and a second command of the first type, the order of the first and second commands of the first type relative to the group of commands of the second type is respected.
  • the memory circuit is adapted to modify the order of the commands based on an order value associated with at least each command of the first type, the order value of each command being included
  • the calculation device further includes a CPU cache memory that can be configured by cache management instructions, while the cache memory of the memory controller is not configurable by cache management instructions.
  • the memory circuit includes a non-volatile memory matrix.
  • Another embodiment provides for an access process to a memory circuit coupled with a memory controller of a calculation device via a data bus and an address bus, the calculation device having one or several processing cores and the memory controller comprising a cache memory, the process comprising selecting, by the calculation device, for each memory operation accessing a first m-bit memory location of the memory circuit, one from a plurality of first addresses, the first m-bit memory location being accessible by each of the plurality of first addresses provided on the address bus.
  • a system composed of a main circuit and at least one memory circuit; the main circuit comprising at least one main processor and a memory controller connected to the memory circuit; the memory controller comprising a cache which is not affected by the cache management instructions of the main processor; the memory circuit including at least one auxiliary processor; this auxiliary processor comprising an interface that is accessible to the main processor; this interface including registers, each interface register being accessible by the main processor through a plurality of addresses; the interface being controlled by software running on the main processor, the software choosing for each access to a given register of the interface, one address from the plurality of addresses corresponding to the given register.
  • the choice of the address used to access a given interface register is made in such a way that an address used during the Nth access to this register will be used again during the (N+P)th access to this register, P being the number of addresses in the plurality of addresses associated with this register.
  • the access address of at least one interface register is built by assembling a first n-bit field, called the major field, with a second p-bit field, called the minor field, where the value of the major field is chosen among a plurality of values, and where the value of the minor field may be any value between 0 and 2^p − 1; reading the interface register at the address {major field, minor field} causes it to be written with the minor field value, the software using such reads to write values into the interface register.
  • the possible values of the minor field are restricted to values which can be written to the interface register.
  • The advantage of using, during a read operation, an address field to carry bits to be written into the memory is that a read operation is not liable to be held back in the cache memory, as a write operation could be.
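A sketch of this {major, minor} addressing (the field width and values are illustrative assumptions): the p low address bits of a read carry the data to be written, and the memory side stores them on seeing the read, so the "write" never waits in the DRAM cache the way a genuine write transaction could.

```python
P_BITS = 8                                   # width p of the minor field (assumed)

def read_address(major, minor):
    """Assemble the bus address {major field, minor field}."""
    assert 0 <= minor < (1 << P_BITS)
    return (major << P_BITS) | minor

class InterfaceRegister:
    """Memory-circuit side: a read at {major, minor} writes `minor`."""
    def __init__(self):
        self.value = 0

    def handle_read(self, address):
        self.value = address & ((1 << P_BITS) - 1)   # store the minor field
        return self.value                            # value returned on the bus

reg = InterfaceRegister()
addr = read_address(major=0x7, minor=0x5A)   # distinct majors give distinct bus addresses
reg.handle_read(addr)                        # this *read* performs the write
```

Varying the major field among its plurality of values keeps successive writes at distinct bus addresses, combining this mechanism with the alias-rotation rule above.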
  • interface registers allow the position, and if necessary the size, of an additional access window onto a memory area of the memory circuit to be configured, this access window being accessible through a plurality of address ranges; the interface software provides access to the memory of the memory circuit by positioning the access window on the memory area concerned, and chooses the access addresses among the plurality of address ranges giving access to that window.
  • the interface software selects the addresses used to access the window in such a way that, if the address of the Nth access to the window is chosen from a given address range, the address of the (N+P)th access will be selected from the same address range, P being the number of address ranges in the plurality of address ranges.
  • the software controlling the interface uses an access sequence, chosen from a set of predetermined access sequences, to remove from the memory controller's cache the write transactions issued prior to this sequence, thus forcing the effective execution of these write transactions; the predetermined access sequences are determined from characteristics of the memory controller's cache that are either known or deduced from observation.
  • the software controlling the interface uses an access sequence, chosen from a set of predetermined access sequences, to remove from the memory controller's cache the data read prior to this sequence, the predetermined sequences being determined from characteristics of the memory controller's cache that are either known or inferred from observation.
  • the software controlling the interface uses an access sequence, chosen from a set of predetermined access sequences, to remove from the memory controller's cache both the write transactions and the data read prior to this sequence, the predetermined sequences being determined from characteristics of the memory controller's cache that are either known or deduced from observation.
  • the access sequence is reduced in such a way that it only guarantees eviction from the memory controller's cache of the write transactions or read data corresponding to a subset of the physical addresses associated with the memory circuit.
  • the predetermined access sequences only include accesses to interface registers.
  • the interface includes a mechanism for recording the last transactions that reached the memory circuit, this recording mechanism being accessible by the main processor via the interface itself.
  • the software controlling the interface first uses this recording mechanism to automatically determine the predetermined access sequences.
  • the interface includes at least one command register able to receive commands from the HCPU, these commands being classified into strongly ordered commands and weakly ordered commands; the weakly ordered commands issued between two strongly ordered commands form a set of weakly ordered commands within which the commands may be executed out of order; the strongly ordered commands are executed in order relative to the other strongly ordered commands and relative to the sets of weakly ordered commands.
  • the interface includes at least one command register capable of receiving commands from the HCPU, these commands all being strongly ordered.
  • the commands are reordered using a number included in the commands themselves.
  • the commands are reordered using a number included in the command addresses.
  • the commands are reordered using numbers, part of each number being included in the command itself and the rest in the address of the command.
  • the auxiliary processor is not integrated in the memory circuit, but is embedded in a circuit connected to the memory circuit.
  • the memory is non-volatile.
  • FIG. 1 schematically shows a calculation system according to an embodiment;
  • FIG. 2 illustrates in more detail a memory interface of the system of FIG. 1, according to an embodiment;
  • FIG. 3 illustrates in more detail a memory interface of the system of FIG. 1, according to another embodiment.
  • the considerable size of the CPU cache (up to several tens of megabytes) means that the DRAM cache does not need to be large to deliver the essential part of the performance gains it can bring.
  • the HCPU is usually a high-performance processor, and thus capable of executing instructions out of order (an Out-Of-Order, or OOO, processor); in addition to cache management instructions, "memory barrier" instructions can be used to force the execution of instructions in an appropriate order.
  • an MB (memory barrier) instruction ensures that all accesses generated by instructions before the MB instruction are fully executed, from the CPU cache's point of view, before any access generated by an instruction after the MB instruction is performed.
  • the instruction set of an HCPU may include, for optimal performance, variations on this concept, for example barrier instructions for writes only or reads only.
  • "mapped register" means that the register is accessible at a physical address.
  • a register can be mapped several times, meaning that it is accessible at several different physical addresses.
  • FIG. 1 illustrates a calculation system including a processing device 102 coupled by a bus, for example of the DDR (Double Data Rate) type, to a memory circuit 104.
  • The bus includes, for example, a data bus 106A and an address bus 106B.
  • The device 102 includes, for example, one or more processing cores 108, a CPU (Central Processing Unit) cache 110, and a memory controller 112 including a cache 114.
  • The cache memory 114 is, for example, a DRAM cache in the case where the memory circuit 104 is a DRAM (Dynamic Random Access Memory) type memory.
  • The memory circuit 104 includes, for example, a memory 116, a processing device 118 and an interface 120.
  • The circuit 104 also includes, for example, an address translation circuit 122 comprising one or more registers 124, and a monitoring circuit 126.
  • each interface register is mapped a certain number of times, the number depending on the characteristics of the DRAM cache.
  • the interface of the PIM processor 118 can include as few as 3 directly accessible registers, which provide indirect access to a much larger number of registers:
  • the memory circuit 104 contains 2^N memory words but, to create many addresses, it is declared as having 2^(N+i) memory words, with i > 0.
  • the boot code (BIOS/boot firmware) and the operating system (OS) must take into account the actual size of the memory and not the declared size.
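This over-declaration can be sketched as follows (the sizes are tiny, illustrative values): the i extra address bits select one of 2^i aliases, and the memory circuit simply ignores them when decoding, so every real word answers to 2^i distinct bus addresses.

```python
N = 10        # the circuit really has 2**N words (illustrative)
I = 2         # it is declared as having 2**(N + I) words: 2**I aliases per word

def real_word(declared_addr):
    """The decoder drops the top I bits, so 2**I addresses hit each word."""
    return declared_addr & ((1 << N) - 1)

word = 0x2A5
aliases = [word | (k << N) for k in range(1 << I)]   # 4 distinct bus addresses
```

The aliases are distinct on the bus, which is what defeats the DRAM cache, yet all decode to the same physical word, which is why the boot firmware must be told the actual rather than the declared size.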
  • the interface software uses, for each access to a register (e.g. accesses 208 to register 202 and accesses 210 to register 204 in FIG. 2), a different address mapping it; consequently:
  • an interface register cannot be mapped an infinite number of times, so past a certain number of accesses to this register the set of addresses mapping it will be exhausted, and addresses already used will be used again.
  • the minimum size of the set of addresses mapping an interface register is of course a function of the size and characteristics of the DRAM cache.
  • the interface registers, instead of each being associated with a private current address, can use a common current address, governed at each access in the following way:
  • a weakly ordered class including, for example, data write commands, instructions and parameters
  • whether a command belongs to one class or the other is encoded in the command itself.
  • the ordering rules are as follows:
  • weakly ordered commands are not ordered with respect to one another:
  • the strongly ordered commands can be received out of order by the receiving command register, but they must nevertheless be executed in the order of their generation; for this purpose, strongly ordered commands are numbered when they are generated.
  • each strongly ordered command destined for a given command register includes an n-bit field used to number it.
  • an n-bit counter, called the current command number, containing the number of the strongly ordered command currently due to be executed
  • a buffer of 2^n entries, called the command buffer, each entry being:
  • the interface software can read the current command number in order to identify the last executed command, which makes it possible to know how many commands have been executed, and thus how many new strongly ordered commands can be generated without exceeding the capacity of the command buffer. The software can thus generate strongly ordered commands as the previous ones are executed.
  • part of the address where the command is written is used as the command number.
  • the command register is associated with the following hardware resources:
  • an n-bit counter, called the current command number, containing the number of the strongly ordered command currently due to be executed
  • a memory of M entries, called the command buffer, each entry being:
  • this method actually combines the two previous methods, the command number being made up partly of an address field and partly of a command field.
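The counter-plus-buffer mechanism can be sketched as follows (buffer size and commands are illustrative); wherever the n-bit number travels (in the command field, the address field, or split across both), the receiving side behaves the same way:

```python
class CommandRegister:
    """Executes strongly ordered commands in generation order even when
    they arrive out of order (toy model of the counter + command buffer)."""
    def __init__(self, n_bits):
        self.size = 1 << n_bits
        self.buffer = [None] * self.size   # one slot per command number
        self.current = 0                   # number of the next command to execute
        self.executed = []

    def receive(self, number, command):
        self.buffer[number % self.size] = command
        # Execute every command now contiguous with the current number.
        while self.buffer[self.current] is not None:
            self.executed.append(self.buffer[self.current])
            self.buffer[self.current] = None
            self.current = (self.current + 1) % self.size

reg = CommandRegister(n_bits=3)
for number, command in [(1, "b"), (0, "a"), (3, "d"), (2, "c")]:
    reg.receive(number, command)           # arrival order differs from generation order
```

Despite the scrambled arrival order, execution follows the generation numbers; the software reads back `current` to learn how many slots have freed up before issuing more commands.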
  • a FIFO may be used to store the weakly ordered commands, in case the rate of their arrival exceeds the rate of their execution.
  • the DRAM cache may have logic capable of detecting such an access pattern and deciding to read data ahead of time based on it.
  • for this reason, the interface of the PIM processor does not include registers that are modified by their reading.
  • a solution is to use an access window 305, called the sliding window, whose location 304 in the PIM memory 302 (and possibly whose size) can be configured via interface registers, this sliding window being mapped many times in a large range of physical addresses 306 called the multiple window.
  • the physical addresses of the PIM circuit could be organized as follows:
  • the physical address space conventionally mapping the PIM memory can be used by the HCPU to access the areas of the PIM memory that are not accessed by the PIM processor.
  • the PIM memory area on which the sliding window is positioned may or may not remain accessible through the conventional physical address space of the PIM memory.
  • an access in the sliding window (for example, accesses 308, 310 and 312 in FIG. 3) is then translated into an access to the PIM memory area on which the sliding window is currently positioned (for example, accesses 308′, 310′ and 312′ in FIG. 3).
  • the sliding window is accessible via a plurality of physical address ranges, all these physical address ranges constituting the multiple window.
  • the sliding window is such that it is entirely contained in one DRAM page, the position of the sliding window possibly being expressed as a pair {x, y}:
  • the interface software can now use, to access the PIM memory currently targeted by the sliding window, all of the solutions described above for accessing the interface registers, including the use of a common current address.
  • the position of the sliding window can be expressed as a pair {x, y}:
  • the sliding window is thus associated with page x of the PIM memory, this association being programmable via interface registers.
  • decoding this case is simple, because it suffices to examine very few bits of a page number to determine whether that page belongs to the multiple window.
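That decoding can be sketched as follows (all field widths and the tag bit are illustrative assumptions, not from the patent): one high bit of the page number marks the multiple window, the remaining page bits serve as alias bits, and the in-page offset y is kept while the page number is replaced by the programmed page x.

```python
PAGE_BITS = 13       # offset bits y within a DRAM page (assumed)
PAGE_NO_BITS = 8     # page-number width in this toy model (assumed)

def decode(phys_addr, window_page_x):
    """Redirect any multiple-window alias to page x of the PIM memory."""
    page = (phys_addr >> PAGE_BITS) & ((1 << PAGE_NO_BITS) - 1)
    offset = phys_addr & ((1 << PAGE_BITS) - 1)
    if page >> (PAGE_NO_BITS - 1):         # single-bit test: multiple window?
        return (window_page_x << PAGE_BITS) | offset
    return phys_addr                       # conventional mapping is untouched

# Two different aliases of the same offset both land on page x:
a1 = decode((0x80 << PAGE_BITS) | 0x123, window_page_x=3)
a2 = decode((0x9C << PAGE_BITS) | 0x123, window_page_x=3)
```

Because only one bit of the page number is examined, the extra decoding cost on the critical address path is minimal, which is the point the text makes.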
  • the sliding-window mechanism may be designed to slightly extend the latency of an activation operation, this extension value being programmable in the memory controller.
  • predetermined access sequences are used to fill the DRAM cache with transactions of no importance:
  • DCW_BARRIER (DRAM Cache Write Barrier), a write barrier for DRAM caches: it ensures that all writes made before the start of DCW_BARRIER are effective (visible to the PIM processor) by the end of DCW_BARRIER.
  • DCR_BARRIER (DRAM Cache Read Barrier), a read barrier for DRAM caches: it ensures that all data read after the end of DCR_BARRIER is more recent than the date at which DCR_BARRIER was started.
  • DCM_BARRIER (DRAM Cache Memory Barrier), combining the effects of the write and read barriers.
  • some DRAM cache architectures may allow the DCW_BARRIER, DCR_BARRIER and DCM_BARRIER sequences, and thus their execution time, to be reduced when the effect of these barriers need only apply to an address range specified as a parameter.
  • the following BARRIER access sequence variants may be used:
  • DCW_BARRIER(start_addr, end_addr) ensures that all writes performed before the start of DCW_BARRIER(start_addr, end_addr) in the address range {start_addr, end_addr} are effective by the end of DCW_BARRIER(start_addr, end_addr).
  • DCR_BARRIER(start_addr, end_addr) ensures that all values read in the address range {start_addr, end_addr} after the end of DCR_BARRIER(start_addr, end_addr) are more recent than the date at which DCR_BARRIER(start_addr, end_addr) was started.
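As a sketch only: for an assumed W-way set-associative DRAM cache with known geometry, such a barrier can be built as a sequence of reads of unimportant addresses that places W fresh lines in every set, evicting whatever the cache held before. All geometry constants and the scratch range below are assumptions; as the text notes, real sequences depend on the controller cache's actual, often undocumented, characteristics, possibly inferred from observation.

```python
NUM_SETS = 16                    # assumed DRAM-cache geometry (illustrative)
WAYS = 4
LINE = 64                        # bytes per cache line (assumed)
SCRATCH = 0x0800_0000            # an address range of no importance (assumed)

def dcm_barrier_addresses():
    """Read sequence touching every cache set WAYS times with distinct lines."""
    seq = []
    for way in range(WAYS):
        for s in range(NUM_SETS):
            seq.append(SCRATCH + (way * NUM_SETS + s) * LINE)
    return seq

seq = dcm_barrier_addresses()    # issuing reads at these addresses fills the cache
```

A ranged variant such as DCW_BARRIER(start_addr, end_addr) would shorten this sequence by touching only the sets that addresses in {start_addr, end_addr} can map to.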
  • a non-volatile memory without a PIM processor may nevertheless have an interface allowing it to use all aspects of the invention, notably those allowing:
  • the address sequences to be used in the invention depend on the characteristics of the DRAM cache.
  • some analysis systems can analyze the traffic of the DRAM controller of an HCPU processor.
  • the interface may include physical means to record the last N transactions received, or at least a sufficient part of their characteristics, this record being accessible via the interface itself.
  • a DRAM memory is organized in banks, pages and columns.

US16/331,429, priority 2016-09-08, filed 2017-09-06: Methods and devices for bypassing the internal cache of an advanced DRAM memory controller. Status: Active, anticipated expiration 2039-04-04. Granted as US11494308B2.

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR1658373 2016-09-08
FR1658373A FR3055715B1 (fr) 2016-09-08 "Methodes et dispositifs pour contourner le cache interne d'un controleur memoire dram evolue" (Methods and devices for bypassing the internal cache of an advanced DRAM memory controller)
PCT/FR2017/052368 WO2018046850A1 (fr) 2017-09-06 "Methodes et dispositifs pour contourner le cache interne d'un controleur memoire dram evolue" (Methods and devices for bypassing the internal cache of an advanced DRAM memory controller)

Publications (2)

Publication Number Publication Date
US20210349826A1 2021-11-11
US11494308B2 2022-11-08

Family

ID=57750074

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/331,429 (US11494308B2): Methods and devices for bypassing the internal cache of an advanced DRAM memory controller. Status: Active, anticipated expiration 2039-04-04.

Country Status (6)

Country Link
US (1) US11494308B2
JP (1) JP2019531546A
KR (1) KR102398616B1
CN (1) CN109952567B
FR (1) FR3055715B1
WO (1) WO2018046850A1

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7218556B2 (ja) * 2018-12-06 2023-02-07 Fujitsu Limited Arithmetic processing device and method of controlling an arithmetic processing device
US11048636B2 (en) * 2019-07-31 2021-06-29 Micron Technology, Inc. Cache with set associativity having data defined cache sets
US11481255B2 (en) * 2019-09-10 2022-10-25 International Business Machines Corporation Management of memory pages for a set of non-consecutive work elements in work queue designated by a sliding window for execution on a coherent accelerator
US11403111B2 (en) 2020-07-17 2022-08-02 Micron Technology, Inc. Reconfigurable processing-in-memory logic using look-up tables
US11355170B1 (en) 2020-12-16 2022-06-07 Micron Technology, Inc. Reconfigurable processing-in-memory logic
US11868657B2 (en) * 2021-02-08 2024-01-09 Samsung Electronics Co., Ltd. Memory controller, method of operating the memory controller, and electronic device including the memory controller
US11354134B1 (en) * 2021-03-25 2022-06-07 Micron Technology, Inc. Processing-in-memory implementations of parsing strings against context-free grammars
US11921634B2 (en) * 2021-12-28 2024-03-05 Advanced Micro Devices, Inc. Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5361391A (en) 1992-06-22 1994-11-01 Sun Microsystems, Inc. Intelligent cache memory and prefetch method based on CPU data fetching characteristics
US5459856A (en) 1991-12-06 1995-10-17 Hitachi, Ltd. System having independent access paths for permitting independent access from the host and storage device to respective cache memories
EP0886216A1 (en) 1997-06-06 1998-12-23 Texas Instruments Incorporated Microprocessor comprising means for storing non-cacheable data
WO2000055734A1 (en) 1999-03-17 2000-09-21 Rambus, Inc. Dependent bank memory controller method and apparatus
US20030221078A1 (en) * 2002-05-24 2003-11-27 Jeddeloh Joseph M. Memory device sequencer and method supporting multiple memory device clock speeds
US20100205350A1 (en) * 2009-02-11 2010-08-12 Sandisk Il Ltd. System and method of host request mapping
US7975109B2 (en) * 2007-05-30 2011-07-05 Schooner Information Technology, Inc. System including a fine-grained memory and a less-fine-grained memory
FR3032814A1 (fr) 2015-02-18 Upmem

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6734867B1 (en) * 2000-06-28 2004-05-11 Micron Technology, Inc. Cache invalidation method and apparatus for a graphics processing system
US20060090034A1 (en) * 2004-10-22 2006-04-27 Fujitsu Limited System and method for providing a way memoization in a processing environment
JP3938177B2 (ja) * 2004-10-29 2007-06-27 Canon Inc. Data processing apparatus and memory allocation method in a data processing apparatus
US8253751B2 (en) * 2005-06-30 2012-08-28 Intel Corporation Memory controller interface for micro-tiled memory access
JP2007249314A (ja) * 2006-03-14 2007-09-27 Yaskawa Electric Corp High-speed data processing device
JP5526697B2 (ja) * 2009-10-14 2014-06-18 Sony Corp Storage device and memory system
US10020036B2 (en) * 2012-12-12 2018-07-10 Nvidia Corporation Address bit remapping scheme to reduce access granularity of DRAM accesses
US9075557B2 (en) * 2013-05-15 2015-07-07 SanDisk Technologies, Inc. Virtual channel for data transfers between devices
US20160232103A1 (en) * 2013-09-26 2016-08-11 Mark A. Schmisseur Block storage apertures to persistent memory
US9208103B2 (en) * 2013-09-26 2015-12-08 Cavium, Inc. Translation bypass in multi-stage address translation
US9690928B2 (en) * 2014-10-25 2017-06-27 Mcafee, Inc. Computing platform security methods and apparatus
EP3018587B1 (en) * 2014-11-05 2018-08-29 Renesas Electronics Europe GmbH Memory access unit

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5459856A (en) 1991-12-06 1995-10-17 Hitachi, Ltd. System having independent access paths for permitting independent access from the host and storage device to respective cache memories
US5361391A (en) 1992-06-22 1994-11-01 Sun Microsystems, Inc. Intelligent cache memory and prefetch method based on CPU data fetching characteristics
EP0886216A1 (en) 1997-06-06 1998-12-23 Texas Instruments Incorporated Microprocessor comprising means for storing non-cacheable data
WO2000055734A1 (en) 1999-03-17 2000-09-21 Rambus, Inc. Dependent bank memory controller method and apparatus
US6282604B1 (en) 1999-03-17 2001-08-28 Rambus Inc. Memory controller and method for memory devices with multiple banks of memory cells
US20030221078A1 (en) * 2002-05-24 2003-11-27 Jeddeloh Joseph M. Memory device sequencer and method supporting multiple memory device clock speeds
US7975109B2 (en) * 2007-05-30 2011-07-05 Schooner Information Technology, Inc. System including a fine-grained memory and a less-fine-grained memory
US20100205350A1 (en) * 2009-02-11 2010-08-12 Sandisk Il Ltd. System and method of host request mapping
FR3032814A1 (fr) 2015-02-18 Upmem
US20180039586A1 (en) 2015-02-18 2018-02-08 Upmem Dram circuit with integrated processor

Also Published As

Publication number Publication date
FR3055715B1 (fr) 2018-10-05
KR20190067171A (ko) 2019-06-14
WO2018046850A1 (fr) 2018-03-15
JP2019531546A (ja) 2019-10-31
US20210349826A1 (en) 2021-11-11
CN109952567A (zh) 2019-06-28
FR3055715A1 (fr) 2018-03-09
KR102398616B1 (ko) 2022-05-16
CN109952567B (zh) 2023-08-22

Similar Documents

Publication Publication Date Title
US11494308B2 (en) Methods and devices for bypassing the internal cache of an advanced DRAM memory controller
US10908821B2 (en) Use of outstanding command queues for separate read-only cache and write-read cache in a memory sub-system
US9286221B1 (en) Heterogeneous memory system
JP5526626B2 (ja) Arithmetic processing unit and address translation method
CN109952565B (zh) Memory access techniques
JP2012533124A (ja) Block-based non-transparent cache
US7472227B2 (en) Invalidating multiple address cache entries
US11106609B2 (en) Priority scheduling in queues to access cache data in a memory sub-system
WO2018231898A1 (en) Cache devices with configurable access policies and control methods thereof
CN111201518B (zh) Apparatus and method for managing capability metadata
US11016904B2 (en) Storage device for performing map scheduling and electronic device including the same
US8726248B2 (en) Method and apparatus for enregistering memory locations
US20190187964A1 (en) Method and Apparatus for Compiler Driven Bank Conflict Avoidance
JP5129023B2 (ja) Cache memory device
JP2020046761A (ja) Management device, information processing device, and memory control method
US20100257319A1 (en) Cache system, method of controlling cache system, and information processing apparatus
US10083135B2 (en) Cooperative overlay
US11561906B2 (en) Rinsing cache lines from a common memory page to memory
Jang et al. Achieving low write latency through new stealth program operation supporting early write completion in NAND flash memory
CN111742304A (zh) Method of accessing metadata when debugging a program to be executed on processing circuitry
EP4328755A1 (en) Systems, methods, and apparatus for accessing data in versions of memory pages
US12007917B2 (en) Priority scheduling in queues to access cache data in a memory sub-system

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: SMAL); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

AS Assignment

Owner name: UPMEM, FRANCE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ROY, JEAN-FRANCOIS;DEVAUX, FABRICE;REEL/FRAME:049228/0548

Effective date: 20190428

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE