WO2022222377A1 - Memory controller, data reading method, and memory system - Google Patents

Memory controller, data reading method, and memory system Download PDF

Info

Publication number
WO2022222377A1
WO2022222377A1 PCT/CN2021/120427 CN2021120427W WO2022222377A1 WO 2022222377 A1 WO2022222377 A1 WO 2022222377A1 CN 2021120427 W CN2021120427 W CN 2021120427W WO 2022222377 A1 WO2022222377 A1 WO 2022222377A1
Authority
WO
WIPO (PCT)
Prior art keywords
page
cache
data
memory controller
read address
Prior art date
Application number
PCT/CN2021/120427
Other languages
English (en)
French (fr)
Inventor
周轶刚
朱晓明
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2022222377A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0877Cache access modes
    • G06F12/0882Page mode
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1027Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629Configuration or reconfiguration of storage systems
    • G06F3/0631Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656Data buffering arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • The embodiments of the present application relate to the field of computers, and in particular to a memory controller, a data reading method, and a memory system.
  • Memory is an indispensable component of a server, accounting for roughly 30%-40% of the total system cost. Reducing memory cost without reducing (or only slightly reducing) performance is therefore an important means of lowering the system's total cost of ownership (TCO), and memory technology has become a research focus of major server vendors and cloud operators. Compressing memory data with a hardware compression engine, or replacing traditional memory with new media that have higher latency but lower cost (such as non-volatile memory), can significantly reduce memory cost, but the accompanying increase in memory access latency has a negative impact on application performance.
  • Embodiments of the present application provide a memory controller, a data reading method, and a memory system to address the problem of increased memory access latency.
  • The first aspect of the embodiments of the present application provides a memory controller. The memory controller includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, wherein the page table buffer stores cache entries corresponding to the second-level cache, and the cache entries are used to indicate data pages stored in the second-level cache.
  • The memory controller is configured to receive a read instruction through the host-side interface, determine from the read address carried by the read instruction that the first-level cache misses, and then query the page table buffer according to the read address. When it is determined that the data page corresponding to the read address has been cached in the second-level cache, the data corresponding to the read address is read from the second-level cache.
  • The memory controller defined in the embodiments of the present application contains a two-level cache and a page table buffer; the cache entries in the page table buffer record the data pages already cached in the second-level cache, so the data in the second-level cache is used effectively, reads from memory are reduced, the long latency of reading data from memory is effectively mitigated, and data access efficiency is improved.
  • When it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller is further configured to read the data corresponding to the read address from memory through a memory interface.
  • The memory controller is configured to, after reading the data page corresponding to the read address from memory through the memory interface, cache the data page in the second-level cache and add a cache entry corresponding to the read address to the page table buffer.
  • The memory controller is further configured to evict a target cache entry according to an eviction rule and write the data page corresponding to the target cache entry back to memory. The eviction rule may be LRU or the fewest-cached-pages principle, under which the huge page with the fewest cached data pages is evicted first.
  • The second-level cache stores decompressed data of data pages, while memory stores compressed data of data pages; the second-level cache is used to prefetch the compressed data in memory and cache the decompressed data corresponding to the compressed data.
  • The embodiments of the present application also provide a format for the read address and a format for the cache entry. The read address includes a page tag, a page index, and an intra-page offset; the cache entry includes a page tag, a cache flag, and a huge page index, wherein the huge page index is used to indicate the address of the huge page in the second-level cache, and the cache flag is used to indicate whether a data page has been cached in the huge page.
  • The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache flag corresponding to the page index of the read address to determine whether the data page corresponding to the read address has been cached in the second-level cache, and, if the data page has been cached, reads the data corresponding to the read address from the second-level cache.
  • After the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct the cache address corresponding to the read address, the cache address BPA being
  • BPA = Huge Page Index * M + Page Index * N + Page offset
  • where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the intra-page offset, M is the huge page size, and N is the page size.
  • The memory controller is specifically configured to read the data corresponding to the read address from the second-level cache according to the cache address.
  • If the number of bits of the page identifier is x, the number of bits of the cache flag is 2^x, where x is an integer greater than 0.
  • For example, if the page tag is 19 bits, the page index is 9 bits, and the intra-page offset is 12 bits, then the cache flag is 2^9 bits (512 bits).
  • The first-level cache is an SRAM and the second-level cache is a DRAM.
  • The page table buffer includes a first-level page table buffer and a second-level page table buffer, wherein the number of cache entries in the first-level page table buffer is smaller than the number of cache entries in the second-level page table buffer. For example, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K cache entries.
  • An embodiment of the present application provides a method for a memory controller to read data. The memory controller includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, wherein the page table buffer stores cache entries corresponding to the second-level cache, and the cache entries are used to indicate data pages stored in the second-level cache.
  • The method includes:
  • the memory controller receives a read instruction through the host-side interface, the read instruction carrying a read address;
  • the memory controller determines a first-level cache miss according to the read address;
  • the memory controller queries the page table buffer according to the read address and, when it is determined that the data page corresponding to the read address has been cached in the second-level cache, reads the data corresponding to the read address from the second-level cache.
  • The method further includes:
  • when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller reads the data corresponding to the read address from memory through a memory interface.
  • The method can also include:
  • after the memory controller reads the data page corresponding to the read address from memory through the memory interface, the data page is cached in the second-level cache, and a cache entry corresponding to the read address is added to the page table buffer.
  • The method can also include:
  • the memory controller evicts a target cache entry according to an eviction rule and writes the data page corresponding to the target cache entry back to memory.
  • The decompressed data of a data page is stored in the second-level cache, the compressed data of the data page is stored in memory, and the second-level cache is used to prefetch the compressed data in memory and cache the decompressed data corresponding to the compressed data.
  • The read address includes a page tag, a page index, and an intra-page offset; the cache entry includes a page tag, a cache flag, and a huge page index, wherein the huge page index is used to indicate the address of the huge page in the second-level cache, and the cache flag is used to indicate whether the data page has been cached in the huge page.
  • The step in which the memory controller queries the page table buffer according to the read address and, when it is determined that the data page corresponding to the read address has been cached in the second-level cache, reads the data corresponding to the read address from the second-level cache includes:
  • the memory controller queries the page table buffer according to the page tag in the read address and determines whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache flag corresponding to the page index of the read address to determine whether the data page corresponding to the read address has been cached in the second-level cache, and, if the data page has been cached, reads the data corresponding to the read address from the second-level cache.
  • The method further includes:
  • the memory controller constructs the cache address corresponding to the read address, the cache address BPA being:
  • BPA = Huge Page Index * M + Page Index * N + Page offset
  • where Huge Page Index is the huge page index, Page Index is the page index, and Page offset is the intra-page offset;
  • the memory controller reads the data corresponding to the read address from the second-level cache according to the cache address.
  • An embodiment of the present application further provides a memory system, including a memory and the memory controller according to the first aspect.
  • An embodiment of the present application further provides a chip, including a storage medium and hardware processing logic, wherein the storage medium stores instructions and the hardware processing logic is configured to execute the instructions in the storage medium to implement the method steps described in the second aspect or any possible implementation of the second aspect.
  • An embodiment of the present application further provides a server, including a processor and the memory system according to the third aspect.
  • An embodiment of the present application provides a computer-readable storage medium storing a computer program or instructions which, when executed by a processor in a server, implement the operational steps of the method described in the second aspect or any possible implementation of the second aspect.
  • An embodiment of the present application provides a computer program product including instructions which, when the computer program product runs on a server or terminal, cause the server or terminal to execute the instructions, so as to implement the operational steps of the method described in the second aspect or any possible implementation of the second aspect.
  • On the basis of the implementations provided in the above aspects, the present application may further combine them to provide more implementations.
  • FIG. 1 is a schematic structural diagram of a memory controller in an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a memory system provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of a format of a read address provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a data cache structure provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another data cache structure provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a method for reading data by a memory controller according to an embodiment of the present application.
  • In the embodiments of the present application, a second-level cache of a certain capacity based on a traditional memory medium (for example, Low Power Double Data Rate SDRAM, LPDDR) is introduced into the serial memory chip to cache decompressed data, and a small static random-access memory (SRAM) is introduced to support huge-page TLB (Translation Lookaside Buffer) indexing, thereby reducing the multiple memory accesses incurred by the traditional page table walk and improving the efficiency of data reading.
  • The embodiments of the present application introduce a small-capacity cache (SRAM), medium-capacity uncompressed memory, and large-capacity compressed memory or PCM-media memory on an external serial memory chip, which both reduces the cost of memory and avoids the increased access latency when the CPU accesses large-capacity compressed memory.
  • As shown in FIG. 1, the memory controller 100 includes a host-side interface 101, a first-level cache 102, a page table buffer 103, and a second-level cache 104, wherein the page table buffer 103 stores cache entries corresponding to the second-level cache 104, and the cache entries are used to indicate the data pages stored in the second-level cache 104. The memory controller 100 is configured to receive a read instruction through the host-side interface 101, determine from the read address carried by the read instruction that the first-level cache 102 misses, and then query the page table buffer 103 according to the read address; when it is determined that the data page corresponding to the read address has been cached in the second-level cache 104, the data corresponding to the read address is read from the second-level cache 104.
  • The memory controller further includes a memory interface 105.
  • The first-level cache 102 includes cachelines and tags corresponding to the cachelines. The memory controller 100 first looks up the data to be read in the first-level cache 102 according to the read address; if it hits, the hit cacheline is read directly from the first-level cache 102.
  • The memory controller also includes serial-to-parallel conversion logic.
  • The memory controller defined in the embodiments of the present application contains a two-level cache and a page table buffer; the cache entries in the page table buffer record the data pages already cached in the second-level cache, so the data in the second-level cache is used effectively, reads from memory are reduced, the long latency of reading data from memory is effectively mitigated, and data access efficiency is improved.
  • FIG. 2 is a schematic structural diagram of a memory system according to an embodiment of the present application.
  • The memory system 200 includes a memory controller 100 and a memory 205.
  • When it is determined that the data page corresponding to the read address is not cached in the second-level cache 104, the memory controller 100 is further configured to read the data corresponding to the read address from the memory 205 through the memory interface 105.
  • The data read from the memory 205 may be compressed data; after reading the compressed data, the memory controller 100 stores the corresponding decompressed data in the second-level cache 104.
  • The memory controller 100 is configured to, after reading the data page corresponding to the read address from the memory 205 through the memory interface 105, cache the data page in the second-level cache 104 and add a cache entry corresponding to the read address to the page table buffer 103.
  • The memory controller 100 is further configured to evict a target cache entry from the page table buffer 103 according to an eviction rule and write the data page corresponding to the target cache entry back to the memory 205. The eviction rule may be least recently used (LRU) or the fewest-cached-pages principle, under which the huge page with the fewest cached data pages is evicted first.
  • The second-level cache 104 stores decompressed data of data pages, the memory 205 stores compressed data of data pages, and the second-level cache 104 is used to prefetch the compressed data in the memory 205 and cache the decompressed data corresponding to the compressed data.
  • As shown in FIG. 3, the read address includes a page tag (Page Tag), a page index (Page Index), and an intra-page offset (Page Offset). For example, the page tag is 19 bits, the page index is 9 bits, and the intra-page offset is 12 bits.
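  • A minimal sketch of splitting a 40-bit physical address into these three fields (field widths as stated above; the helper names are ours, for illustration only):

```c
#include <stdint.h>

/* 40-bit physical address: [39:21] page tag, [20:12] page index, [11:0] offset */
static inline uint32_t pa_page_tag(uint64_t pa)    { return (uint32_t)((pa >> 21) & 0x7FFFF); } /* 19 bits */
static inline uint32_t pa_page_index(uint64_t pa)  { return (uint32_t)((pa >> 12) & 0x1FF); }   /* 9 bits  */
static inline uint32_t pa_page_offset(uint64_t pa) { return (uint32_t)(pa & 0xFFF); }           /* 12 bits */
```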
  • FIG. 4 is a schematic diagram of a data cache structure provided by an embodiment of the present application, in which the first-level cache records cachelines (CL) and CL tags and the page table buffer records cache entries. A cache entry includes a page tag (Page Tag), a cache flag (Buffered Flag), and a huge page index (Huge Page Index); one cache entry corresponds to one huge page in the second-level cache, and each data page in a huge page corresponds to one compressed page in memory.
  • For example, as shown in FIG. 5, the first-level cache is 64M in size, and in the cache entries of the page table buffer the cache flag is 2^9 bits (512 bits); each 1-bit cache flag corresponds to one data page in the huge page. When a data page prefetched from memory has been cached into the corresponding data page of the huge page, the cache flag can be recorded as 1; otherwise it can be recorded as 0.
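  • One possible in-memory layout for such a cache entry, assuming the example sizes above (a 512-bit buffered flag stored as eight 64-bit words); the struct and helper names are our illustration, not a definition from the patent:

```c
#include <stdbool.h>
#include <stdint.h>

struct tlb_entry {
    uint32_t page_tag;         /* high 19 bits of the physical address         */
    uint32_t huge_page_index;  /* which 2M huge page in the second-level cache */
    uint64_t buffered_flag[8]; /* 512 bits, one per 4K page in the huge page   */
};

/* Test whether the 4K page at page_index is already decompressed and cached. */
static inline bool page_is_buffered(const struct tlb_entry *e, uint32_t page_index)
{
    return (e->buffered_flag[page_index / 64] >> (page_index % 64)) & 1u;
}

/* Record that the 4K page at page_index is now cached in the huge page. */
static inline void mark_page_buffered(struct tlb_entry *e, uint32_t page_index)
{
    e->buffered_flag[page_index / 64] |= 1ull << (page_index % 64);
}
```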
  • The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache flag corresponding to the page index of the read address to determine whether the data page corresponding to the read address has been cached in the second-level cache, and, if the data page has been cached, reads the data corresponding to the read address from the second-level cache.
  • After the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct the cache address corresponding to the read address, the cache address BPA being
  • BPA = Huge Page Index * M + Page Index * N + Page offset
  • where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the intra-page offset, M is the huge page size, and N is the page size.
  • With the parameter sizes above, M is correspondingly 2M and N is 4K; the second-level cache then has a capacity of 32G, containing 16K huge pages of 2M each. In a specific implementation, the size and number of huge pages can be flexibly adjusted according to the size of the actual physical memory, which is not described further in this embodiment.
  • The first-level cache in the memory controller may use a faster static random-access memory (SRAM), with data stored in the form of cache lines. The CL tag is used to index the data in the first-level cache: the first-level cache is queried according to the physical address carried by the read instruction, and if the corresponding tag is found, the first-level cache hits and the data of the hit cacheline is read out.
  • The second-level cache in the memory controller may use DRAM. The data in the second-level cache is stored in the form of 2M huge pages, each composed of 512 4K pages.
  • The memory controller uses the page table buffer to index the second-level cache. A two-level page table buffer may be used, i.e., a first-level page table buffer and a second-level page table buffer, wherein the number of cache entries in the first-level page table buffer is smaller than the number of cache entries in the second-level page table buffer. For example, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K cache entries.
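  • A sketch of querying the two page-table-buffer levels in turn (128 entries in the first level, 16K in the second, per the example above); the linear scan is our simplification of whatever associative lookup the hardware would actually use, and the types come from the earlier illustrative sketch:

```c
/* Assumes struct tlb_entry from the previous sketch (page_tag member). */
struct tlb_level {
    struct tlb_entry *entries;
    unsigned          count;   /* e.g. 128 for level 1, 16K for level 2 */
};

static struct tlb_entry *tlb_lookup(struct tlb_level levels[2], uint32_t tag)
{
    for (int l = 0; l < 2; l++)        /* query the two TLB levels in turn */
        for (unsigned i = 0; i < levels[l].count; i++)
            if (levels[l].entries[i].page_tag == tag)
                return &levels[l].entries[i];
    return 0;                          /* miss: huge page not tracked */
}
```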
  • The page tag in a cache entry corresponds to the page tag of the physical address carried by the read instruction.
  • With the example of FIG. 5, the cache flag may be 512 bits, each bit corresponding to one 4K page in the 2M huge page; the cache flag indicates whether the corresponding 4K page has been decompressed and cached into the second-level cache (0: not cached, 1: cached).
  • The huge page index (Huge Page Index) in the cache entry represents the actual address of the current 2M huge page in the second-level cache, and the memory controller can determine the address of the corresponding huge page in the second-level cache through the huge page index.
  • Correspondingly, if the host-side physical address is present in the second-level cache, the buffered physical address (BPA, Buffered Physical Address) of the data in the second-level cache can be obtained by calculation. With a 40-bit memory address, 1T of memory space can be addressed.
  • The high-order 19 bits of the memory physical address are the page tag (Page tag), used to look up the 2M huge page in the page table buffer; the middle 9 bits correspond to the address of the specific 4K page to be accessed within the 2M huge page; and the low-order 12-bit page offset corresponds to the final address within the 4K page.
  • As shown in FIG. 6, a schematic flowchart of a method for a memory controller to read data includes:
  • 601: The memory controller receives the physical address, sent by the host side, for accessing the memory space, and first queries the first-level cache to confirm whether it hits; if it misses, step 602 is executed; if it hits, the cacheline data is returned.
  • 602: The memory controller queries the page table buffer (TLB) based on the page tag in the physical address and determines the cache entry corresponding to the page tag. If the cache flag corresponding to the physical address in that cache entry is 1 (illustratively, a value of 1 indicates that the data page corresponding to the flag bit has been cached in the second-level cache), the controller determines the huge page index contained in the cache entry, computes from it the address of the data to be accessed in the second-level cache, and executes step 603. If the cache flag corresponding to the physical address in the cache entry is 0 (illustratively, a value of 0 indicates that the data page corresponding to the flag bit is not cached in the second-level cache), step 604 is executed.
  • When the page table buffer contains two levels of TLB, the two TLB levels are queried in turn.
  • 603: The memory controller reads the data from the second-level cache according to the address of the data to be accessed in the second-level cache.
  • 604: The memory controller reads the data from memory according to the physical address.
  • With reference to FIG. 5, the memory controller can read 2K of compressed data from memory according to the physical address, store the decompressed 4K data into the corresponding 4K data page of a huge page in the second-level cache, and set the cache flag in the cache entry corresponding to that 4K data page to 1. A sketch of the whole flow is given below.
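  • Putting the pieces together, a high-level C sketch of steps 601-604; the helper names stand in for hardware blocks and reuse the illustrative types and functions from the earlier sketches, so this is our reading of the flow, not code from the patent:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stubs for hardware blocks (declarations only). */
bool l1_lookup(uint64_t pa, void *out);                 /* L1 cache probe    */
void l2_read(uint64_t bpa, void *out);                  /* read from L2 DRAM */
void memory_read_and_decompress(uint64_t pa, void *out, struct tlb_entry *e);

void handle_read(uint64_t pa, void *out, struct tlb_level tlb[2])
{
    if (l1_lookup(pa, out))                             /* 601: L1 hit? */
        return;

    struct tlb_entry *e = tlb_lookup(tlb, pa_page_tag(pa));      /* 602 */
    if (e && page_is_buffered(e, pa_page_index(pa))) {
        uint64_t bpa = buffered_physical_address(e->huge_page_index,
                                                 pa_page_index(pa),
                                                 pa_page_offset(pa));
        l2_read(bpa, out);                              /* 603: read from L2 */
        return;
    }

    memory_read_and_decompress(pa, out, e);             /* 604: read from memory */
    if (e)                                              /* mark the 4K page cached */
        mark_page_buffered(e, pa_page_index(pa));
}
```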
  • The memory controller may also evict a target cache entry according to an eviction rule and write the data page corresponding to the target cache entry back to memory. The eviction rule may be LRU or the fewest-cached-pages principle, under which the huge page with the fewest cached data pages is evicted first.
  • The embodiments of the present application provide a serial memory controller with a two-level cache, which offers a low-cost, low-latency memory access solution by caching decompressed data in the memory controller.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A memory controller, a data reading method, and a memory system, used to improve the efficiency of data reading. The memory controller (100) includes a host-side interface (101), a first-level cache (102), a page table buffer (103), and a second-level cache (104), where the page table buffer (103) stores cache entries corresponding to the second-level cache (104), and the cache entries are used to indicate data pages stored in the second-level cache (104). The memory controller (100) receives a read instruction through the host-side interface (101), the read instruction carrying a read address; when the memory controller (100) determines from the read address that the first-level cache (102) misses, it queries the page table buffer (103) according to the read address, and when it determines that the data page corresponding to the read address has been cached in the second-level cache (104), it reads the data corresponding to the read address from the second-level cache (104).

Description

Memory controller, data reading method, and memory system
Technical Field
The embodiments of the present application relate to the field of computers, and in particular to a memory controller, a data reading method, and a memory system.
Background
Memory is an indispensable component of a server, accounting for roughly 30%-40% of the total system cost. Reducing memory cost without reducing (or only slightly reducing) performance is therefore an important means of lowering the system's total cost of ownership (TCO), and memory technology has become a research focus of major server vendors and cloud operators. Compressing memory data with a hardware compression engine, or replacing traditional memory with new media that have higher latency but lower cost (such as non-volatile memory), can significantly reduce memory cost, but the accompanying increase in memory access latency has a negative impact on application performance.
Summary
Embodiments of the present application provide a memory controller, a data reading method, and a memory system to address the problem of increased memory access latency.
In a first aspect, an embodiment of the present application provides a memory controller. The memory controller includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, wherein the page table buffer stores cache entries corresponding to the second-level cache, and the cache entries are used to indicate data pages stored in the second-level cache. The memory controller is configured to receive a read instruction through the host-side interface, determine a first-level cache miss according to the read address carried by the read instruction, and then query the page table buffer according to the read address; when it is determined that the data page corresponding to the read address has been cached in the second-level cache, the data corresponding to the read address is read from the second-level cache.
The memory controller defined in the embodiments of the present application contains a two-level cache and a page table buffer; the cache entries in the page table buffer record the data pages already cached in the second-level cache, so that the data in the second-level cache is used effectively, reads from memory are reduced, the long latency of reading data from memory is effectively mitigated, and data access efficiency is improved.
Further, when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller is further configured to read the data corresponding to the read address from memory through a memory interface.
In a possible implementation, the memory controller is configured to, after reading the data page corresponding to the read address from memory through the memory interface, cache the data page in the second-level cache and add a cache entry corresponding to the read address to the page table buffer. Adding missed data pages to the second-level cache and adding cache entries to the page table buffer increases the number of cached data pages and further increases the likelihood of a data access hit.
The memory controller is further configured to evict a target cache entry according to an eviction rule and write the data page corresponding to the target cache entry back to memory. It should be noted that the eviction rule may be LRU or the fewest-cached-pages principle. The fewest-cached-pages principle means that the huge page with the fewest cached data pages is evicted first.
In another possible implementation, the second-level cache stores decompressed data of data pages, the memory stores compressed data of data pages, and the second-level cache is used to prefetch the compressed data in memory and cache the decompressed data corresponding to the compressed data.
The embodiments of the present application also provide a format for the read address and a format for the cache entry. For example, the read address includes a page tag, a page index, and an intra-page offset, and the cache entry includes a page tag, a cache flag, and a huge page index, wherein the huge page index is used to indicate the address of the huge page in the second-level cache, and the cache flag is used to indicate whether a data page has been cached in the huge page.
The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache flag corresponding to the page index of the read address to determine whether the data page corresponding to the read address has been cached in the second-level cache, and, if the data page corresponding to the read address has been cached, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct the cache address corresponding to the read address, the cache address BPA being
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the intra-page offset, M is the huge page size, and N is the page size.
The memory controller is specifically configured to read the data corresponding to the read address from the second-level cache according to the cache address.
If the number of bits of the page identifier is x, the number of bits of the cache flag is 2^x, where x is an integer greater than 0.
For example, if the page tag is 19 bits, the page index is 9 bits, and the intra-page offset is 12 bits, then the cache flag is 2^9 bits (512 bits).
In a possible implementation, the first-level cache is an SRAM and the second-level cache is a DRAM.
In a possible implementation, the page table buffer includes a first-level page table buffer and a second-level page table buffer, where the number of cache entries in the first-level page table buffer is smaller than the number of cache entries in the second-level page table buffer. For example, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K cache entries.
In a second aspect, an embodiment of the present application provides a data reading method performed by a memory controller. The memory controller includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, where the page table buffer stores cache entries corresponding to the second-level cache, and the cache entries are used to indicate data pages stored in the second-level cache.
The method includes:
the memory controller receives a read instruction through the host-side interface, the read instruction carrying a read address;
the memory controller determines a first-level cache miss according to the read address;
the memory controller queries the page table buffer according to the read address and, when it is determined that the data page corresponding to the read address has been cached in the second-level cache, reads the data corresponding to the read address from the second-level cache.
In a possible implementation, the method further includes:
when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller reads the data corresponding to the read address from memory through a memory interface.
Further, the method may also include:
after the memory controller reads the data page corresponding to the read address from memory through the memory interface, it caches the data page in the second-level cache and adds a cache entry corresponding to the read address to the page table buffer.
Still further, the method may also include:
the memory controller evicts a target cache entry according to an eviction rule and writes the data page corresponding to the target cache entry back to memory.
In a possible implementation, the second-level cache stores decompressed data of data pages, the memory stores compressed data of data pages, and the second-level cache is used to prefetch the compressed data in memory and cache the decompressed data corresponding to the compressed data.
For example, the read address includes a page tag, a page index, and an intra-page offset, and the cache entry includes a page tag, a cache flag, and a huge page index, where the huge page index is used to indicate the address of the huge page in the second-level cache, and the cache flag is used to indicate whether a data page has been cached in the huge page.
In this case, the step in which the memory controller queries the page table buffer according to the read address and, when it determines that the data page corresponding to the read address has been cached in the second-level cache, reads the data corresponding to the read address from the second-level cache includes:
the memory controller queries the page table buffer according to the page tag in the read address and determines whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache flag corresponding to the page index of the read address to determine whether the data page corresponding to the read address has been cached in the second-level cache, and, if the data page corresponding to the read address has been cached, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after it is determined that the data page corresponding to the read address has been cached in the second-level cache, the method further includes:
the memory controller constructs the cache address corresponding to the read address, the cache address BPA being:
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, and Page offset is the intra-page offset;
the memory controller reads the data corresponding to the read address from the second-level cache according to the cache address.
In a third aspect, an embodiment of the present application further provides a memory system, including a memory and the memory controller according to the first aspect.
In a fourth aspect, an embodiment of the present application further provides a chip, including a storage medium and hardware processing logic, where the storage medium stores instructions and the hardware processing logic is configured to execute the instructions in the storage medium to implement the method steps described in the second aspect or any possible implementation of the second aspect.
In a fifth aspect, an embodiment of the present application further provides a server, including a processor and the memory system according to the third aspect.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program or instructions which, when executed by a processor in a server, implement the operational steps of the method described in the second aspect or any possible implementation of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer program product including instructions which, when the computer program product runs on a server or terminal, cause the server or terminal to execute the instructions, so as to implement the operational steps of the method described in the second aspect or any possible implementation of the second aspect.
On the basis of the implementations provided in the above aspects, the present application may further combine them to provide more implementations.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a memory controller in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a memory system provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the format of a read address provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a data cache structure provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of another data cache structure provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of a method for a memory controller to read data provided by an embodiment of the present application.
Detailed Description
The terms "include" and "have" in the specification and claims of this application and in the above drawings, and any variants thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such a process, method, product, or device.
In memory-centric computing architecture scenarios, real-time compression/decompression of data in large-capacity memory has become an important means of reducing memory cost. For example, compressing/decompressing in units of 4K pages reduces cost but greatly increases the latency of CPU memory accesses. Although data can be cached in a large-capacity cache, this introduces a high-cost problem, and for large-capacity, randomly accessed workloads the benefit is limited.
The embodiments of the present application introduce into a serial memory chip a second-level cache of a certain capacity based on a traditional memory medium (for example, Low Power Double Data Rate SDRAM, LPDDR) to cache decompressed data, and introduce a small static random-access memory (SRAM) to support huge-page TLB (Translation Lookaside Buffer) indexing, thereby reducing the multiple memory accesses incurred by the traditional page table walk and improving the efficiency of data reading. Specifically, by introducing a small-capacity cache (SRAM), medium-capacity uncompressed memory, and large-capacity compressed memory or PCM-media memory on an external serial memory chip, the embodiments of the present application both reduce the cost of memory and avoid the increased access latency when the CPU accesses large-capacity compressed memory.
FIG. 1 is a schematic structural diagram of a memory controller provided by an embodiment of the present application. The memory controller 100 includes a host-side interface 101, a first-level cache 102, a page table buffer 103, and a second-level cache 104, wherein the page table buffer 103 stores cache entries corresponding to the second-level cache 104, and the cache entries are used to indicate the data pages stored in the second-level cache 104. The memory controller 100 is configured to receive a read instruction through the host-side interface 101, determine a first-level cache 102 miss according to the read address carried by the read instruction, and then query the page table buffer 103 according to the read address; when it is determined that the data page corresponding to the read address has been cached in the second-level cache 104, the data corresponding to the read address is read from the second-level cache 104.
Further, the memory controller also includes a memory interface 105.
The first-level cache 102 includes cachelines and tags corresponding to the cachelines. The memory controller 100 first looks up the data to be read in the first-level cache 102 according to the read address; if it hits in the first-level cache 102, it reads the hit cacheline directly from the first-level cache 102.
Further, the memory controller also includes serial-to-parallel conversion logic.
The memory controller defined in the embodiments of the present application contains a two-level cache and a page table buffer; the cache entries in the page table buffer record the data pages already cached in the second-level cache, so that the data in the second-level cache is used effectively, reads from memory are reduced, the long latency of reading data from memory is effectively mitigated, and data access efficiency is improved.
FIG. 2 is a schematic structural diagram of a memory system provided by an embodiment of the present application. The memory system 200 includes a memory controller 100 and a memory 205.
When it is determined that the data page corresponding to the read address is not cached in the second-level cache 104, the memory controller 100 is further configured to read the data corresponding to the read address from the memory 205 through the memory interface 105.
Specifically, the data read from the memory 205 may be compressed data; after reading the compressed data, the memory controller 100 stores the corresponding decompressed data in the second-level cache 104. The memory controller 100 is configured to, after reading the data page corresponding to the read address from the memory 205 through the memory interface 105, cache the data page in the second-level cache 104 and add a cache entry corresponding to the read address to the page table buffer 103.
The memory controller 100 is further configured to evict a target cache entry from the page table buffer 103 according to an eviction rule and write the data page corresponding to the target cache entry back to the memory 205. It should be noted that the eviction rule may be least recently used (LRU) or the fewest-cached-pages principle. The fewest-cached-pages principle means that the huge page with the fewest cached data pages is evicted first.
In another possible implementation, the second-level cache 104 stores decompressed data of data pages, the memory 205 stores compressed data of data pages, and the second-level cache 104 is used to prefetch the compressed data in the memory 205 and cache the decompressed data corresponding to the compressed data.
FIG. 3 is a schematic diagram of the format of a read address provided by an embodiment of the present application. The read address includes a page tag (Page Tag), a page index (Page Index), and an intra-page offset (Page Offset). For example, the page tag is 19 bits, the page index is 9 bits, and the intra-page offset is 12 bits.
FIG. 4 is a schematic diagram of a data cache structure provided by an embodiment of the present application, in which the first-level cache records cachelines (CL) and CL tags, and the page table buffer records cache entries. For example, a cache entry includes a page tag (Page Tag), a cache flag (Buffered Flag), and a huge page index (Huge Page Index); one cache entry corresponds to one huge page in the second-level cache, and each data page in a huge page corresponds to one compressed page in memory. For example, as shown in FIG. 5, the first-level cache is 64M in size, and in the cache entries of the page table buffer the cache flag is 2^9 bits (512 bits); each 1-bit cache flag corresponds to one data page in the huge page. When a data page prefetched from memory has been cached into the corresponding data page of the huge page, the cache flag can be recorded as 1; otherwise it can be recorded as 0.
The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache flag corresponding to the page index of the read address to determine whether the data page corresponding to the read address has been cached in the second-level cache, and, if the data page corresponding to the read address has been cached, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct the cache address corresponding to the read address, the cache address BPA being
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the intra-page offset, M is the huge page size, and N is the page size.
With the parameter sizes described above with reference to FIG. 5, M is correspondingly 2M and N is 4K. The second-level cache then has a capacity of 32G, containing 16K huge pages of 2M each. In a specific implementation, the size and number of huge pages can be flexibly adjusted according to the size of the actual physical memory, which is not described further in this embodiment of the present application.
The first-level cache in the memory controller may use a faster static random-access memory (SRAM), with data stored in the form of cache lines. The CL tag is used to index the data in the first-level cache. The first-level cache is queried according to the physical address carried by the read instruction; if the corresponding tag is found, the first-level cache hits, and the data of the hit cacheline is read out.
The second-level cache in the memory controller may use DRAM. For example, with reference to FIG. 5, the data in the second-level cache is stored in the form of 2M huge pages, each 2M huge page being composed of 512 4K pages.
The memory controller uses the page table buffer to index the second-level cache. For example, a two-level page table buffer may be used, i.e., a first-level page table buffer and a second-level page table buffer, where the number of cache entries in the first-level page table buffer is smaller than the number of cache entries in the second-level page table buffer. For example, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K cache entries.
The page tag in a cache entry corresponds to the page tag of the physical address carried by the read instruction.
With the example of FIG. 5, the cache flag may be 512 bits, each bit corresponding to one 4K page in the 2M huge page; the cache flag indicates whether the corresponding 4K page has been decompressed and cached into the second-level cache (for example, 0: not cached, 1: cached). The huge page index (Huge Page Index) in the cache entry represents the actual address of the current 2M huge page in the second-level cache, and the memory controller can determine the address of the corresponding huge page in the second-level cache through this index. Correspondingly, if the host-side physical address is present in the second-level cache, the buffered physical address (BPA, Buffered Physical Address) of the data in the second-level cache can be obtained by calculation. The formula for the BPA is:
BPA = Huge Page Index * 2M + Page Index * 4K + Page offset
With a 40-bit memory address, 1T of memory space can be addressed.
With reference to the read address format of FIG. 3, the high-order 19 bits of the memory physical address are the page tag (Page tag), which is used to look up the 2M huge page in the page table buffer; the middle 9 bits correspond to the address of the specific 4K page to be accessed within the 2M huge page; and the low-order 12-bit page offset corresponds to the final address within the 4K page.
FIG. 6 is a schematic flowchart of a method for a memory controller to read data provided by an embodiment of the present application, including:
601: The memory controller receives the physical address, sent by the host side, for accessing the memory space; it first queries the first-level cache to confirm whether it hits in the first-level cache. If it misses, step 602 is executed; if it hits, the cacheline data is returned.
602: The memory controller queries the page table buffer (TLB) based on the page tag in the physical address and determines the cache entry corresponding to the page tag. If the cache flag corresponding to the physical address in that cache entry is 1 (illustratively, a value of 1 indicates that the data page corresponding to the flag bit has been cached in the second-level cache), the controller determines the huge page index contained in the cache entry, computes from it the address of the data to be accessed in the second-level cache, and executes step 603. If the cache flag corresponding to the physical address in the cache entry is 0 (illustratively, a value of 0 indicates that the data page corresponding to the flag bit is not cached in the second-level cache), step 604 is executed.
Specifically, when the page table buffer contains two levels of TLB, the two TLB levels are queried in turn.
603: The memory controller reads the data from the second-level cache according to the address of the data to be accessed in the second-level cache.
604: The memory controller reads the data from memory according to the physical address.
With reference to FIG. 5, the memory controller can read 2K of compressed data from memory according to the physical address, store the decompressed 4K data into the corresponding 4K data page of a huge page in the second-level cache, and set the cache flag in the cache entry corresponding to that 4K data page to 1.
Further, if no cache entry is hit, the action of step 604 above is likewise executed.
Still further, the memory controller may also evict a target cache entry according to an eviction rule and write the data page corresponding to the target cache entry back to memory. The eviction rule may be LRU or the fewest-cached-pages principle. The fewest-cached-pages principle means that the huge page with the fewest cached data pages is evicted first.
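A sketch of the fewest-cached-pages eviction rule described above: scan the entries, pick the one whose huge page holds the fewest cached 4K pages, and have the caller write those pages back before reusing the entry. The popcount-based counting and linear scan are our simplifications, and the types come from the illustrative sketches earlier in this document:

```c
/* Assumes struct tlb_entry and struct tlb_level from the earlier sketches;
 * __builtin_popcountll is a GCC/Clang builtin. */
static unsigned cached_pages(const struct tlb_entry *e)
{
    unsigned n = 0;
    for (int i = 0; i < 8; i++)  /* 8 x 64 bits = 512-bit buffered flag */
        n += (unsigned)__builtin_popcountll(e->buffered_flag[i]);
    return n;
}

/* Choose the huge page with the fewest cached data pages as the victim. */
static struct tlb_entry *pick_victim(struct tlb_level *lvl)
{
    struct tlb_entry *victim = &lvl->entries[0];
    for (unsigned i = 1; i < lvl->count; i++)
        if (cached_pages(&lvl->entries[i]) < cached_pages(victim))
            victim = &lvl->entries[i];
    return victim; /* caller writes its cached pages back to memory first */
}
```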
The embodiments of the present application provide a serial memory controller with a two-level cache, which offers a low-cost, low-latency memory access solution by caching decompressed data in the memory controller.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the memory controller, method, and memory system described above may refer to the corresponding processes in the foregoing embodiments, and are not described again here.

Claims (22)

  1. A memory controller, comprising a host-side interface, a first-level cache, a page table buffer, and a second-level cache, wherein the page table buffer stores cache entries corresponding to the second-level cache, and the cache entries are used to indicate data pages stored in the second-level cache;
    the memory controller is configured to receive a read instruction through the host-side interface, the read instruction carrying a read address;
    the memory controller is further configured to determine a first-level cache miss according to the read address;
    the memory controller is further configured to query the page table buffer according to the read address and, when it is determined that the data page corresponding to the read address has been cached in the second-level cache, read the data corresponding to the read address from the second-level cache.
  2. The memory controller according to claim 1, wherein, when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller is further configured to read the data corresponding to the read address from memory through a memory interface.
  3. The memory controller according to claim 2, wherein
    the memory controller is configured to, after reading the data page corresponding to the read address from memory through the memory interface, cache the data page in the second-level cache and add a cache entry corresponding to the read address to the page table buffer.
  4. The memory controller according to claim 3, wherein
    the memory controller is further configured to evict a target cache entry according to an eviction rule and write the data page corresponding to the target cache entry back to the memory.
  5. The memory controller according to any one of claims 1-4, wherein
    the second-level cache stores decompressed data of data pages, the memory stores compressed data of data pages, and the second-level cache is used to prefetch the compressed data in the memory and cache the decompressed data corresponding to the compressed data.
  6. The memory controller according to any one of claims 1-5, wherein
    the read address includes a page tag, a page index, and an intra-page offset, and the cache entry includes a page tag, a cache flag, and a huge page index, wherein the huge page index is used to indicate the address of the huge page in the second-level cache, and the cache flag is used to indicate whether a data page has been cached in the huge page.
  7. The memory controller according to claim 6, wherein
    the memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer; if so, further check the cache flag corresponding to the page index of the read address and determine whether the data page corresponding to the read address has been cached in the second-level cache; and, if the data page corresponding to the read address has been cached, read the data corresponding to the read address from the second-level cache.
  8. The memory controller according to claim 7, wherein,
    after the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct the cache address corresponding to the read address, the cache address BPA being
    BPA = Huge Page Index * M + Page Index * N + Page offset
    where Huge Page Index is the huge page index, Page Index is the page index, and Page offset is the intra-page offset;
    the memory controller is specifically configured to read the data corresponding to the read address from the second-level cache according to the cache address.
  9. The memory controller according to claim 7, wherein,
    if the number of bits of the page identifier is x, the number of bits of the cache flag is 2^x.
  10. The memory controller according to any one of claims 1-9, wherein
    the first-level cache is an SRAM and the second-level cache is a DRAM.
  11. The memory controller according to any one of claims 1-10, wherein
    the page table buffer includes a first-level page table buffer and a second-level page table buffer, wherein the number of cache entries in the first-level page table buffer is smaller than the number of cache entries in the second-level page table buffer.
  12. A data reading method performed by a memory controller, wherein the memory controller includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, wherein the page table buffer stores cache entries corresponding to the second-level cache, and the cache entries are used to indicate data pages stored in the second-level cache;
    the method comprising:
    receiving, by the memory controller, a read instruction through the host-side interface, the read instruction carrying a read address;
    determining, by the memory controller, a first-level cache miss according to the read address;
    querying, by the memory controller, the page table buffer according to the read address and, when it is determined that the data page corresponding to the read address has been cached in the second-level cache, reading the data corresponding to the read address from the second-level cache.
  13. The method according to claim 12, further comprising:
    when it is determined that the data page corresponding to the read address is not cached in the second-level cache, reading, by the memory controller, the data corresponding to the read address from memory through a memory interface.
  14. The method according to claim 13, further comprising:
    after reading the data page corresponding to the read address from memory through the memory interface, caching, by the memory controller, the data page in the second-level cache and adding a cache entry corresponding to the read address to the page table buffer.
  15. The method according to claim 14, further comprising:
    evicting, by the memory controller, a target cache entry according to an eviction rule, and writing the data page corresponding to the target cache entry back to the memory.
  16. The method according to any one of claims 12-15, wherein
    the second-level cache stores decompressed data of data pages, the memory stores compressed data of data pages, and the second-level cache is used to prefetch the compressed data in the memory and cache the decompressed data corresponding to the compressed data.
  17. The method according to any one of claims 12-16, wherein
    the read address includes a page tag, a page index, and an intra-page offset, and the cache entry includes a page tag, a cache flag, and a huge page index, wherein the huge page index is used to indicate the address of the huge page in the second-level cache, and the cache flag is used to indicate whether a data page has been cached in the huge page.
  18. The method according to claim 17, wherein querying, by the memory controller, the page table buffer according to the read address and, when it is determined that the data page corresponding to the read address has been cached in the second-level cache, reading the data corresponding to the read address from the second-level cache comprises:
    querying, by the memory controller, the page table buffer according to the page tag in the read address, and determining whether a cache entry corresponding to the page tag exists in the page table buffer; if so, further checking the cache flag corresponding to the page index of the read address, and determining whether the data page corresponding to the read address has been cached in the second-level cache; and, if the data page corresponding to the read address has been cached, reading the data corresponding to the read address from the second-level cache.
  19. The method according to claim 18, wherein, after it is determined that the data page corresponding to the read address has been cached in the second-level cache, the method further comprises:
    constructing, by the memory controller, the cache address corresponding to the read address, the cache address BPA being:
    BPA = Huge Page Index * M + Page Index * N + Page offset
    where Huge Page Index is the huge page index, Page Index is the page index, and Page offset is the intra-page offset;
    reading, by the memory controller, the data corresponding to the read address from the second-level cache according to the cache address.
  20. A memory system, comprising a memory and the memory controller according to any one of claims 1-11.
  21. A chip, comprising a storage medium and hardware processing logic, wherein the storage medium stores instructions and the hardware processing logic is configured to execute the instructions in the storage medium to implement the method according to any one of claims 12-19.
  22. A server, comprising a processor and the memory system according to claim 20.
PCT/CN2021/120427 2021-04-23 2021-09-24 Memory controller, data reading method, and memory system WO2022222377A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110441816 2021-04-23
CN202110441816.8 2021-04-23
CN202111082943.XA CN115237585A (zh) 2021-04-23 2021-09-15 Memory controller, data reading method, and memory system
CN202111082943.X 2021-09-15

Publications (1)

Publication Number Publication Date
WO2022222377A1 (zh)

Family

ID=83666458

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/120427 WO2022222377A1 (zh) 2021-04-23 2021-09-24 Memory controller, data reading method, and memory system

Country Status (2)

Country Link
CN (1) CN115237585A (zh)
WO (1) WO2022222377A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501696A (zh) * 2023-06-30 2023-07-28 Zhejiang Lab Method and apparatus for prefetch cache management suitable for distributed deep learning training

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115712392A (zh) * 2022-11-15 2023-02-24 中科芯集成电路有限公司 Buffer-based Cache controller and working method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334759A (zh) * 2007-06-28 2008-12-31 International Business Machines Corporation Method and system for accessing a processor cache
CN109582600A (zh) * 2017-09-25 2019-04-05 Huawei Technologies Co., Ltd. Data processing method and apparatus
US20200218665A1 (en) * 2017-07-31 2020-07-09 Arm Limited Address translation cache
CN112527395A (zh) * 2020-11-20 2021-03-19 Hygon Information Technology Co., Ltd. Data prefetching method and data processing apparatus
CN112631962A (zh) * 2019-09-24 2021-04-09 Alibaba Group Holding Limited Storage management apparatus, storage management method, processor, and computer system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334759A (zh) * 2007-06-28 2008-12-31 International Business Machines Corporation Method and system for accessing a processor cache
US20200218665A1 (en) * 2017-07-31 2020-07-09 Arm Limited Address translation cache
CN109582600A (zh) * 2017-09-25 2019-04-05 Huawei Technologies Co., Ltd. Data processing method and apparatus
CN112631962A (zh) * 2019-09-24 2021-04-09 Alibaba Group Holding Limited Storage management apparatus, storage management method, processor, and computer system
CN112527395A (zh) * 2020-11-20 2021-03-19 Hygon Information Technology Co., Ltd. Data prefetching method and data processing apparatus

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501696A (zh) * 2023-06-30 2023-07-28 Zhejiang Lab Method and apparatus for prefetch cache management suitable for distributed deep learning training
CN116501696B (zh) * 2023-06-30 2023-09-01 Zhejiang Lab Method and apparatus for prefetch cache management suitable for distributed deep learning training

Also Published As

Publication number Publication date
CN115237585A (zh) 2022-10-25

Similar Documents

Publication Publication Date Title
US10248576B2 (en) DRAM/NVM hierarchical heterogeneous memory access method and system with software-hardware cooperative management
CN104346294B (zh) Multi-level-cache-based data read/write method, apparatus, and computer system
WO2022222377A1 (zh) Memory controller, data reading method, and memory system
JP6505132B2 (ja) Memory controller utilizing memory capacity compression, and related processor-based systems and methods
CN109219804B (zh) Non-volatile memory access method, apparatus, and system
US7623134B1 (en) System and method for hardware-based GPU paging to system memory
US8335908B2 (en) Data processing apparatus for storing address translations
US11210020B2 (en) Methods and systems for accessing a memory
CN111061655B (zh) Address translation method and device for a storage device
KR20080063512A (ko) Multi-level update of translation lookaside buffer (TLB) fields
JP2013529815A (ja) Region-based technique for accurately predicting memory accesses
Ouyang et al. SSD-assisted hybrid memory to accelerate memcached over high performance networks
US20240303202A1 (en) Method and apparatus for solving cache address alias
Park et al. Compression support for flash translation layer
CN110389911A (zh) Prefetching method, apparatus, and system for a device memory management unit
EP3475833A1 (en) Pre-fetch mechanism for compressed memory lines in a processor-based system
TWI453584B (zh) 處理非對準式記憶體存取的設備、系統及方法
CN108874691B (zh) Data prefetching method and memory controller
CN113190499A (zh) Cooperative prefetcher for large-capacity on-chip caches and control method thereof
Liu et al. A High Performance Memory Key-Value Database Based on Redis.
KR20210037216A (ko) Memory management unit that manages memory address translation tables using heterogeneous memory, and memory address management method thereof
KR20120127108A (ko) Memory system
Benveniste et al. Cache-memory interfaces in compressed memory systems
EP3757804A1 (en) Page tables for granular allocation of memory pages
JP2003281079A5 (zh)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21937599

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21937599

Country of ref document: EP

Kind code of ref document: A1