CN116450045A - Implementation method, system and medium of hybrid memory system for HBM - Google Patents

Publication number
CN116450045A
Authority
CN
China
Prior art keywords
hbm
data
heat
access request
hybrid memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310431062.7A
Other languages
Chinese (zh)
Inventor
尹吉
黄林鹏
郑圣安
花一帆
陈纬栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN202310431062.7A
Publication of CN116450045A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G06F3/062 Securing storage systems
    • G06F3/0622 Securing storage systems in relation to access
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0685 Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides an implementation method, system and medium for an HBM-oriented hybrid memory system, comprising the following steps: initializing and building a hybrid memory controller, and creating a heat monitoring table and an address translation table; the hybrid memory system receives an access request; the storage location of data is reallocated according to the heat of the data, the address translation table is updated, and the access request is redirected; data migration and caching are performed according to the address translation table, and the access request is completed. Targeting the high-bandwidth characteristic of the HBM, the invention divides the HBM into a c-HBM region and an m-HBM region and classifies data heat at fine granularity, thereby raising the access speed of hot data, reducing the cost of data migration and lowering the overall access latency of the memory system. Compared with a conventional memory system, adding only a small-capacity HBM greatly improves the overall access speed of the memory system while providing a larger memory space; the system performs well under a variety of workloads and has good market prospects and application value.

Description

Implementation method, system and medium of hybrid memory system for HBM
Technical Field
The invention relates to the technical field of computer system architecture, and in particular to an implementation method, system and medium for an HBM-oriented hybrid memory system.
Background
High Bandwidth Memory (HBM) is a high-performance DRAM that, compared with off-chip memory, offers several times the bandwidth and lower power consumption at slightly higher latency.
Because of these characteristics, using HBM in a memory system can bring a considerable performance improvement. For process reasons HBM is expensive, and it is difficult to deploy large-capacity HBM in a memory system, so a hybrid memory system must be built together with off-chip memory to obtain a larger memory capacity.
In a workload, the small portion of data that is accessed frequently is called hot data. Accesses to hot data account for a large share of all accesses, so placing hot data in the higher-bandwidth HBM speeds up access to it and thereby improves the read/write performance of the memory system. When the HBM is used as c-HBM (cache-HBM), hot data from the off-chip memory is cached in the HBM, which improves the overall access speed of the memory system. However, caching data from the off-chip memory into the HBM occupies a large amount of bandwidth and introduces additional access overhead.
Some workloads need a very large memory space; when memory is insufficient, the system uses the hard disk as virtual memory, which greatly reduces data access speed and degrades the performance of the memory system. When the HBM is used as m-HBM (memory-HBM), the HBM and the off-chip memory sit side by side and jointly form the memory space, with addresses visible to the operating system, which enlarges the capacity of the memory system.
Existing hybrid memory systems generally use the HBM as either c-HBM or m-HBM only and cannot maintain high performance across different working scenarios; moreover, their heat statistics are coarse and their data migration overhead is large, so the characteristics of the HBM are not fully exploited.
HBM: High Bandwidth Memory, a high-performance DRAM based on a 3D stacking process that achieves higher bandwidth in a smaller volume and at lower power than off-chip memory.
DRAM: Dynamic Random Access Memory, the type of memory widely used in current memory systems. Common DRAM types in current memory systems include DDR4 SDRAM and DDR5 SDRAM.
Off-chip memory: memory that is not integrated into the processor chip; at present it is widely used in the form of memory modules that can easily be installed in, and replaced in, an electronic system.
Cache: a structure placed between two hardware components with a large difference in speed, used to bridge the difference in their data transfer rates.
Granularity: the unit size of accessed data; the common ones are page granularity and block granularity. Page granularity is the size of one memory page, determined by the operating system and typically 4 KiB or larger; block granularity is the size of one cache line, typically 64 bytes.
Patent document CN114063914A (application number: 202111305042.2) discloses a data management method for a DRAM-HBM hybrid memory, comprising: acquiring a data operation request; determining, according to the data operation request, the operated object and the corresponding data operation interface; and, based on the data operation interface, migrating operated objects with different data attributes between different storage spaces in the DRAM-HBM hybrid memory.
Disclosure of Invention
In view of the defects in the prior art, the object of the invention is to provide an implementation method, system and medium for an HBM-oriented hybrid memory system.
The implementation method of the HBM-oriented hybrid memory system provided by the invention comprises the following steps:
Step S1: initializing and building a hybrid memory controller, creating a heat monitoring table and an address translation table, and partitioning the HBM of the hybrid memory into regions, the regions comprising a c-HBM region and an m-HBM region;
Step S2: the hybrid memory system receives an access request and updates the heat monitoring table;
Step S3: reallocating the storage location of data according to the heat of the data, updating the address translation table, and redirecting the access request;
Step S4: performing data migration and caching according to the address translation table, and completing the access request.
Preferably, the step S1 includes the following steps:
Step S1.1: obtaining the capacity of the HBM and the capacity of the off-chip memory;
Step S1.2: setting the set associativity (number of ways); setting the sizes of the c-HBM and the m-HBM, and completing the partition;
Step S1.3: creating a heat statistics module, an address translation module and a migration control module in the hybrid memory controller;
Step S1.4: creating the heat monitoring table and the address translation table;
Step S1.5: setting the size of the hybrid memory to the sum of the m-HBM capacity and the off-chip memory capacity, and reporting the memory address space to the operating system.
Preferably, the step S2 includes the following steps:
Step S2.1: obtaining the accessed memory address from the access request;
Step S2.2: classifying data as cold, warm or hot according to access frequency, updating the heat information of all data at block granularity and page granularity according to the algorithm, and writing the updated information to the heat monitoring table;
Step S2.3: collecting the data whose heat information has changed and passing it to the address translation module.
Preferably, the step S3 includes the following steps:
Step S3.1: receiving the heat change information and, according to the heat change of each data block, determining whether to cache it in the c-HBM, evict it from the c-HBM, or migrate data between the m-HBM and the off-chip memory, and passing the data migration and caching requests to the migration control module;
Step S3.2: updating the address translation table to record the actual storage location of every memory page and whether each data block of the page is cached in the c-HBM;
Step S3.3: redirecting the access request.
Preferably, in the step S3.1:
when data in the off-chip memory becomes hot, it is cached in the c-HBM;
when data in the c-HBM becomes cold, it is evicted and written back to the off-chip memory, and any data migration that may be required is performed;
when data in the c-HBM becomes warm, it is evicted and written back to the m-HBM, and any data migration that may be required is performed;
when data in the m-HBM becomes cold, it is swapped with the corresponding page in the off-chip memory, that is, data migration is performed.
Preferably, the step S4 includes the following steps:
Step S4.1: if a dirty data block is evicted from the c-HBM, completing the corresponding write-back and migration operations;
Step S4.2: if caching is triggered, completing the caching of the data block containing the accessed address, and completing the access request.
Preferably, in the step S4.1:
if a dirty data block is evicted and the write-back address is in the off-chip memory, the data block is written directly back to the off-chip memory;
if a dirty data block is evicted and the write-back address is in the m-HBM, the corresponding data pages of the m-HBM and the off-chip memory are read into the hybrid memory controller, the write-back of the data block is completed in the controller, and the two data pages are then written to the off-chip memory and the m-HBM respectively to complete the migration.
Preferably, in the step S4.2:
if the access request does not trigger caching, the access request is sent directly to the hybrid memory;
if the access request triggers caching, the access request is added to a waiting queue, the data block containing the target address of the access request is read into the migration control module, the read or write of the access request is completed directly in the memory controller, and the data block is then written to the c-HBM.
The invention also provides a system implementing the HBM-oriented hybrid memory system, comprising the following modules:
Module M1: initializing and building a hybrid memory controller, creating a heat monitoring table and an address translation table, and partitioning the HBM of the hybrid memory into regions, the regions comprising a c-HBM region and an m-HBM region;
Module M2: the hybrid memory system receives an access request and updates the heat monitoring table;
Module M3: reallocating the storage location of data according to the heat of the data, updating the address translation table, and redirecting the access request;
Module M4: performing data migration and caching according to the address translation table, and completing the access request.
The invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the implementation method of the HBM-oriented hybrid memory system described above.
Compared with the prior art, the invention has the following beneficial effects:
1. targeting the high-bandwidth characteristic of the HBM, the invention provides an HBM-oriented hybrid memory system that divides the HBM into a c-HBM region and an m-HBM region and classifies data heat at fine granularity, thereby raising the access speed of hot data, reducing the cost of data migration and lowering the overall access latency of the memory system;
2. compared with a conventional memory system, the invention adds only a small-capacity HBM yet greatly improves the overall access speed of the memory system while providing a larger memory space; it performs well under a variety of workloads and has good market prospects and application value.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, given with reference to the accompanying drawings in which:
FIG. 1 is a schematic diagram of a hybrid memory system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a hybrid memory allocation scheme according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of data mapping in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a data allocation method in an embodiment of the present invention;
FIG. 5 is a schematic diagram of an address translation table in an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the present invention, but are not intended to limit the invention in any way. It should be noted that variations and modifications could be made by those skilled in the art without departing from the inventive concept. These are all within the scope of the present invention.
Example 1:
The implementation method of the HBM-oriented hybrid memory system provided by the invention comprises the following steps:
Step S1: initializing and building a hybrid memory controller, creating a heat monitoring table and an address translation table, and partitioning the HBM of the hybrid memory into regions, the regions comprising a c-HBM region and an m-HBM region;
Step S1.1: obtaining the capacity of the HBM and the capacity of the off-chip memory;
Step S1.2: setting the set associativity (number of ways); setting the sizes of the c-HBM and the m-HBM, and completing the partition;
Step S1.3: creating a heat statistics module, an address translation module and a migration control module in the hybrid memory controller;
Step S1.4: creating the heat monitoring table and the address translation table;
Step S1.5: setting the size of the hybrid memory to the sum of the m-HBM capacity and the off-chip memory capacity, and reporting the memory address space to the operating system.
Step S2: the hybrid memory system receives an access request and updates the heat monitoring table;
Step S2.1: obtaining the accessed memory address from the access request;
Step S2.2: classifying data as cold, warm or hot according to access frequency, updating the heat information of all data at block granularity and page granularity according to the algorithm, and writing the updated information to the heat monitoring table;
Step S2.3: collecting the data whose heat information has changed and passing it to the address translation module.
Step S3: reallocating the storage location of data according to the heat of the data, updating the address translation table, and redirecting the access request;
Step S3.1: receiving the heat change information and, according to the heat change of each data block, determining whether to cache it in the c-HBM, evict it from the c-HBM, or migrate data between the m-HBM and the off-chip memory, and passing the data migration and caching requests to the migration control module;
when data in the off-chip memory becomes hot, it is cached in the c-HBM;
when data in the c-HBM becomes cold, it is evicted and written back to the off-chip memory, and any data migration that may be required is performed;
when data in the c-HBM becomes warm, it is evicted and written back to the m-HBM, and any data migration that may be required is performed;
when data in the m-HBM becomes cold, it is swapped with the corresponding page in the off-chip memory, that is, data migration is performed.
Step S3.2: updating the address translation table to record the actual storage location of every memory page and whether each data block of the page is cached in the c-HBM;
Step S3.3: redirecting the access request.
Step S4: performing data migration and caching according to the address translation table, and completing the access request.
Step S4.1: if a dirty data block is evicted from the c-HBM, completing the corresponding write-back and migration operations;
if a dirty data block is evicted and the write-back address is in the off-chip memory, the data block is written directly back to the off-chip memory;
if a dirty data block is evicted and the write-back address is in the m-HBM, the corresponding data pages of the m-HBM and the off-chip memory are read into the hybrid memory controller, the write-back of the data block is completed in the controller, and the two data pages are then written to the off-chip memory and the m-HBM respectively to complete the migration.
Step S4.2: if caching is triggered, completing the caching of the data block containing the accessed address, and completing the access request.
If the access request does not trigger caching, the access request is sent directly to the hybrid memory;
if the access request triggers caching, the access request is added to a waiting queue, the data block containing the target address of the access request is read into the migration control module, the read or write of the access request is completed directly in the memory controller, and the data block is then written to the c-HBM.
The invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the implementation method of the HBM-oriented hybrid memory system described above.
The invention also provides a system implementing the HBM-oriented hybrid memory system, which can be realized by executing the steps of the implementation method described above; that is, those skilled in the art may understand the implementation method as a preferred embodiment of the system.
Example 2:
The invention also provides a system implementing the HBM-oriented hybrid memory system, comprising the following modules:
Module M1: initializing and building a hybrid memory controller, creating a heat monitoring table and an address translation table, and partitioning the HBM of the hybrid memory into regions, the regions comprising a c-HBM region and an m-HBM region;
Module M2: the hybrid memory system receives an access request and updates the heat monitoring table;
Module M3: reallocating the storage location of data according to the heat of the data, updating the address translation table, and redirecting the access request;
Module M4: performing data migration and caching according to the address translation table, and completing the access request.
Example 3:
In view of the defects in the prior art, the object of the invention is to provide an implementation method, system and medium for an HBM-oriented hybrid memory system.
The implementation method of the HBM-oriented hybrid memory system provided by the invention comprises the following steps:
Step 1: initializing and building a hybrid memory controller, creating a heat monitoring table and an address translation table, and partitioning the HBM of the hybrid memory into regions, the regions comprising a c-HBM region and an m-HBM region;
Step 2: the hybrid memory system receives an access request and updates the heat monitoring table;
Step 3: reallocating the storage location of data according to the heat of the data, updating the address translation table, and redirecting the access request;
Step 4: performing data migration and caching according to the address translation table, and completing the access request.
Preferably, the step 1 includes:
Step 1.1: obtaining the capacity of the HBM and the capacity of the off-chip memory;
Step 1.2: setting the set associativity (number of ways); setting the sizes of the c-HBM and the m-HBM, and completing the partition;
Step 1.3: creating a heat statistics module, an address translation module and a migration control module in the hybrid memory controller;
Step 1.4: creating the heat monitoring table and the address translation table;
Step 1.5: setting the size of the hybrid memory to the sum of the m-HBM capacity and the off-chip memory capacity, and reporting the memory address space to the operating system.
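The initialization in step 1 can be illustrated with a minimal Python sketch. The class name `HybridMemoryLayout`, the function `partition_hbm` and the 1 GiB total HBM capacity are assumptions made for illustration only; the text fixes only that the HBM is split into a c-HBM and an m-HBM region and that the operating system sees the m-HBM plus the off-chip memory.

```python
from dataclasses import dataclass

MIB = 1 << 20
GIB = 1 << 30


@dataclass(frozen=True)
class HybridMemoryLayout:
    c_hbm_bytes: int     # cache region, not part of the OS-visible space
    m_hbm_bytes: int     # memory region, visible to the OS
    off_chip_bytes: int  # off-chip memory, visible to the OS
    ways: int            # set associativity of the c-HBM

    @property
    def os_visible_bytes(self) -> int:
        # Step 1.5: report m-HBM capacity plus off-chip capacity.
        return self.m_hbm_bytes + self.off_chip_bytes


def partition_hbm(hbm_bytes, off_chip_bytes, c_hbm_bytes, ways):
    """Split the HBM into a c-HBM region and an m-HBM region (step 1.2)."""
    if not 0 < c_hbm_bytes < hbm_bytes:
        raise ValueError("c-HBM must be a non-empty proper part of the HBM")
    return HybridMemoryLayout(
        c_hbm_bytes=c_hbm_bytes,
        m_hbm_bytes=hbm_bytes - c_hbm_bytes,
        off_chip_bytes=off_chip_bytes,
        ways=ways,
    )


# Hypothetical sizes: 1 GiB HBM, 16 GiB off-chip memory, 64 MiB c-HBM.
layout = partition_hbm(1 * GIB, 16 * GIB, 64 * MIB, ways=16)
print(layout.os_visible_bytes // MIB, "MiB visible to the operating system")
```

With these hypothetical sizes the m-HBM region is 960 MiB, so the operating system is told about 16 GiB + 960 MiB of memory while the 64 MiB c-HBM stays hidden behind the controller.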
Preferably, the step 2 includes:
Step 2.1: obtaining the accessed memory address from the access request;
Step 2.2: classifying data as cold, warm or hot according to access frequency, updating the heat information of all data at block granularity and page granularity according to the algorithm, and writing the updated information to the heat monitoring table;
Step 2.3: collecting the data whose heat information has changed and passing this information to the address translation module.
Preferably, the step 3 includes:
Step 3.1: receiving the heat change information and, according to the heat change of each data block, determining whether to cache it in the c-HBM, evict it from the c-HBM, or migrate data between the m-HBM and the off-chip memory, and passing the data migration and caching requests to the migration control module:
when data in the off-chip memory becomes hot, it is cached in the c-HBM;
when data in the c-HBM becomes cold, it is evicted and written back to the off-chip memory, and any data migration that may be required is performed;
when data in the c-HBM becomes warm, it is evicted and written back to the m-HBM, and any data migration that may be required is performed;
when data in the m-HBM becomes cold, it is swapped with the corresponding page in the off-chip memory, that is, data migration is performed;
Step 3.2: updating the address translation table to record the actual storage location of every memory page and whether each data block of the page is cached in the c-HBM;
Step 3.3: redirecting the access request:
when the data block containing the accessed address is cached in the c-HBM, the accessed address is replaced with the corresponding address in the c-HBM;
when the data block containing the accessed address is not cached, the location of the data page containing the block is looked up, and the accessed address is replaced with the corresponding address at the actual location of that data page.
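The redirection in step 3.3 amounts to a two-level lookup: first check whether the block is cached in the c-HBM, otherwise fall back to where the page actually lives. A minimal Python sketch follows; the `PageEntry` layout, the device labels and the 4 KiB / 64-byte sizes are illustrative assumptions, since the actual format of the address translation table (FIG. 5) is not reproduced in the text.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed 4 KiB page
BLOCK_SIZE = 64   # assumed 64-byte block


@dataclass
class PageEntry:
    device: str   # "m-HBM" or "off-chip": where the page actually lives
    base: int     # base address of the page inside that device
    cached: dict = field(default_factory=dict)  # block index -> c-HBM address


def redirect(table, address):
    """Map an OS-visible address to (device, address inside that device)."""
    page, offset = divmod(address, PAGE_SIZE)
    block, byte = divmod(offset, BLOCK_SIZE)
    entry = table[page]
    if block in entry.cached:
        # The block is cached: use its address in the c-HBM.
        return "c-HBM", entry.cached[block] + byte
    # Not cached: use the actual location of the enclosing page.
    return entry.device, entry.base + offset


# Hypothetical table: page 2 lives off-chip, its block 1 is cached at 0x400.
table = {2: PageEntry(device="off-chip", base=0x8000, cached={1: 0x400})}
print(redirect(table, 2 * PAGE_SIZE + 69))   # block 1, byte 5 of page 2
print(redirect(table, 2 * PAGE_SIZE + 200))  # block 3 of page 2, not cached
```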
Preferably, the step 4 includes:
completing the data caching, the data migration and the access request, including:
Step 4.1: if a dirty data block is evicted and the write-back address is in the off-chip memory, the data block is written directly back to the off-chip memory;
if a dirty data block is evicted and the write-back address is in the m-HBM, the corresponding data pages of the m-HBM and the off-chip memory are read into the hybrid memory controller, the write-back of the data block is completed in the controller, and the two data pages are then written to the off-chip memory and the m-HBM respectively to complete the migration;
Step 4.2: if the access request does not trigger caching, the access request is sent directly to the hybrid memory;
if the access request triggers caching, the access request is added to a waiting queue, the data block containing the target address of the access request is read into the migration control module, the read or write of the access request is completed directly in the memory controller, and the data block is then written to the c-HBM.
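The two write-back cases of step 4.1 can be sketched in Python as follows. This is one possible reading, not the patented implementation: it assumes the evicted block's home page currently resides in the off-chip memory and that, in the m-HBM case, this page is swapped with a page of the m-HBM. The dictionaries standing in for the two devices and the function name `write_back_dirty_block` are hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KiB page
BLOCK_SIZE = 64   # assumed 64-byte block


def write_back_dirty_block(block, block_index, target, off_chip, m_hbm,
                           off_page, hbm_page=None):
    """Write a dirty block evicted from the c-HBM back to its page.

    off_chip and m_hbm are stand-ins for the two devices: dicts mapping
    a page number to a bytearray of PAGE_SIZE bytes.
    """
    if len(block) != BLOCK_SIZE:
        raise ValueError("block must be exactly BLOCK_SIZE bytes")
    start = block_index * BLOCK_SIZE

    if target == "off-chip":
        # Case 1: patch the block directly into the off-chip page.
        off_chip[off_page][start:start + BLOCK_SIZE] = block
        return

    # Case 2 (target is the m-HBM): read both pages into the controller,
    page_to_promote = bytearray(off_chip[off_page])
    page_to_demote = bytearray(m_hbm[hbm_page])
    # apply the write-back inside the controller,
    page_to_promote[start:start + BLOCK_SIZE] = block
    # then write the two pages to each other's former location.
    m_hbm[hbm_page] = page_to_promote
    off_chip[off_page] = page_to_demote


off_chip = {7: bytearray(b"a" * PAGE_SIZE)}
m_hbm = {0: bytearray(b"b" * PAGE_SIZE)}
write_back_dirty_block(b"X" * BLOCK_SIZE, 2, "m-HBM", off_chip, m_hbm,
                       off_page=7, hbm_page=0)
```

Doing the write-back inside the controller means the dirty block never needs a separate write to the off-chip page before that page is moved, which is how this reading saves one device access per migrated block.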
The system implementing the HBM-oriented hybrid memory system provided by the invention comprises:
Module 1: initializing and building a hybrid memory controller, creating a heat monitoring table and an address translation table, and partitioning the HBM of the hybrid memory into regions, the regions comprising a c-HBM region and an m-HBM region;
Module 2: the hybrid memory system receives an access request and updates the heat monitoring table;
Module 3: reallocating the storage location of data according to the heat of the data, updating the address translation table, and redirecting the access request;
Module 4: performing data migration and caching according to the address translation table, and completing the access request.
As shown in FIG. 1, the HBM-oriented hybrid memory system provided by the invention includes:
A hybrid memory controller, comprising a heat statistics module, an address translation module and a migration control module. The heat statistics module obtains the accessed address from the access request, calculates the heat of the different data blocks, and stores the heat information in the heat monitoring table; the address translation module derives the latest caching and allocation state of the memory from the heat monitoring table and redirects access requests as required; the migration control module obtains the data blocks to be migrated from the address translation module, sends access requests to the hybrid memory, and performs the data migration.
A hybrid memory, comprising the HBM and the off-chip memory. As shown in FIG. 2, the HBM is divided into a c-HBM region and an m-HBM region of fixed size during the initialization stage of the memory system.
The m-HBM region and the off-chip memory together form the memory space, whose addresses are visible to the operating system; relatively hot data is migrated to the m-HBM region at page granularity.
The c-HBM region is used to cache the hottest data, which is cached in the c-HBM at block granularity. The c-HBM uses set-associative mapping to cache data of the m-HBM and the off-chip memory. Specifically, as shown in FIG. 3, assuming that the size of the c-HBM is 64 MiB, the size of the off-chip memory is 16 GiB, and 16-way set association is adopted, the data of the c-HBM and the off-chip memory are divided into 16 sets, each set contains 256 pieces of off-chip memory data, and the hottest data in each set is cached in the corresponding c-HBM.
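The figure of 256 in this example matches the capacity ratio 16 GiB / 64 MiB. The Python sketch below works through the arithmetic under one conventional reading of set-associative mapping; the 64-byte block size is taken from the granularity definition, while the modulo set-index function and the resulting set count are assumptions, since FIG. 3 is not reproduced in the text.

```python
MIB = 1 << 20
GIB = 1 << 30
BLOCK_SIZE = 64  # assumed block granularity (one cache line)

c_hbm_bytes = 64 * MIB
off_chip_bytes = 16 * GIB
ways = 16

# Every c-HBM block slot competes for this many off-chip blocks.
capacity_ratio = off_chip_bytes // c_hbm_bytes

# Conventional set-associative layout (an assumption, see lead-in).
c_hbm_blocks = c_hbm_bytes // BLOCK_SIZE
num_sets = c_hbm_blocks // ways
off_chip_blocks = off_chip_bytes // BLOCK_SIZE
blocks_per_set = off_chip_blocks // num_sets


def set_index(block_number: int) -> int:
    """Assumed mapping: consecutive blocks go to consecutive sets."""
    return block_number % num_sets


print("capacity ratio:", capacity_ratio)
print("off-chip blocks per set:", blocks_per_set, "for", ways, "ways")
```

Under this reading each set holds 16 cached blocks chosen from 4096 off-chip candidates, that is, 256 candidates per way, which agrees with the 256 stated in the example.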
In one embodiment of the method of the present invention, when the hybrid memory controller receives an access request issued by the upper-layer computer system,
the accessed address is obtained from the metadata of the access request and is input into the heat statistics module.
In the heat statistics module:
The heat statistics module divides the data into 3 types according to heat, namely hot, warm and cold, in decreasing order of access frequency. A heat monitoring table is created at initialization, and a counter is assigned to each data block and each page for calculating heat.
Specifically, the heat statistics module calculates the heat of the data with the MEA (Majority Element Algorithm). Let K be the number of data blocks marked hot and N the total number of data blocks. The K hot data blocks are held in a set T, which is initialized to the empty set. On each access request, the accessed data block is handled as follows: if it is in T, its counter is incremented by 1; if it is not in T and T holds fewer than K blocks, it is added to T with its counter set to 1; if it is not in T and T already holds K blocks, the counters of all data blocks in T are decremented by 1. After all N data blocks have been processed, the heat statistics module has the K hottest data blocks, marks them as hot, and stores this information in the heat monitoring table. The same method is then used to obtain the data blocks marked warm, which are also recorded in the heat monitoring table.
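The counter update described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the names `mea_update` and `hottest_blocks` are hypothetical, and the removal of entries whose counter reaches zero is an assumption borrowed from the usual formulation of the algorithm, since the text only says the counters are decremented.

```python
def mea_update(counters, block, k):
    """Apply one access to the tracked set T (a dict: block id -> counter)."""
    if block in counters:
        counters[block] += 1
    elif len(counters) < k:
        counters[block] = 1
    else:
        # T is full and the block is untracked: decrement every tracked counter.
        for tracked in list(counters):
            counters[tracked] -= 1
            # Assumption (not stated in the text): a block whose counter
            # reaches zero leaves T and frees its slot.
            if counters[tracked] == 0:
                del counters[tracked]
    return counters


def hottest_blocks(accesses, k):
    """Replay an access trace and return the blocks left in T."""
    counters = {}
    for block in accesses:
        mea_update(counters, block, k)
    return set(counters)


trace = ["a", "b", "a", "c", "a", "b", "a", "d", "a"]
hot = hottest_blocks(trace, k=2)
print(hot)  # {'a'}
```

Because only K counters are live at any time, the tracking state stays small even when N is large, which is the usual motivation for this family of algorithms.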
In the address translation module:
The address translation module reallocates the data according to the received heat information.
Specifically, as shown in fig. 4, when a data block in the off-chip memory is marked as hot, it is cached in the c-HBM; when a data block in the c-HBM is marked as cold, it is evicted to the off-chip memory; when a data block in the c-HBM is marked as warm, it is evicted to the m-HBM; when a data block in the m-HBM is marked as cold, it is evicted to the off-chip memory. A data page in the m-HBM that is marked as hot, and a data page in the off-chip memory that is marked as warm, remain where they are.
The data cached in the c-HBM is at block granularity, while the data in the m-HBM is at page granularity. When a data block in the c-HBM is evicted to the m-HBM and the page containing that data block resides in the off-chip memory, the whole data page needs to be migrated to the m-HBM.
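The placement rules above amount to a small transition table keyed by current location and new heat level. The sketch below is illustrative only; the enum and function names are hypothetical, and "warm" denotes the intermediate heat level.

```python
from enum import Enum


class Heat(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


class Loc(Enum):
    C_HBM = "c-HBM"
    M_HBM = "m-HBM"
    OFF_CHIP = "off-chip"


# (current location, new heat) -> destination. Pairs not listed stay in place,
# e.g. a hot page in the m-HBM or a warm page in off-chip memory.
TRANSITIONS = {
    (Loc.OFF_CHIP, Heat.HOT): Loc.C_HBM,   # cache the block
    (Loc.C_HBM, Heat.COLD): Loc.OFF_CHIP,  # evict to off-chip memory
    (Loc.C_HBM, Heat.WARM): Loc.M_HBM,     # evict to the m-HBM
    (Loc.M_HBM, Heat.COLD): Loc.OFF_CHIP,  # evict to off-chip memory
}


def next_location(current, heat):
    """Return where the data should live after its heat level changes."""
    return TRANSITIONS.get((current, heat), current)


print(next_location(Loc.C_HBM, Heat.WARM).value)  # m-HBM
```

Keeping the rules in a table rather than nested conditionals makes the unchanged cases explicit: anything not listed simply keeps its current location.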
After the data allocation is complete, the access request is redirected.
Specifically, as shown in fig. 5, an address translation table is maintained in the address translation module. For each data page, the table stores a remapped page number identifying the actual storage location of the data: for example, if page 17 in the off-chip memory is migrated to page 3 in the m-HBM, page 17 is marked with 3; if no migration has occurred, the page is marked with -1. Each data page also carries M bits that record whether each data block in the page is cached in the c-HBM.
When an access request arrives at the address translation module after the address translation table has been updated, the module first checks whether the data block containing the accessed address is cached in the c-HBM. If so, the target address of the access request is replaced with the corresponding address in the c-HBM; if not, the location of the data page is looked up, and the target address is replaced with the corresponding address at the page's actual location.
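The lookup order described above (c-HBM first, then the remapped page, then the original location) can be sketched as follows. The 4 KiB page size, the 64-byte block size, the `PageEntry` layout, and the `c_hbm_index` dictionary that maps a cached block to its c-HBM address are all assumptions made for illustration; the text does not specify them.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096   # assumption: 4 KiB pages
BLOCK_SIZE = 64    # assumption: 64-byte blocks, so M = 64 bits per page


@dataclass
class PageEntry:
    remapped_page: int = -1  # m-HBM page number, or -1 if the page was not migrated
    cached_mask: int = 0     # bit i set -> block i of the page is cached in the c-HBM


def redirect(table, c_hbm_index, address):
    """Return (region, address) telling where the access should actually go."""
    page, offset = divmod(address, PAGE_SIZE)
    block = offset // BLOCK_SIZE
    entry = table.get(page, PageEntry())
    # 1. The block-granularity cache takes priority.
    if (entry.cached_mask >> block) & 1:
        return ("c-HBM", c_hbm_index[(page, block)] + offset % BLOCK_SIZE)
    # 2. Otherwise follow the page remapping, if any.
    if entry.remapped_page >= 0:
        return ("m-HBM", entry.remapped_page * PAGE_SIZE + offset)
    # 3. The page still lives at its original off-chip location.
    return ("off-chip", address)


# Page 17 was migrated to m-HBM page 3; block 1 of page 5 is cached in the c-HBM.
table = {17: PageEntry(remapped_page=3), 5: PageEntry(cached_mask=0b10)}
c_hbm_index = {(5, 1): 1024}  # hypothetical c-HBM byte address of the cached block
print(redirect(table, c_hbm_index, 17 * PAGE_SIZE + 100))  # ('m-HBM', 12388)
```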
In the migration control module:
The output of the address translation module and the access request are input to the migration control module.
If a dirty data block is evicted and the write-back address is in the off-chip memory, the data block is written directly back to the off-chip memory. If a dirty data block is evicted and the write-back address is in the m-HBM, the corresponding data pages in the m-HBM and the off-chip memory are read into the hybrid memory controller, the write-back of the data block is completed inside the controller, and the 2 data pages are then written to the off-chip memory and the m-HBM respectively to complete the migration.
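The two write-back cases can be sketched as below. This is one possible reading, under stated assumptions: memories are modelled as dictionaries from page number to `bytearray`, the block size is 64 bytes, and the second case is interpreted as a page swap in which the page owning the dirty block moves into the m-HBM while the page it displaces moves to the off-chip memory. The function name `write_back_dirty` is hypothetical.

```python
BLOCK_SIZE = 64  # assumption


def write_back_dirty(block, block_idx, off_chip, off_page, m_hbm=None, m_page=None):
    """Write a dirty block evicted from the c-HBM back to its home page."""
    start = block_idx * BLOCK_SIZE
    if m_hbm is None:
        # Case 1: the write-back target is off-chip memory; patch the block in place.
        off_chip[off_page][start:start + BLOCK_SIZE] = block
        return
    # Case 2: the target is the m-HBM. Read both pages into the controller ...
    promoted = bytearray(off_chip[off_page])  # page that owns the dirty block
    demoted = bytearray(m_hbm[m_page])        # page currently held in the m-HBM slot
    # ... merge the dirty block inside the controller ...
    promoted[start:start + BLOCK_SIZE] = block
    # ... then write the two pages to their new homes (assumed to be a swap).
    m_hbm[m_page] = promoted
    off_chip[off_page] = demoted


# Tiny 256-byte pages keep the example readable.
off_chip = {17: bytearray(b"o" * 256)}
m_hbm = {3: bytearray(b"m" * 256)}
write_back_dirty(b"D" * 64, 1, off_chip, 17, m_hbm, 3)
```

Merging the block in the controller means each page crosses the memory interface once in each direction, instead of being written back and then migrated in separate passes.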
If the access request does not trigger caching, it is sent directly to the hybrid memory. If the access request triggers caching, it is added to a waiting queue, the data block containing its target address is read into the migration control module, the read or write operation is completed directly in the memory controller, and the data block is then written into the c-HBM.
Because the data required by the requests in the waiting queue has already been read into the hybrid memory controller, those requests can be completed inside the controller without being sent to the hybrid memory, which reduces access latency.
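The request path through the migration control module can be sketched as follows. This is a simplified software model under several assumptions: a single `bytearray` stands in for the m-HBM plus off-chip memory, accesses are one byte wide, the block size is 64 bytes, and the class and method names are hypothetical. In hardware the waiting queue would hold requests while the block read is in flight; here the read completes immediately.

```python
from collections import deque

BLOCK_SIZE = 64  # assumption


class MigrationControl:
    def __init__(self, backing, c_hbm):
        self.backing = backing  # bytearray standing in for m-HBM + off-chip memory
        self.c_hbm = c_hbm      # dict: block number -> bytearray (cached copy)
        self.waiting = deque()

    def submit(self, request, triggers_cache):
        """request is (op, address, value) with op in {'read', 'write'}."""
        op, address, value = request
        if not triggers_cache:
            # No caching involved: forward the request to the hybrid memory.
            return self._access(self.backing, address, op, value)
        # Caching triggered: park the request and fetch its block into the controller.
        self.waiting.append(request)
        block_no = address // BLOCK_SIZE
        start = block_no * BLOCK_SIZE
        buffer = bytearray(self.backing[start:start + BLOCK_SIZE])
        # Serve the queued request from the controller's copy of the block ...
        op, address, value = self.waiting.popleft()
        result = self._access(buffer, address - start, op, value)
        # ... and only then install the block in the c-HBM.
        self.c_hbm[block_no] = buffer
        return result

    @staticmethod
    def _access(memory, offset, op, value):
        if op == "write":
            memory[offset] = value
            return None
        return memory[offset]


backing = bytearray(256)
backing[70] = 5
mc = MigrationControl(backing, {})
hit = mc.submit(("read", 70, None), triggers_cache=True)
mc.submit(("write", 10, 7), triggers_cache=False)
```

A write served this way lands only in the cached copy, which is why the evicted block may be dirty and need the write-back step later.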
By dividing the HBM into the c-HBM and the m-HBM, the hybrid memory system makes full use of the high bandwidth of the HBM while enlarging the memory space. By dividing the data into 3 heat types, the heat of the data is identified at a finer grain, which speeds up access to hot data and reduces the cost of data migration. Through the optimization of migration and metadata storage, the total access latency is reduced and the overall performance of the hybrid memory system is effectively improved.
The present embodiment will be understood by those skilled in the art as more specific descriptions of embodiment 1 and embodiment 2.
Those skilled in the art will appreciate that, in addition to implementing the system provided by the invention and its devices, modules, and units as pure computer-readable program code, the method steps can be logically programmed so that the system and its devices, modules, and units achieve the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Therefore, the system and its devices, modules, and units provided by the invention can be regarded as a hardware component, and the devices, modules, and units it contains for realizing various functions can be regarded as structures within that hardware component; the devices, modules, and units for realizing various functions can also be regarded both as software modules implementing the method and as structures within the hardware component.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to the particular embodiments described above, and that various changes or modifications may be made by those skilled in the art within the scope of the appended claims without affecting the spirit of the invention. The embodiments of the present application and features in the embodiments may be combined with each other arbitrarily without conflict.

Claims (10)

1. An implementation method of an HBM-oriented hybrid memory system, characterized by comprising the following steps:
step S1: initializing and establishing a hybrid memory controller, establishing a heat monitoring table and an address conversion table, and dividing regions on an HBM of the hybrid memory, wherein the regions comprise a c-HBM region and an m-HBM region;
step S2: the hybrid memory system receives the access request and updates the heat monitoring table;
step S3: the stored position of the data is redistributed according to the heat of the data, an address conversion table is updated, and the redirection of the access request is completed;
step S4: and performing data migration and caching according to the address conversion table, and completing the access request.
2. The method for implementing the HBM-oriented hybrid memory system according to claim 1, wherein said step S1 includes the steps of:
step S1.1: obtaining the capacity of the HBM and the capacity of the off-chip memory;
step S1.2: setting the set associativity (the number of ways); setting the sizes of the c-HBM and the m-HBM, and completing the division;
step S1.3: creating a heat statistics module, an address conversion module and a migration control module in the hybrid memory controller;
step S1.4: creating a heat monitoring table and an address conversion table;
step S1.5: setting the sum of the capacity of the m-HBM and the capacity of the off-chip memory as the size of the hybrid memory, and reporting the memory address space to the operating system.
3. The method for implementing the HBM-oriented hybrid memory system according to claim 1, wherein said step S2 comprises the steps of:
step S2.1: obtaining the accessed memory address from the access request;
step S2.2: dividing data into three types, cold, warm and hot, according to access frequency, updating the heat information of all data at block granularity and page granularity according to the algorithm, and writing the updated information into the heat monitoring table;
step S2.3: collecting the data whose heat information has changed, and inputting it into the address conversion module.
4. The method for implementing the HBM-oriented hybrid memory system according to claim 1, wherein said step S3 comprises the steps of:
step S3.1: receiving the heat change information, determining according to the heat change of each data block whether to cache it to the c-HBM, evict it from the c-HBM, or migrate data between the m-HBM and the off-chip memory, and inputting the resulting migration and caching requests into the migration control module;
step S3.2: updating the address conversion table, and marking the actual storage location of every memory page and whether each data block of the memory page is cached to the c-HBM;
step S3.3: the access request is redirected.
5. The method for implementing the HBM-oriented hybrid memory system according to claim 4, wherein in step S3.1:
when the heat of data in the off-chip memory becomes hot, the data is cached to the c-HBM;
when the heat of data in the c-HBM becomes cold, the data is evicted and written back to the off-chip memory, and data migration occurs;
when the heat of data in the c-HBM becomes warm, the data is evicted and written back to the m-HBM, and data migration occurs;
when the heat of data in the m-HBM becomes cold, the data is exchanged with the corresponding page in the off-chip memory, and data migration occurs.
6. The method for implementing the HBM-oriented hybrid memory system according to claim 1, wherein said step S4 comprises the steps of:
step S4.1: if a dirty data block is evicted from the c-HBM, completing the corresponding write-back and migration operations;
step S4.2: if caching is triggered, completing the caching of the data block containing the accessed address, and completing the access request.
7. The method for implementing the HBM-oriented hybrid memory system according to claim 6, wherein in step S4.1:
if the dirty data block is evicted and the write-back address is the off-chip memory, directly writing the data block back to the off-chip memory;
if dirty data are evicted and the write-back address is m-HBM, corresponding data pages in the m-HBM and the off-chip memory are read into the hybrid memory controller, the write-back operation of the data block is completed in the controller, and then 2 data pages are respectively written into the off-chip memory and the m-HBM to complete migration operation.
8. The method for implementing the HBM-oriented hybrid memory system according to claim 6, wherein in step S4.2:
if the access request does not trigger the cache, directly sending the access request to the hybrid memory;
if the access request triggers the cache, the access request is added into a waiting queue, a data block where a target address of the access request is located is read to a migration control module, the access request directly completes read-write operation in a memory controller, and then the data block is written into a c-HBM.
9. An implementation system of an HBM-oriented hybrid memory system, characterized by comprising the following modules:
module M1: initializing and establishing a hybrid memory controller, establishing a heat monitoring table and an address conversion table, and dividing regions on an HBM of the hybrid memory, wherein the regions comprise a c-HBM region and an m-HBM region;
module M2: the hybrid memory system receives the access request and updates the heat monitoring table;
module M3: the stored position of the data is redistributed according to the heat of the data, an address conversion table is updated, and the redirection of the access request is completed;
module M4: and performing data migration and caching according to the address conversion table, and completing the access request.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method for implementing the HBM-oriented hybrid memory system according to any one of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310431062.7A CN116450045A (en) 2023-04-20 2023-04-20 Implementation method, system and medium of hybrid memory system for HBM

Publications (1)

Publication Number Publication Date
CN116450045A true CN116450045A (en) 2023-07-18

Family

ID=87123485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310431062.7A Pending CN116450045A (en) 2023-04-20 2023-04-20 Implementation method, system and medium of hybrid memory system for HBM

Country Status (1)

Country Link
CN (1) CN116450045A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination