CN117632043A - CXL memory module, control chip, data processing method, medium and system - Google Patents


Info

Publication number
CN117632043A
Authority
CN
China
Prior art keywords
data
cache
dram
address
control chip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410103990.5A
Other languages
Chinese (zh)
Other versions
CN117632043B (en)
Inventor
戴瑾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Superstring Academy of Memory Technology
Original Assignee
Beijing Superstring Academy of Memory Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Superstring Academy of Memory Technology
Priority to CN202410103990.5A
Publication of CN117632043A
Application granted
Publication of CN117632043B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0625 Power saving in storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/325 Power saving in peripheral device
    • G06F1/3275 Power saving in memory, e.g. RAM, cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2213/00 Indexing scheme relating to interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F2213/0026 PCI express
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The CXL memory module comprises a control chip and at least one group of DRAM chips managed by the control chip. The control chip is provided with a cache and is configured to read data from the DRAM chips, load the data into the cache, and record the mapping relationship between the DRAM address and the cache address of the data. When a data access instruction sent by a host is received, the data in the cache is accessed preferentially. When a plurality of CPU cores of the host access one address at the same time, the data only needs to be read from the DRAM chip into the cache, after which the data in the cache is accessed. Repeated reading and writing inside the DRAM chip can thus be avoided, and the power consumption of the CXL memory module is reduced.

Description

CXL memory module, control chip, data processing method, medium and system
Technical Field
The embodiment of the disclosure relates to the technical field of data access, in particular to a CXL memory module, a control chip, a data processing method, a medium and a system.
Background
CXL (Compute Express Link) is a new memory interface protocol based on the PCIe (Peripheral Component Interconnect Express) physical layer. Using this protocol, a CXL memory module can extend the memory of a computer. A CXL memory module typically includes a CXL control chip and the DRAM (Dynamic Random Access Memory) chips it manages.
In practice, the host usually reads and writes data at consecutive addresses in the DRAM chip, and the current DRAM read-write protocol avoids unnecessary repeated reads and writes inside the DRAM chip in that case, which saves power. However, host CPUs are moving toward multi-core designs, and when multiple cores simultaneously read and write one bank address in the DRAM chip, this read-write mechanism breaks down. A large number of otherwise unnecessary repeated read and write operations are then generated inside the DRAM chip, and the power consumption of the CXL memory module increases.
Disclosure of Invention
The embodiment of the disclosure provides a CXL memory module, a control chip, a data processing method, a medium and a system.
In a first aspect, an embodiment of the present disclosure provides a CXL memory module, including a control chip and at least one group of DRAM chips managed by the control chip, the control chip being provided with a cache, the control chip being configured to: reading data from the DRAM chip and loading the data into a cache, and recording the mapping relation between the DRAM address and the cache address of the data; when a data access instruction sent by the host is received, the data in the cache is preferentially accessed.
In a second aspect, an embodiment of the present disclosure provides a control chip applied to the CXL memory module in the foregoing embodiment, the CXL memory module including at least one group of DRAM chips, the control chip including a processor and a cache, wherein the processor is configured to: reading data from the DRAM chip, loading the data into a cache, and recording the mapping relation between the DRAM address and the cache address of the data; when a data access instruction sent by the host is received, the data in the cache is preferentially accessed.
In a third aspect, an embodiment of the present disclosure provides a data processing method of a CXL memory module, the CXL memory module including a control chip and at least one group of DRAM chips managed by the control chip, and a cache provided in the control chip, the method including: reading data from the DRAM chip, loading the data into a cache, and recording the mapping relation between the DRAM address and the cache address of the data; when a data access instruction sent by the host is received, the data in the cache is preferentially accessed.
In a fourth aspect, an embodiment of the present disclosure provides a non-transitory computer storage medium storing a computer program which, when executed by a processor, implements the data processing method of the CXL memory module in the foregoing embodiment.
In a fifth aspect, an embodiment of the present disclosure provides a computer system, including a host and a CXL memory module in the foregoing embodiment, where the host communicates with the CXL memory module through a CXL interface, so as to implement discovery, configuration, and data transmission of the CXL memory module.
The control chip in the CXL memory module of the embodiment of the disclosure is provided with a cache, can read data from the DRAM chip and load the data into the cache, and records the mapping relationship between the DRAM address and the cache address of the data. When the host needs to access data in the DRAM chip, the control chip can access the data in the cache preferentially. When a plurality of CPU cores of the host access one address at the same time, the data only needs to be read from the DRAM chip into the cache, after which the data in the cache is accessed. Repeated reading and writing inside the DRAM chip can thus be avoided, the power consumption of the CXL memory module is reduced, and the data read-write speed of the CXL memory module is improved.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the disclosure. Other advantages of the present disclosure may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The accompanying drawings are included to provide an understanding of the technical aspects of the present disclosure and are incorporated in and constitute a part of this specification. Together with the embodiments of the disclosure, they serve to explain the technical aspects of the present disclosure and do not limit them.
FIG. 1 is a schematic diagram of one embodiment of a CXL memory module of the present disclosure;
FIG. 2 is a schematic diagram of a shift register in an embodiment of a CXL memory module according to the present disclosure;
FIG. 3 is a schematic diagram of a shift register shift process in one embodiment of a CXL memory module according to the disclosure;
FIG. 4 is a schematic diagram of a shift register shift process in one embodiment of a CXL memory module according to the disclosure;
FIG. 5 is a schematic diagram of a shift register shift process in one embodiment of a CXL memory module according to the present disclosure;
FIG. 6 is a schematic diagram of a shift register shift process in one embodiment of a CXL memory module according to the disclosure;
FIG. 7 is a schematic diagram of the structure of one embodiment of a control chip of the present disclosure;
FIG. 8 is a flow chart of an embodiment of a method for processing data of a CXL memory module of the present disclosure;
FIG. 9 is a schematic diagram of one embodiment of a computer system of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present disclosure more apparent, embodiments of the present disclosure will be described in detail hereinafter with reference to the accompanying drawings. It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be arbitrarily combined with each other.
Embodiments of the present disclosure are not necessarily limited to the dimensions shown in the drawings; the shapes and sizes of the various components in the drawings do not reflect true proportions. Furthermore, the drawings schematically show ideal examples, and the embodiments of the present disclosure are not limited to the shapes or the numerical values shown in the drawings.
The ordinal numbers such as "first," "second," etc., in this disclosure are provided to avoid intermixing of components and do not indicate any order, number, or importance.
In this disclosure, the terms "mounted," "coupled," and "connected" are to be construed broadly, unless otherwise specifically indicated and defined. For example, a connection may be a fixed connection, a removable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through intermediate members, or internal communication between two elements. The specific meaning of these terms in this disclosure will be understood by those of ordinary skill in the art as the case may be.
The CXL memory module includes a control chip and a DRAM chip; the control chip may also be referred to as a CXL controller. In the related art, when a host needs to read and write data in a DRAM chip of a CXL memory module, the host CPU (Central Processing Unit) sends a DRAM page address (i.e., a row address) to the control chip of the CXL memory module, instructing the DRAM chip to open its internal SA (Sense Amplifier) and hold the data of the entire DRAM page in the SA. The host CPU then sends a column address to the control chip to read and write the data in the DRAM page. After the read and write operations are completed, the data of the DRAM page is written back to the memory cells. When a plurality of CPU cores read and write one piece of data in the DRAM chip at the same time, each core has to perform its own read-write operation on the data in the DRAM chip, which causes repeated read-write operations inside the DRAM chip.
In view of the foregoing, an embodiment of the present disclosure provides a CXL memory module, as shown in fig. 1, the CXL memory module includes a control chip 110 and at least one group of DRAM chips managed by the control chip 110, wherein the control chip 110 is provided with a cache 120, the control chip 110 is configured to read data from the DRAM chips and load the data into the cache 120, and record a mapping relationship between a DRAM address and a cache address of the data; when receiving the data access instruction sent by the host, the data in the cache 120 is preferentially accessed.
Generally, the control chip 110 is provided with a CXL interface based on the CXL protocol, through which the host CPU interacts with the control chip 110. For example, based on the CXL.io protocol, the host CPU may send a data access instruction to the control chip 110; the data access instruction may be a read data instruction for reading data in the DRAM chip or a write data instruction for writing data in the DRAM chip (i.e., writing data into the DRAM chip). Based on the CXL.mem protocol, the host CPU can read data from the cache 120 and the DRAM chip, and write external data to the cache 120 or the DRAM chip. A DDR (Double Data Rate) controller is also provided on the control chip 110 so that the CPU of the control chip 110 can manage the DRAM chips, which are generally divided into a plurality of DRAM pages. The control chip 110 may also include other hardware, such as registers for storing the mapping relationship.
The mapping relationship between the cache address of the data and the DRAM address may be recorded and stored in the control chip 110, where the cache address represents the storage address of the data in the cache 120, and the DRAM address represents the storage address of the data in the DRAM chip, and the mapping relationship may be stored in a specific register, or may be stored using a partial cache area. When the control chip 110 receives an access instruction carrying a DRAM address sent by the host CPU, it is determined whether the data to be accessed pointed to by the DRAM address has been loaded into the cache by comparing the DRAM address in the access instruction with the DRAM address in the mapping relationship. Here, loading means writing data into the cache, and may be, for example, reading data in the DRAM chip into the cache, or writing external data into the cache (i.e., writing data in the cache) by the host. As an example, the mapping relationship may take the form of a page table, with each entry in the page table recording a cache address and its corresponding DRAM address. By looking up the DRAM address in the page table, it is possible to confirm whether the data is stored in the cache. As an example, the DRAM address may be an address segment corresponding to a certain area in a DRAM page, or the DRAM address may also be an address of one DRAM page in a DRAM chip.
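The page-table style lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the `MappingTable` class, its method names, the 4 KB page size, and the use of a dictionary in place of hardware registers are all hypothetical and are not taken from the disclosure.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical choice)


class MappingTable:
    """Hypothetical page table: DRAM page base address -> cache page base address."""

    def __init__(self):
        self._entries = {}

    def record(self, dram_addr, cache_page_addr):
        # Store the mapping under the base address of the DRAM page.
        self._entries[dram_addr - dram_addr % PAGE_SIZE] = cache_page_addr

    def lookup(self, dram_addr):
        """Return the cache address for dram_addr, or None if the page is not loaded."""
        offset = dram_addr % PAGE_SIZE
        base = self._entries.get(dram_addr - offset)
        return None if base is None else base + offset


table = MappingTable()
table.record(0x12000, 0x3000)   # DRAM page at 0x12000 was loaded into cache page 0x3000
hit = table.lookup(0x12040)     # address inside the loaded page
miss = table.lookup(0x55000)    # address in a page that was never loaded
```

Here a lookup compares the page part of the incoming DRAM address with the recorded DRAM addresses, which mirrors the comparison step in the text; a real control chip would do this in registers or a reserved cache area rather than in a dictionary.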
In this embodiment, when the control chip 110 receives a data access instruction of the host, the data in the cache is preferentially accessed. As an example, the host CPU may send a read data instruction carrying a DRAM address (i.e., the memory address of the data in the DRAM chip) to the control chip 110, and the CPU of the control chip 110 reads all the data in the DRAM page to which the DRAM address points into the cache 120. Subsequently, when the host CPU accesses the data in the DRAM page again, the corresponding data can be directly obtained from the cache 120, without reading and writing the DRAM chip, thereby reducing the number of repeated reading and writing of the DRAM chip.
As another example, the host CPU may need to perform a write operation on data in the DRAM chip; the write operation may write new data to the DRAM chip, or may modify or update a portion of the data in the DRAM chip. If the data targeted by the write operation is already loaded into the cache 120, the write operation can be performed directly on the data in the cache 120. If the data is not loaded into the cache 120, the DRAM page where the data is located may first be read from the DRAM chip into the cache 120, and the data in the cache 120 may then be written. After the write operation is completed, the data may be written back from the cache to the DRAM chip.
In an alternative embodiment, the data after the writing operation is completed may be maintained in the cache 120 for a period of time, so that when the writing operation instruction for the data is received again in the period of time, the writing operation may be directly performed on the data in the cache 120, without going through the read-write processing of the DRAM chip, so as to reduce the number of repeated read-write operations of the DRAM chip.
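A minimal software model of this read and write path is sketched below. The `Dram` and `Controller` classes, the counters, and the 4 KB page size are hypothetical names and values chosen for illustration, and the sketch assumes that no access crosses a page boundary.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class Dram:
    """Toy DRAM model that counts page-level reads and writes."""

    def __init__(self):
        self.pages = {}
        self.page_reads = 0
        self.page_writes = 0

    def read_page(self, page_no):
        self.page_reads += 1
        return bytearray(self.pages.get(page_no, bytes(PAGE_SIZE)))

    def write_page(self, page_no, data):
        self.page_writes += 1
        self.pages[page_no] = bytes(data)


class Controller:
    """Hypothetical control chip: serves accesses from the cache when possible."""

    def __init__(self, dram):
        self.dram = dram
        self.cache = {}     # page number -> cached page contents
        self.dirty = set()  # pages written since they were loaded

    def _load(self, page_no):
        if page_no not in self.cache:  # only a miss touches the DRAM
            self.cache[page_no] = self.dram.read_page(page_no)
        return self.cache[page_no]

    def read(self, addr, length):
        page_no, off = divmod(addr, PAGE_SIZE)
        return bytes(self._load(page_no)[off:off + length])

    def write(self, addr, data):
        page_no, off = divmod(addr, PAGE_SIZE)
        page = self._load(page_no)
        page[off:off + len(data)] = data
        self.dirty.add(page_no)

    def write_back(self, page_no):
        if page_no in self.dirty:
            self.dram.write_page(page_no, self.cache[page_no])
            self.dirty.discard(page_no)


dram = Dram()
ctrl = Controller(dram)
ctrl.write(0x1000, b"abc")   # miss: page 1 is loaded, then written in the cache
ctrl.write(0x1003, b"def")   # hit: no DRAM access
data = ctrl.read(0x1000, 6)  # hit: no DRAM access
ctrl.write_back(1)           # one page write carries both updates to the DRAM
```

In this model the three host accesses cost one DRAM page read and one DRAM page write, which is the reduction in repeated DRAM accesses that the embodiment aims for.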
In this embodiment, the control chip is provided with a cache, and may load data from the DRAM chip into the cache and record the mapping relationship between the DRAM address of the data and the cache address, so that the data in the cache is accessed preferentially when the host accesses data in the DRAM chip. When a plurality of CPU cores of the host access one address at the same time, the data only needs to be loaded from the DRAM chip into the cache, after which the data in the cache is accessed. Repeated reading and writing inside the DRAM chip can thus be avoided, the power consumption of the CXL memory module is reduced, and the data read-write speed of the CXL memory module is improved.
In some exemplary embodiments, the DRAM chip includes a plurality of DRAM pages; the cache 120 includes a plurality of cache pages, and the control chip is configured to read all data in one DRAM page from the DRAM chip and load into one cache page of the cache.
As shown in FIG. 1, the cache 120 may be divided into a plurality of cache pages. The storage capacity of a cache page may be determined according to the CXL protocol; for example, 4KB may be used so that each cache page can store all the data in one DRAM page.
In some exemplary embodiments, the control chip is further configured to: when the preset condition is met, the data in the cache is cleared and/or written back to the DRAM chip.
As an example, the preset condition may be that a write-back instruction of the host CPU or an instruction for cleaning up the cache is received, the cache space is insufficient or a preset cleaning period is reached, and the like, and different preset conditions may correspond to different processing manners. For example, when a write-back instruction carrying a DRAM address sent by a host CPU is received, the control chip may write back all data in a corresponding DRAM page in the cache to the DRAM chip; when the cache space is insufficient, the control chip can empty part or all of the data which is not subjected to the write operation in the cache, and write the data subjected to the write operation back to the DRAM chip; the control chip can also periodically clean or write back the data in the cache according to a preset period.
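The handling of the "cache space is insufficient" condition can be sketched as below. The function name `reclaim`, the dictionary and set data structures, and the choice to keep flushed pages in the cache are assumptions made for this illustration; the disclosure does not prescribe them.

```python
def reclaim(cache, dirty, write_page):
    """Free cache space (hypothetical policy).

    Pages that were never written are simply discarded; pages that were
    written are flushed to the DRAM through write_page and kept as clean.
    Returns the sorted list of page numbers that were written back.
    """
    for page_no in list(cache):
        if page_no in dirty:
            write_page(page_no, cache[page_no])  # modified page: write back
        else:
            del cache[page_no]                   # clean page: drop it
    flushed = sorted(dirty)
    dirty.clear()
    return flushed


cache = {1: b"a", 2: b"b", 3: b"c"}  # page number -> cached contents
dirty = {2}                          # only page 2 has been written
dram = {}                            # stand-in for the DRAM chip
flushed = reclaim(cache, dirty, dram.__setitem__)
```

The same function could be invoked from a periodic timer or from a host cleanup instruction; only the trigger differs between the preset conditions listed above.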
In some exemplary embodiments, when the control chip receives a data access instruction sent by the host, the control chip preferentially accesses the data in the cache, including: receiving a data access instruction carrying a first DRAM address, wherein the data access instruction is a data reading instruction or a data writing instruction; if the first cache address corresponding to the first DRAM address is found in the mapping relation, reading data from the cache or writing the data into the cache based on the first cache address.
In this embodiment, the host CPU may send a data access instruction carrying a first DRAM address to the control chip through the CXL interface. The first DRAM address represents the storage address of the data to be accessed in the DRAM chip; for example, it may be the address of a certain area in the DRAM chip, such as a DRAM page address or the address of a small region within a page. The data access instruction here may be a read data instruction for reading data from the DRAM chip or a write data instruction for writing data in the DRAM chip. The control chip may compare each DRAM address in the mapping relationship with the first DRAM address to determine whether the data to be accessed has been read into the cache. If the data to be accessed exists in the cache, it is accessed directly in the cache according to the first cache address corresponding to the first DRAM address. On the one hand, this improves the response speed of the CXL memory module to host instructions; on the other hand, it reduces the number of reads and writes of the DRAM chip.
It should be noted that, when the host accesses the data in the DRAM chip, the data in the entire DRAM page (corresponding to one row in the memory array) is usually read at the same time inside the DRAM chip, so the data to be accessed may be part or all of the data in the DRAM page.
In other embodiments, if the first DRAM address is not found in the mapping relationship, it means that the data pointed by the first DRAM address is not read into the cache, at this time, the data to be accessed in the DRAM chip may be directly accessed according to the first DRAM address, or the data is loaded from the DRAM chip into the cache according to the first DRAM address, and then the data in the cache is accessed.
When the data to be accessed does not exist in the cache, the control chip can directly access the data in the DRAM chip according to the first DRAM address in the data access instruction. For occasional random read-write events from the host CPU, this approach avoids unnecessary processing steps. Alternatively, the control chip can read the data to be accessed into the cache and process it there. With this approach, the data can later be processed repeatedly in the cache, which reduces the number of repeated reads and writes of the DRAM chip.
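The hit path and the two miss-handling options can be put side by side in a short sketch. The function `access`, the `load_on_miss` flag, and the list-based cache are hypothetical and serve only to illustrate the control flow.

```python
def access(dram_page, mapping, cache, dram, load_on_miss):
    """Return (data, source) for one access to a DRAM page.

    mapping: DRAM page number -> index of the cache page (hypothetical layout)
    cache:   list of cached pages
    dram:    dict standing in for the DRAM chip
    """
    if dram_page in mapping:                 # hit: serve from the cache
        return cache[mapping[dram_page]], "cache"
    if not load_on_miss:                     # miss, option 1: access the DRAM directly
        return dram[dram_page], "dram"
    slot = len(cache)                        # miss, option 2: load, then access the cache
    cache.append(dram[dram_page])
    mapping[dram_page] = slot
    return cache[slot], "loaded"


dram = {1: "page-1", 2: "page-2"}
mapping, cache = {}, []
r1 = access(1, mapping, cache, dram, load_on_miss=False)
r2 = access(1, mapping, cache, dram, load_on_miss=True)
r3 = access(1, mapping, cache, dram, load_on_miss=False)
```

Once the page has been loaded, later accesses are served from the cache regardless of the miss policy, as the third call shows.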
In other exemplary embodiments, the control chip reading data from the DRAM chip, loading the data into the cache, and recording the mapping relationship between the DRAM address and the cache address of the data includes: in response to a data access instruction, searching for the first DRAM address in the mapping relationship; if the first DRAM address does not exist in the mapping relationship, determining whether the number of accesses to the first DRAM address within a preset time period is greater than a preset count threshold; and if the number of accesses is greater than the count threshold, reading all data in the DRAM page pointed to by the first DRAM address from the DRAM chip and loading the data into a cache page of the cache.
As an example, when the control chip receives the data access instruction, it may determine whether the data to be accessed pointed to by the first DRAM address has been loaded into the cache by looking up the mapping relationship. If the first DRAM address does not exist in the mapping relationship, the data to be accessed has not been loaded into the cache, and the number of accesses to the first DRAM address within a preset time period can then be counted. If the number of accesses to the first DRAM address is greater than the preset count threshold, all data in the DRAM page pointed to by the first DRAM address can be read from the DRAM chip and loaded into a cache page, so that the data can subsequently be accessed directly in the cache.
In this embodiment, when the control chip does not find the first DRAM address in the mapping relationship, it may, before or after counting the number of accesses, directly access the data to be accessed in the DRAM chip, or it may access the data in the cache once the condition is met. Alternatively, if the data access instruction is a read data instruction, the data to be accessed can be returned to the host CPU through the transmission queue of the CXL interface while it is being loaded into the cache.
If the number of accesses is less than or equal to the count threshold, the data in the DRAM chip does not need to be loaded into the cache, and the data to be accessed is accessed directly in the DRAM chip according to the first DRAM address.
In this embodiment, searching for the first DRAM address in the mapping relationship determines whether the data to be accessed pointed to by the first DRAM address has been loaded into the cache. The number of accesses to the first DRAM address represents how frequently the data is processed. Reading frequently processed data into the cache before processing it can greatly reduce the number of repeated reads and writes of the DRAM chip, while processing infrequently used data directly in the DRAM chip simplifies the processing steps and improves efficiency. This helps identify random read-write events more accurately and makes the CXL memory module more flexible in processing data.
The preset time period and the count threshold can be set according to the application scenario, the performance parameters of the CXL memory module, or experience. As an example, the preset time period may be 1 day and the count threshold may be 1. The effect in this case is that the first access to a piece of data in the DRAM on a given day does not cause the control chip to read the data into the cache, which avoids the extra processing steps and cache pressure caused by random read-write events.
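The threshold check can be sketched with a sliding time window. The `AccessCounter` class and its method names are hypothetical, and timestamps are passed in explicitly (in seconds) so that the example is deterministic; the 1-day window and threshold of 1 follow the example in the text.

```python
from collections import defaultdict, deque


class AccessCounter:
    """Hypothetical per-page access counter over a sliding time window."""

    def __init__(self, window_seconds, threshold):
        self.window = window_seconds
        self.threshold = threshold
        self._times = defaultdict(deque)  # DRAM page -> timestamps of recent accesses

    def record_and_check(self, dram_page, now):
        """Record one access and report whether the page should be loaded into the cache."""
        times = self._times[dram_page]
        times.append(now)
        while times and now - times[0] > self.window:  # forget accesses outside the window
            times.popleft()
        return len(times) > self.threshold


DAY = 86400
counter = AccessCounter(window_seconds=DAY, threshold=1)
first = counter.record_and_check(7, now=0)                # first access of the day
second = counter.record_and_check(7, now=3600)            # second access one hour later
later = counter.record_and_check(7, now=3600 + 2 * DAY)   # next access two days later
```

With these settings a single isolated access never triggers a load, while a second access within the same day does.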
In some optional implementations of this embodiment, in a case where the data access instruction is a read data instruction and the first DRAM address does not exist in the mapping relationship, the control chip may first read data from the DRAM chip based on the first DRAM address and return the data to the host, and then determine whether the number of accesses is greater than the count threshold.
In this embodiment, to reduce the latency when the host reads data from the CXL memory module, the data may be returned to the host directly from the DRAM chip in response to the host's read data instruction, and the number of accesses may be counted afterwards, which helps increase the response speed of the CXL memory module.
In some exemplary embodiments, the control chip further includes a shift register set configured to store a mapping relationship of the cache address and the DRAM address.
In this embodiment, a shift register set is used to store the mapping relationship between the cache address and the DRAM address, and each register may store one cache address and its corresponding DRAM address. The arrangement sequence of the data in the mapping relation can be changed through the shift operation of the shift register, so that the control chip can conveniently adjust the ordering of the data in the mapping relation according to the requirement. The priority of the data is represented by the ordering of the data in the mapping relation, so that the control chip can conveniently process the data in the cache according to the priority of the data.
As an example, when a piece of data in the cache (generally, the data in one cache page) has not been accessed (by a read operation or a write operation) for a long time, the control chip may, through the shift operations of the shift register set, move the cache address and the DRAM address corresponding to that data from a register nearer the front to a register nearer the back. Later, when the control chip cleans the data in the cache, it may read the cache addresses stored in the registers in order from back to front and clean the corresponding data in the cache in that order, so that data that has not been read or written for a long time is cleaned first.
In some optional implementations of this embodiment, each register in the shift register set is used to store the mapping information of one piece of data, the mapping information including a cache address, a DRAM address, and a write operation identifier, where the write operation identifier indicates whether a write operation has been performed on the data in the cache page.
In the present embodiment, one piece of data refers to the data in one cache page/DRAM page. Correspondingly, the cache address represents the address of the cache page in the cache, and the DRAM address represents the address of the data in the DRAM chip.
As shown in fig. 2, the shift register set may include a plurality of registers arranged in sequence. Each register may include a plurality of flip-flops, each storing one bit of binary information, where the first or last flip-flop serves as the write operation identification bit and the remaining flip-flops store the cache address and the DRAM address of the data. For example, a value of 0 in the first flip-flop of a register indicates that the write operation identifier is invalid, i.e., the data in the cache page has not been written; a value of 1 indicates that the write operation identifier is valid, i.e., the data in the cache page has been written.
In this embodiment, data in the cache is handled differently depending on whether it has been written. For example, once data in the cache has been written, the data in the corresponding cache page must later be written back to the DRAM chip, whereas data that has not been written does not need to be written back. The control chip determines whether data in the cache has been written by checking the write operation identifier in the shift register set, and thereby selects the processing mode that matches the data.
In some optional implementations of this embodiment, the control chip is further configured to: when the writing operation is executed on the data in the cache, the writing operation identification corresponding to the cache page where the data is located is updated.
For example, when a write operation is performed on data in the cache, the write operation identifier in the register holding the cache address of that data may be set to valid, indicating that the data in the cache page has been written. This keeps the write operation identifier synchronized with the processing of the data and prevents an improper processing mode from being applied to the data in the cache.
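The register layout and the update of the write operation identifier can be illustrated by packing one entry into an integer, with the top bit playing the role of the first flip-flop. This is a minimal sketch: the field widths `CACHE_ADDR_BITS` and `DRAM_ADDR_BITS` and the function names are assumptions chosen for the example and are not specified in this disclosure.

```python
CACHE_ADDR_BITS = 12   # assumed width of a cache page address
DRAM_ADDR_BITS = 28    # assumed width of a DRAM page address
FLAG_SHIFT = CACHE_ADDR_BITS + DRAM_ADDR_BITS  # position of the write flag bit


def pack_entry(written, cache_addr, dram_addr):
    """Pack [write flag | cache address | DRAM address] into one integer."""
    if not 0 <= cache_addr < (1 << CACHE_ADDR_BITS):
        raise ValueError("cache address out of range")
    if not 0 <= dram_addr < (1 << DRAM_ADDR_BITS):
        raise ValueError("DRAM address out of range")
    flag = 1 if written else 0
    return (flag << FLAG_SHIFT) | (cache_addr << DRAM_ADDR_BITS) | dram_addr


def unpack_entry(word):
    """Return (written, cache_addr, dram_addr) from a packed entry."""
    dram_addr = word & ((1 << DRAM_ADDR_BITS) - 1)
    cache_addr = (word >> DRAM_ADDR_BITS) & ((1 << CACHE_ADDR_BITS) - 1)
    written = bool((word >> FLAG_SHIFT) & 1)
    return written, cache_addr, dram_addr


def mark_written(word):
    """Set the write operation identifier to valid (1)."""
    return word | (1 << FLAG_SHIFT)


entry = pack_entry(False, cache_addr=0x00A, dram_addr=0x0012345)
dirty = mark_written(entry)
```

Here `entry` models a page that was loaded but not yet written, and `dirty` models the same register after a write to that cache page; only the flag bit differs between the two.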
In some optional implementations of this embodiment, the shift register set may store the mapping relationship between the cache address and the DRAM address of the data in the following manner: the mapping relation information of each piece of data, together with its write operation identifier, is stored into the registers of the shift register set in the order in which the data was read into the cache, where the arrangement order of the registers in the shift register set is opposite to the order in which the data was read into the cache.
With the storage mode of this embodiment, the order in which data was read into the cache is reflected by the order of the registers in the shift register set. Taking fig. 2 as an example, the mapping information may be represented by a write operation identifier and an address pair (i.e., a cache address and a DRAM address). Registers 210, 220, and 230 store address pair 2, address pair 1, and address pair 3, respectively, together with their write operation identifiers; accordingly, the corresponding data was read into the cache in the order address pair 3, address pair 1, address pair 2.
In some embodiments, the control chip recording the mapping relationship between the DRAM address and the cache address of the data comprises: performing a shift operation on the shift register set to vacate the first register of the shift register set; and writing the DRAM address and the cache address of the data into the first register, and setting the write operation identifier in the first register.
In this embodiment, the shift operation refers to shifting the entire mapping relation information stored in one register to another register, thereby changing the storage position of the mapping relation information of the data in the shift register group (i.e., the order of the registers storing the mapping relation information in the shift register group). The first register represents the first register in the shift register set. The ordering of the registers may characterize the priority of the data stored in the cache, and the priority of the data may be adjusted by a shift operation of the shift register set, so that the control chip processes the data in the cache according to the priority.
In a specific example, when new data is loaded into the cache (for example, data may be read from the DRAM chip, or external data may be written into the cache when the host CPU performs a write operation on the data in the cache), the control chip performs the following operations: determining mapping relation information corresponding to the new data; shifting the mapping relation information stored in each register in the shift register group backwards to vacate the first register; and storing the mapping relation information of the new data into a first register.
As exemplarily described with reference to fig. 2, 3 and 4, the mapping relation information stored in the shift register group is shown in fig. 2. When new data is loaded into the cache, the control chip may determine the DRAM page address where the new data is located and the cache page address where the new data is stored, generate a new address pair 4, and set the write operation identifier of the new address pair 4 to be invalid. Then, address pair 2, address pair 1 and address pair 3, and the respective write operation identifiers are sequentially shifted backward, and at this time, the mapping relationship information stored in each register in the shift register set is shown in fig. 3, registers 220, 230 and 240 store address pair 2, address pair 1 and address pair 3, and the respective write operation identifiers, respectively, while register 210 is in an idle state. After that, the new address pair 4 and the new write operation identifier thereof are stored in the register 210, and the shift register set after the shift operation is completed is shown in fig. 4, and the registers 210, 220, 230 and 240 store the new address pair 4, the address pair 2, the address pair 1 and the address pair 3, respectively.
In this example, the mapping relationship information corresponding to the data newly loaded into the cache may be stored in the first register through the shift operation, thereby giving higher priority to the new data.
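The transition from fig. 2 to fig. 4 can be modeled with a fixed-length list, where `None` marks an idle register. This is a sketch under stated assumptions: the function name `shift_in`, the tuple representation of an entry, and the write flag values of the existing address pairs (all shown as 0) are illustrative and not taken from the figures.

```python
def shift_in(registers, new_entry):
    """Shift every entry one register back and store new_entry in the first register.

    `registers` is a fixed-length list whose occupied entries are contiguous
    from the front; None marks an idle register.
    """
    if registers[-1] is not None:
        raise ValueError("last register is occupied; free an entry first")
    # Move entries backwards, starting from the tail so nothing is overwritten.
    for i in range(len(registers) - 1, 0, -1):
        registers[i] = registers[i - 1]
    registers[0] = new_entry


# Entries are (write flag, address pair label); positions model registers 210..240.
regs = [(0, "pair 2"), (0, "pair 1"), (0, "pair 3"), None]   # state of fig. 2
shift_in(regs, (0, "pair 4"))                                 # state of fig. 4
```

After the call, the new address pair 4 occupies the first position and the older pairs follow in their previous relative order.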
In another specific example, when data in the cache is accessed, the control chip may further perform the following operations on the shift register group: determining the target register in which the target mapping relation information of the data is located, and emptying the target register; shifting backwards the mapping relation information stored in the registers located before the target register in the shift register group, so as to vacate the first register; and storing the target mapping relation information into the first register.
In this example, the control chip may compare the first DRAM address in the data access instruction with the DRAM addresses in the address pairs stored in the respective registers of the shift register set, thereby determining the target register.
As illustrated in connection with fig. 2, 5 and 6, the mapping relationship information stored in each register of the shift register group is initially as shown in fig. 2. When the data in the cache page corresponding to address pair 1 is accessed (by a read operation or a write operation), the control chip may determine register 220 to be the target register and extract the target mapping relationship information (i.e., address pair 1 and its write operation identifier) from it. Address pair 2 and its write operation identifier are then shifted backwards and stored in register 220, at which point the storage states of the registers are as shown in fig. 5. Thereafter, the target mapping relationship information is stored in register 210, and the storage state of each register after the shift is completed is as shown in fig. 6.
In this example, the mapping relationship information of the accessed data may be stored in the head register by the shift operation, thereby giving higher priority to the data where the read-write operation has occurred.
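The move-to-front behavior of fig. 2, 5 and 6 can be sketched as below. The function name `touch`, the dictionary keys, and the concrete addresses are hypothetical; the sketch also sets the write operation identifier when the access is a write, as described earlier for write operations.

```python
def touch(registers, dram_addr, is_write=False):
    """Move the entry that matches dram_addr to the first register.

    Entries are dicts with keys "dram", "cache" and "written"; None marks an
    idle register. Returns the cache address on a hit, or None on a miss.
    """
    for i, entry in enumerate(registers):
        if entry is not None and entry["dram"] == dram_addr:
            break
    else:
        return None  # address not present in the mapping relationship
    if is_write:
        entry["written"] = True  # write operation identifier becomes valid
    # Shift only the entries in front of the target one register back.
    for j in range(i, 0, -1):
        registers[j] = registers[j - 1]
    registers[0] = entry
    return entry["cache"]


pair1 = {"dram": 0x10, "cache": 0, "written": False}
pair2 = {"dram": 0x20, "cache": 1, "written": False}
pair3 = {"dram": 0x30, "cache": 2, "written": False}
regs = [pair2, pair1, pair3, None]             # state of fig. 2
hit = touch(regs, 0x10, is_write=True)          # access address pair 1
miss = touch(regs, 0x99)                        # unmapped address
```

Entries behind the target register are not moved, so only the registers between the head and the target take part in the shift.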
In some exemplary embodiments, the control chip is further configured to: when the number of idle cache pages in the cache is smaller than a preset storage threshold, determine, according to the mapping relation information stored in the last non-idle register in the shift register group, the cache page to be processed, its corresponding DRAM page, and whether the data in the cache page to be processed has been written; and if the data in the cache page to be processed has not been written, empty the data in the cache page to be processed and the non-idle register.
In this embodiment, the storage threshold may be set according to the size of the cache space, a specific application scenario, performance of the host, or task requirements, and the smaller the storage threshold, the larger the data amount stored in the cache, the higher the utilization rate of the cache. Typically, the storage threshold may be 1 or 2.
In some optional implementations of this embodiment, if the data in the cache page to be processed has been written, the data in the cache page to be processed is written back to the corresponding DRAM page, and the data in the non-idle register is cleared.
Continuing with the example of fig. 2, register 240 is the last register in the shift register set but does not store mapping information, so register 230 is the last non-idle register in the shift register set. Assuming that the storage threshold is 2, when the number of free cache pages is less than 2, the control chip may determine the cache page to be processed and its corresponding DRAM page according to the mapping relationship information stored in register 230, and then determine from the write operation identifier whether the cache page to be processed has been written. If the write operation identifier is invalid, i.e., the cache page to be processed has not been written, the cache page to be processed and the data stored in register 230 are cleared. If the write operation identifier is valid, i.e., the cache page to be processed has been written, all data in the cache page to be processed is written back to the corresponding DRAM page, and the mapping relationship information stored in register 230 is cleared.
In this embodiment, the order of the registers in the shift register group may represent the priority of the data stored in each cache page: the further back a register is located, the lower the priority of the data corresponding to the mapping relationship information it stores. When the remaining cache space is small, the control chip can determine the cache page to be processed according to the cache address in the mapping relation information stored in the last non-idle register, and then check the write operation identifier to determine whether the data in the cache page to be processed has been written; if it has not been written, the data in the cache page to be processed can be cleared. In this way, lower-priority data is cleaned first, balancing the available cache space against the utilization of the cached data.
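The cleaning rule for the last non-idle register can be sketched as follows. The function `reclaim`, the dictionary-based cache and DRAM models, and the choice to free the cache page after a write-back are assumptions made for illustration; the text above only states that the written page is written back and the register is cleared.

```python
def reclaim(registers, cache, dram, free_pages, storage_threshold):
    """Free the lowest-priority cache page when free pages fall below the threshold.

    registers: list of dict entries ("dram", "cache", "written") or None (idle).
    cache / dram: dicts keyed by page address (toy models of the memories).
    Returns the freed cache address, or None if nothing was done.
    """
    if free_pages >= storage_threshold:
        return None
    # Find the last non-idle register, scanning from the back.
    for i in range(len(registers) - 1, -1, -1):
        if registers[i] is not None:
            break
    else:
        return None  # every register is idle
    entry = registers[i]
    if entry["written"]:
        # Modified page: write its data back to the corresponding DRAM page.
        dram[entry["dram"]] = cache[entry["cache"]]
    del cache[entry["cache"]]   # assumed: the page becomes free in both cases
    registers[i] = None         # clear the mapping relationship information
    return entry["cache"]


regs = [
    {"dram": 0x20, "cache": 1, "written": False},
    {"dram": 0x10, "cache": 0, "written": False},
    {"dram": 0x30, "cache": 2, "written": True},
    None,
]
cache = {0: "A", 1: "B", 2: "C (modified)"}
dram = {0x10: "A", 0x20: "B", 0x30: "C"}
freed = reclaim(regs, cache, dram, free_pages=1, storage_threshold=2)
skipped = reclaim(regs, cache, dram, free_pages=2, storage_threshold=2)
```

In this run the third register is the last non-idle one and its flag is valid, so the modified page is written back before its cache page is released; the second call does nothing because the number of free pages is no longer below the threshold.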
The embodiment of the disclosure further provides a control chip, which is applied to the CXL memory module in any of the embodiments, the CXL memory module including at least one group of DRAM chips, the control chip including a processor and a cache, wherein the processor is configured to: reading data from the DRAM chip, loading the data into a cache, and recording the mapping relation between the DRAM address and the cache address of the data; when a data access instruction sent by the host is received, the data in the cache is preferentially accessed.
As shown in fig. 7, the control chip may include a processor 710, a cache 720, a CXL interface for connection with a host CPU, a DDR controller for connection with the DRAM chip, and other hardware such as registers.
In this embodiment, the control chip may load data from the DRAM chip into the cache and record the mapping relationship between the DRAM address and the cache address of the data, so that the data in the cache is accessed preferentially when the host accesses data in the DRAM chip. When a plurality of CPU cores of the host access the same address at the same time, the data only needs to be loaded from the DRAM chip into the cache once, after which the accesses are served from the cache. This avoids repeated reads and writes in the DRAM chip, reduces the power consumption of the CXL memory module, and improves its data read/write speed.
In some alternative implementations of the present embodiment, the DRAM chip includes a plurality of DRAM pages, the cache includes a plurality of cache pages, and the processor is configured to read all data in one DRAM page from the DRAM chip and load into one cache page of the cache.
In some embodiments, the processor is further configured to: when the preset condition is met, the data in the cache is cleared and/or written back to the DRAM chip.
As an example, the preset condition may be that a write-back instruction or a cache-cleaning instruction is received from the host CPU, that the cache space is insufficient, or that a preset cleaning period has been reached; different preset conditions may correspond to different processing modes. For example, when a write-back instruction carrying a DRAM address is received from the host CPU, the control chip may write all cached data of the corresponding DRAM page back to the DRAM chip; when the cache space is insufficient, the control chip may clear part or all of the data in the cache that has not been written, and write the data that has been written back to the DRAM chip; the control chip may also periodically clean or write back the data in the cache according to a preset period.
In some embodiments, when the processor receives a data access instruction sent by the host, the processor preferentially accesses the data in the cache, including: receiving a data access instruction carrying a first DRAM address, wherein the data access instruction is a data reading instruction or a data writing instruction; if the first cache address corresponding to the first DRAM address is found in the mapping relation, reading data from the cache or writing the data into the cache based on the first cache address.
In this embodiment, the processor may determine whether the data pointed to by the data access instruction has been loaded into the cache by comparing the first DRAM address with each DRAM address in the mapping relationship. If the first DRAM address is found in the mapping relationship, the data in the cache is accessed according to the first cache address corresponding to the first DRAM address, which improves the response speed of the CXL memory module to host instructions and reduces the number of repeated reads and writes of the DRAM chip.
In other embodiments, if the first DRAM address is not found in the mapping relationship, it means that the data pointed by the first DRAM address is not read into the cache, at this time, the data to be accessed in the DRAM chip may be directly accessed according to the first DRAM address, or the data is loaded from the DRAM chip into the cache according to the first DRAM address, and then the data in the cache is accessed.
When the data to be accessed does not exist in the cache, the control chip can directly access the data in the DRAM chip according to the first DRAM address in the data access instruction. For occasional random read/write events from the host CPU, this processing mode avoids unnecessary processing steps. Alternatively, the control chip can read the data to be accessed into the cache and process it there. With this processing mode, the data can subsequently be processed repeatedly in the cache, reducing the number of repeated reads and writes of the DRAM chip.
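The hit path and the two miss-handling options can be compared with a toy model that counts DRAM reads. The class `ControlChipModel`, its attributes, and the `load_on_miss` switch are hypothetical names introduced for this sketch; cache capacity and cleaning are deliberately left out.

```python
class ControlChipModel:
    """Toy model of a page-granular cache in front of a DRAM dict."""

    def __init__(self, dram, load_on_miss):
        self.dram = dram                  # DRAM page address -> page data
        self.cache = {}                   # cache page address -> page data
        self.mapping = {}                 # DRAM page address -> cache page address
        self.load_on_miss = load_on_miss  # choose between the two miss options
        self.dram_reads = 0               # how often the DRAM was read

    def _load(self, dram_addr):
        """Copy one DRAM page into the next unused cache page and record the mapping."""
        cache_addr = len(self.cache)
        self.cache[cache_addr] = self.dram[dram_addr]
        self.dram_reads += 1
        self.mapping[dram_addr] = cache_addr
        return cache_addr

    def read(self, dram_addr):
        cache_addr = self.mapping.get(dram_addr)
        if cache_addr is not None:        # hit: serve from the cache
            return self.cache[cache_addr]
        if self.load_on_miss:             # miss: load the page, then access the cache
            return self.cache[self._load(dram_addr)]
        self.dram_reads += 1              # miss: access the DRAM directly
        return self.dram[dram_addr]


direct = ControlChipModel({7: "page-7"}, load_on_miss=False)
cached = ControlChipModel({7: "page-7"}, load_on_miss=True)
direct_data = [direct.read(7) for _ in range(3)]
cached_data = [cached.read(7) for _ in range(3)]
```

Both variants return the same data; the direct variant reads the DRAM on every access, whereas the loading variant reads it once and serves later accesses from the cache.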
In some embodiments, the processor reading data from the DRAM chip, loading the data into the cache, and recording the mapping relationship between the DRAM address and the cache address of the data comprises: in response to a data access instruction, searching for the first DRAM address in the mapping relationship; if the first DRAM address does not exist in the mapping relationship, determining whether the number of accesses to the first DRAM address within a preset time period is greater than a preset count threshold; and if the number of accesses is greater than the count threshold, reading all data in the DRAM page pointed to by the first DRAM address from the DRAM chip and loading the data into a cache page of the cache.
In this embodiment, after the control chip fails to find the first DRAM address in the mapping relationship, it may, before or after counting the number of accesses, directly access the data to be accessed in the DRAM chip, or it may access the data to be accessed in the cache once the condition is met. Alternatively, if the data access instruction is a read data instruction, the data to be accessed may be returned to the host CPU through the transmit queue of the CXL interface while it is being loaded into the cache.
If the number of accesses is less than or equal to the count threshold, the data in the DRAM chip does not need to be loaded into the cache, and the data to be accessed in the DRAM chip is accessed directly according to the first DRAM address.
In this embodiment, the processor searches the mapping relation for the first DRAM address, and may determine whether the data to be accessed pointed to by the first DRAM address has already been loaded into the cache. The access times aiming at the first DRAM address can represent the processing frequency of the data, and the data with higher processing frequency is read into the cache and then is processed, so that the repeated read-write times of the DRAM chip can be greatly reduced; and for data with lower processing frequency, the data is directly processed in the DRAM chip, so that the processing steps can be simplified, and the processing efficiency can be improved. The method is favorable for more accurately identifying random read-write events and improving the flexibility of CXL memory module for processing data.
The preset time period and the count threshold may be set according to the application scenario, the performance parameters of the CXL memory module, or experience. As an example, the preset time period may be 1 day and the count threshold may be 1. The effect in this case is that when data in the DRAM is accessed for the first time on a given day, the control chip does not read the data into the cache, which avoids the extra processing steps and cache pressure caused by random read/write events.
In some embodiments, when the data access instruction is a read data instruction and the first DRAM address is not present in the mapping relationship, the processor may first read the data from the DRAM chip based on the first DRAM address and return it to the host, and then determine whether the number of accesses is greater than the count threshold.
In this embodiment, to reduce the latency when the host reads data from the CXL memory module, the data may be returned to the host directly from the DRAM chip in response to the host's read data instruction, and the number of accesses is counted afterwards, which helps to improve the response speed of the CXL memory module.
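The ordering described here, replying to the host first and counting afterwards, can be sketched with an event log. The function `handle_read_miss` and its callback parameters are assumptions for illustration, and the preset time period is omitted so that the sketch stays focused on the order of the steps.

```python
def handle_read_miss(dram_addr, dram, send_to_host, counts, threshold, load_page):
    """Serve a read miss: reply from the DRAM first, then decide whether to cache.

    send_to_host and load_page are hypothetical callbacks standing in for the
    CXL interface and the cache loading logic.
    """
    data = dram[dram_addr]
    send_to_host(data)                        # reply before any bookkeeping
    counts[dram_addr] = counts.get(dram_addr, 0) + 1
    if counts[dram_addr] > threshold:
        load_page(dram_addr, data)            # promote a repeatedly read page


log = []
counts = {}
dram = {0x40: b"abc"}
for _ in range(2):
    handle_read_miss(
        0x40,
        dram,
        send_to_host=lambda d: log.append(("reply", d)),
        counts=counts,
        threshold=1,
        load_page=lambda addr, d: log.append(("load", addr)),
    )
```

The log shows that each reply is issued before the corresponding count is evaluated, and that the page is loaded only after the second read exceeds the threshold of 1.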
In some embodiments, the control chip further comprises a shift register set, and the processor recording the mapping relationship between the DRAM address and the cache address of the data comprises: storing the mapping relation information of one piece of data in each register of the shift register set, where the mapping relation information comprises a cache address, a DRAM address and a write operation identifier, and the write operation identifier indicates whether a write operation has been performed on the data in the cache page pointed to by the cache address.
As an example, the mapping relationship may be stored using the shift register set in the following manner: the mapping relation information of each piece of data, together with its write operation identifier, is stored into the registers of the shift register set in the order in which the data was read into the cache, where the arrangement order of the registers in the shift register set is opposite to the order in which the data was read into the cache. In this way, the order in which data was read into the cache is reflected by the order of the registers in the shift register set.
In this embodiment, a shift register set is used to store the mapping relationship between the cache address and the DRAM address, and each register may store one cache address and its corresponding DRAM address. The arrangement sequence of the data in the mapping relation can be changed through the shift operation of the shift register group, so that the control chip can conveniently adjust the ordering of the data in the mapping relation according to the requirement. The priority of the data is represented by the ordering of the data in the mapping relation, so that the control chip can conveniently process the data in the cache according to the priority of the data.
In some embodiments, the processor records a mapping relationship of a DRAM address and a cache address of the data, further comprising: performing shift operation on the shift register set to vacate a first register of the shift register set; and writing the DRAM address and the cache address of the data into the first register, and setting a writing operation identifier in the first register.
The shift operation refers to transferring the mapping relation information stored in one register as a whole to another register, thereby changing the storage position of the mapping relation information of the data in the shift register group (i.e., the order of the registers storing the mapping relation information in the shift register group). The first register represents the first register in the shift register set. The ordering of the registers may characterize the priority of the data stored in the cache, and the priority of the data may be adjusted by a shift operation of the shift register set, so that the control chip processes the data in the cache according to the priority.
In some embodiments, the processor is further configured to: when the number of idle cache pages in the cache is smaller than a storage threshold, determine, according to the mapping relation information and the write operation identifier stored in the last non-idle register in the shift register group, the cache page to be processed, its corresponding DRAM page, and whether the data in the cache page to be processed has been written; if the data in the cache page to be processed has not been written, empty the cache page to be processed and the non-idle register; and if the data in the cache page to be processed has been written, write the data in the cache page to be processed back to the DRAM chip.
In some optional implementations of this embodiment, if the data in the cache page to be processed has been written, the data in the cache page to be processed is written back to the corresponding DRAM page, and the data in the non-idle register is cleared.
In this embodiment, the order of the registers in the shift register group may represent the priority of the data stored in each cache page: the further back a register is located, the lower the priority of the data corresponding to the mapping relationship information it stores. When the remaining cache space is small, the processor can determine the cache page to be processed according to the cache address in the mapping relation information stored in the last non-idle register, and then check the write operation identifier to determine whether the data in the cache page to be processed has been written; if it has not been written, the data in the cache page to be processed can be cleared. In this way, lower-priority data is cleaned first, balancing the available cache space against the utilization of the cached data.
Based on the CXL memory module in the foregoing embodiment, the embodiment of the disclosure further provides a data processing method for the CXL memory module, as shown in fig. 8, the method includes the following steps.
Step 810, reading data from the DRAM chip and loading the data into a cache, and recording the mapping relation between the DRAM address and the cache address of the data.
In this embodiment, the CXL memory module includes a control chip and at least one group of DRAM chips managed by the control chip, and a cache is provided in the control chip.
Step 820, when receiving the data access instruction sent by the host, accessing the data in the cache preferentially.
In this embodiment, data is loaded from the DRAM chip into the cache and the mapping relationship between the DRAM address and the cache address of the data is recorded, so that the data in the cache is accessed preferentially when the host accesses data in the DRAM chip. When a plurality of CPU cores of the host access the same address at the same time, the data only needs to be loaded from the DRAM chip into the cache once, after which the accesses are served from the cache. This avoids repeated reads and writes in the DRAM chip, reduces the power consumption of the CXL memory module, and improves its data read/write speed.
In some alternative implementations of the present embodiment, the DRAM chip includes a plurality of DRAM pages, the cache includes a plurality of cache pages, and the step 810 may load the data in the DRAM chip into the cache as follows: all data in one DRAM page is read from the DRAM chip and loaded into one cache page of the cache.
In some embodiments, the method further comprises: when a preset condition is met, clearing the data in the cache and/or writing it back to the DRAM chip.
As an example, the preset condition may be that a write-back instruction or a cache-cleaning instruction is received from the host CPU, that the cache space is insufficient, or that a preset cleaning period has been reached; different preset conditions may correspond to different processing modes. For example, when a write-back instruction carrying a DRAM address is received from the host CPU, the control chip may write all cached data of the corresponding DRAM page back to the DRAM chip; when the cache space is insufficient, the control chip may clear part or all of the data in the cache that has not been written, and write the data that has been written back to the DRAM chip; the control chip may also periodically clean or write back the data in the cache according to a preset period.
In some embodiments, step 820 described above includes: receiving a data access instruction carrying a first DRAM address, wherein the data access instruction is a data reading instruction or a data writing instruction; if the first cache address corresponding to the first DRAM address is found in the mapping relation, reading data from the cache or writing the data into the cache based on the first cache address.
In this embodiment, whether the data pointed to by the data access instruction has been loaded into the cache is determined by comparing the first DRAM address with each DRAM address in the mapping relationship. If the first DRAM address is found in the mapping relationship, the data in the cache is accessed according to the first cache address corresponding to the first DRAM address, which improves the response speed of the CXL memory module to host instructions and reduces the number of repeated reads and writes of the DRAM chip.
In other embodiments, if the first DRAM address is not found in the mapping relationship, it means that the data pointed by the first DRAM address is not read into the cache, at this time, the data to be accessed in the DRAM chip may be directly accessed according to the first DRAM address, or the data is loaded from the DRAM chip into the cache according to the first DRAM address, and then the data in the cache is accessed.
When the data to be accessed does not exist in the cache, the data in the DRAM chip can be accessed directly according to the first DRAM address in the data access instruction. For occasional random read/write events from the host CPU, this processing mode avoids unnecessary processing steps. Alternatively, the control chip can read the data to be accessed into the cache and process it there. With this processing mode, the data can subsequently be processed repeatedly in the cache, reducing the number of repeated reads and writes of the DRAM chip.
In some embodiments, step 820 described above includes: in response to a data access instruction, searching for the first DRAM address in the mapping relationship; if the first DRAM address does not exist in the mapping relationship, determining whether the number of accesses to the first DRAM address within a preset time period is greater than a preset count threshold; and if the number of accesses is greater than the count threshold, reading all data in the DRAM page pointed to by the first DRAM address from the DRAM chip and loading the data into a cache page of the cache.
In this embodiment, if the data access instruction is a read data instruction, the data to be accessed may be loaded into the cache, and may also be returned to the host CPU through the transmit queue of the CXL interface.
If the number of accesses is less than or equal to the count threshold, the data in the DRAM chip does not need to be loaded into the cache, and the data to be accessed in the DRAM chip is accessed directly according to the first DRAM address.
In this embodiment, the processor searches the mapping relation for the first DRAM address, and may determine whether the data to be accessed pointed to by the first DRAM address has already been loaded into the cache. The access times aiming at the first DRAM address can represent the processing frequency of the data, and the data with higher processing frequency is read into the cache and then is processed, so that the repeated read-write times of the DRAM chip can be greatly reduced; and for data with lower processing frequency, the data is directly processed in the DRAM chip, so that the processing steps can be simplified, and the processing efficiency can be improved. The method is favorable for more accurately identifying random read-write events and improving the flexibility of CXL memory module for processing data.
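The threshold-based handling of a cache miss described above can be sketched as follows. This is a minimal Python model and not the implementation of the disclosure: the names `Controller`, `PAGE_SIZE`, `TIMES_THRESHOLD` and `WINDOW`, their values, the use of a dictionary for the mapping relationship, and counting accesses per DRAM page rather than per individual address are all assumptions made for illustration.

```python
from collections import defaultdict

PAGE_SIZE = 4096        # assumed page size in bytes
TIMES_THRESHOLD = 2     # assumed preset times threshold
WINDOW = 1.0            # assumed preset time period, in seconds


class Controller:
    """Toy model of the control chip's read path (hypothetical names)."""

    def __init__(self, dram: bytearray):
        self.dram = dram
        self.mapping = {}                  # DRAM page number -> cached page copy
        self.history = defaultdict(list)   # DRAM page number -> recent access times

    def _count_access(self, page: int, now: float) -> int:
        # Keep only the accesses that fall inside the preset time period.
        recent = [t for t in self.history[page] if now - t <= WINDOW]
        recent.append(now)
        self.history[page] = recent
        return len(recent)

    def read(self, addr: int, now: float):
        page, offset = divmod(addr, PAGE_SIZE)
        if page in self.mapping:
            # Address found in the mapping relationship: serve from the cache.
            return self.mapping[page][offset], "cache"
        if self._count_access(page, now) > TIMES_THRESHOLD:
            # Frequently accessed: load the whole DRAM page into a cache page.
            start = page * PAGE_SIZE
            self.mapping[page] = bytearray(self.dram[start:start + PAGE_SIZE])
            return self.mapping[page][offset], "loaded"
        # Infrequently accessed: access the DRAM chip directly.
        return self.dram[addr], "dram"
```

With these assumed values, the first two reads of an address within the window go straight to the DRAM, the third triggers a page load, and later reads hit the cache.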
In some embodiments, when the data access instruction is a read data instruction and the first DRAM address is not present in the mapping relationship, the data may first be read from the DRAM chip based on the first DRAM address and returned to the host, and only then is it determined whether the number of accesses is greater than the times threshold.
In this embodiment, to reduce the latency seen by the host when it reads data from the CXL memory module, the data is returned to the host directly from the DRAM chip in response to the host's read data instruction, and the number of accesses is counted afterwards. This helps increase the response speed of the CXL memory module.
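The ordering described above (answer the host first, count afterwards) can be illustrated with a short Python sketch. The function `handle_read_miss` and its callback parameters `respond`, `count_access` and `load_page` are hypothetical names introduced only for this example; the disclosure does not specify such an interface.

```python
from typing import Callable, List


def handle_read_miss(dram: bytes, addr: int, threshold: int,
                     respond: Callable[[int], None],
                     count_access: Callable[[int], int],
                     load_page: Callable[[int], None]) -> None:
    """Serve a read miss: reply to the host before any bookkeeping."""
    respond(dram[addr])                  # 1. return the data to the host
    if count_access(addr) > threshold:   # 2. only then count the access
        load_page(addr)                  # 3. promote a hot page to the cache


# Usage: record the order in which the three steps happen.
events: List[str] = []
handle_read_miss(
    dram=bytes([10, 20, 30]), addr=1, threshold=0,
    respond=lambda value: events.append(f"respond:{value}"),
    count_access=lambda a: events.append("count") or 1,  # append returns None, so this yields 1
    load_page=lambda a: events.append("load"),
)
```

Because the reply is issued before the counting step, the time spent on counting and on any page load does not add to the latency of this particular read.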
In some embodiments, the control chip further comprises a shift register group, and step 820 may record the mapping relationship between the DRAM address and the cache address of the data as follows: each register in the shift register group stores the mapping relationship information of one piece of data, wherein the mapping relationship information comprises a cache address, a DRAM address and a write operation identifier, and the write operation identifier indicates whether a write operation has been performed on the data in the cache page pointed to by the cache address.
In this embodiment, a shift register group stores the mapping relationship between cache addresses and DRAM addresses, and each register can store one cache address and its corresponding DRAM address. Shift operations on the shift register group change the order of the entries in the mapping relationship, so the control chip can reorder them as needed. Because the position of an entry in the mapping relationship represents the priority of its data, the control chip can conveniently process the data in the cache according to priority.
In some embodiments, step 820 further includes: performing a shift operation on the shift register group to vacate the first register of the shift register group; and writing the DRAM address and the cache address of the data into the first register, and setting the write operation identifier in the first register.
A shift operation transfers the mapping relationship information stored in one register, as a whole, to another register, thereby changing where that information is stored in the shift register group (that is, the position within the group of the register that holds it). The first register is the register at the head of the shift register group. The order of the registers can represent the priority of the data stored in the cache, and this priority can be adjusted by shift operations on the shift register group, so that the control chip processes the data in the cache according to priority.
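A shift register group of this kind can be modelled in software as a fixed-length list whose entries move one position toward the tail whenever a new entry is written at the head. The following Python sketch is illustrative only: `Entry`, `ShiftRegisterGroup`, `insert` and `lookup` are hypothetical names, the initial value `False` for the write operation identifier is an assumption, and a real control chip would implement this in hardware.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    """Mapping relationship information held by one register."""
    dram_addr: int
    cache_addr: int
    written: bool = False   # write operation identifier (assumed initial value)


class ShiftRegisterGroup:
    def __init__(self, size: int):
        # None marks an idle register.
        self.regs: List[Optional[Entry]] = [None] * size

    def insert(self, dram_addr: int, cache_addr: int) -> None:
        if self.regs[-1] is not None:
            raise RuntimeError("group is full; evict the last entry first")
        # Shift every entry one position toward the tail to vacate register 0.
        for i in range(len(self.regs) - 1, 0, -1):
            self.regs[i] = self.regs[i - 1]
        self.regs[0] = Entry(dram_addr, cache_addr)

    def lookup(self, dram_addr: int) -> Optional[Entry]:
        # Search the mapping relationship for a DRAM address.
        for entry in self.regs:
            if entry is not None and entry.dram_addr == dram_addr:
                return entry
        return None
```

In this model the most recently recorded entry always sits in the first register, so position in the list directly encodes the priority ordering described above.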
In some embodiments, the method further comprises: when the number of idle cache pages in the cache is smaller than a storage threshold, determining, according to the mapping relationship information and the write operation identifier stored in the last non-idle register in the shift register group, the cache page to be processed, its corresponding DRAM page, and whether a write operation has been performed on the data in the cache page to be processed; and if no write operation has been performed on the data in the cache page to be processed, clearing the cache page to be processed and the non-idle register.
In some optional implementations of this embodiment, if a write operation has been performed on the data in the cache page to be processed, the data in the cache page to be processed is written back to the corresponding DRAM page, and the data in the non-idle register is cleared.
In this embodiment, the order of the registers in the shift register group can represent the priority of the data stored in each cache page: the later the register in which the mapping relationship information is stored, the lower the priority of the corresponding data. When little cache space remains, the processor can determine the cache page to be processed from the cache address in the mapping relationship information stored in the last non-idle register, and then check the write operation identifier to determine whether a write operation has been performed on the data in that cache page. If no write operation has been performed, the data in the cache page to be processed can simply be cleared. In this way, lower-priority data is cleaned up first, balancing the available cache space against the utilization of the cached data.
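The clean-up step described above can be sketched as a small Python function. This is only a model under stated assumptions: `evict_if_needed` and its parameters are hypothetical names, registers are represented as `(dram_page, cache_page, written)` tuples with `None` for an idle register, pages are held in dictionaries, and freeing the cache page after a write-back is an assumption, since the text only states that the register is cleared in that case.

```python
from typing import Dict, List, Optional, Tuple

# (dram_page, cache_page, written); None marks an idle register.
Register = Optional[Tuple[int, int, bool]]


def evict_if_needed(regs: List[Register],
                    cache: Dict[int, bytes],
                    dram: Dict[int, bytes],
                    free_pages: int,
                    storage_threshold: int) -> Optional[int]:
    """Free the lowest-priority cache page when idle pages run low.

    Returns the freed cache page number, or None if nothing was evicted.
    """
    if free_pages >= storage_threshold:
        return None                          # enough idle cache pages remain
    # Walk from the tail to find the last non-idle register.
    for i in range(len(regs) - 1, -1, -1):
        if regs[i] is not None:
            dram_page, cache_page, written = regs[i]
            if written:
                # Data was written in the cache: write it back to the DRAM page.
                dram[dram_page] = cache[cache_page]
            del cache[cache_page]            # clear the cache page (assumed in both cases)
            regs[i] = None                   # clear the register
            return cache_page
    return None
```

A page that was never written is dropped without touching the DRAM, which is what makes the write operation identifier useful: only modified pages cost a DRAM write on eviction.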
An embodiment of the disclosure also provides a non-transitory computer storage medium storing a computer program which, when executed by a processor, implements the data processing method of the CXL memory module in the foregoing embodiments.
The embodiment of the disclosure further provides a computer system, as shown in fig. 9, where the computer system includes a host 910 and the CXL memory module 920 in any of the foregoing embodiments, where the host 910 communicates with the CXL memory module 920 through a CXL interface, so as to implement discovery, configuration, and data transmission of the CXL memory module 920.
In this embodiment, the CXL memory module 920 may read the data in the DRAM chip into the cache, and when the host 910 needs to access the data in the CXL memory module 920, the data in the cache is preferentially read and written, so that the number of repeated read and write operations of the DRAM chip may be reduced.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

Claims (28)

1. A CXL memory module, comprising a control chip and at least one group of DRAM chips managed by the control chip, characterized in that the control chip is provided with a cache;
the control chip is configured to: read data from the DRAM chip and load the data into the cache, and record the mapping relationship between the DRAM address and the cache address of the data; and, when a data access instruction sent by a host is received, preferentially access the data in the cache.
2. The CXL memory module of claim 1, wherein the DRAM chip comprises a plurality of DRAM pages, the cache comprising a plurality of cache pages;
wherein the control chip reading data from the DRAM chip and loading the data into the cache comprises: reading all data of one DRAM page from the DRAM chip and loading the data into one cache page of the cache.
3. The CXL memory module of claim 2, wherein the control chip is further configured to: when a preset condition is met, clear the data in the cache and/or write the data in the cache back to the DRAM chip.
4. The CXL memory module of claim 2, wherein the control chip preferentially accessing the data in the cache when a data access instruction sent by the host is received comprises:
receiving a data access instruction carrying a first DRAM address, wherein the data access instruction is a read data instruction or a write data instruction;
if a first cache address corresponding to the first DRAM address is found in the mapping relationship, reading data from the cache or writing data into the cache based on the first cache address.
5. The CXL memory module of claim 4, wherein the control chip reading data from the DRAM chip and loading the data into the cache, and recording the mapping relationship between the DRAM address and the cache address of the data, comprises:
in response to the data access instruction, searching for the first DRAM address in the mapping relationship;
if the first DRAM address does not exist in the mapping relationship, determining whether the number of accesses to the first DRAM address within a preset time period is greater than a preset times threshold;
if the number of accesses is greater than the times threshold, reading all data in the DRAM page pointed to by the first DRAM address from the DRAM chip and loading the data into a cache page of the cache.
6. The CXL memory module of claim 5, wherein the control chip determining whether the number of accesses to the first DRAM address within the preset time period is greater than the preset times threshold comprises:
when the data access instruction is a read data instruction and the first DRAM address does not exist in the mapping relationship, first reading data from the DRAM chip based on the first DRAM address and returning the data to the host, and then determining whether the number of accesses is greater than the times threshold.
7. The CXL memory module of any one of claims 1 to 6, wherein the control chip further comprises a shift register group configured to store the mapping relationship.
8. The CXL memory module of claim 7, wherein each register in the shift register group is configured to store the mapping relationship information of one piece of data, the mapping relationship information comprising a cache address, a DRAM address, and a write operation identifier, wherein the write operation identifier indicates whether a write operation has been performed on the data in the cache page to which the cache address points.
9. The CXL memory module of claim 8, wherein the control chip recording the mapping relationship between the DRAM address and the cache address of the data comprises:
performing a shift operation on the shift register group to vacate the first register of the shift register group;
writing the DRAM address and the cache address of the data into the first register, and setting the write operation identifier in the first register.
10. The CXL memory module of claim 9, wherein the control chip is further configured to:
when the number of idle cache pages in the cache is smaller than a storage threshold, determining, according to the mapping relationship information and the write operation identifier stored in the last non-idle register in the shift register group, the cache page to be processed, its corresponding DRAM page, and whether a write operation has been performed on the data in the cache page to be processed;
if no write operation has been performed on the data in the cache page to be processed, clearing the cache page to be processed and the non-idle register; and if a write operation has been performed on the data in the cache page to be processed, writing the data in the cache page to be processed back to the DRAM chip.
11. A control chip for use in the CXL memory module of any one of claims 1 to 10, the CXL memory module comprising at least one group of DRAM chips, the control chip comprising a processor and a cache, wherein the processor is configured to:
read data from the DRAM chip, load the data into the cache, and record the mapping relationship between the DRAM address and the cache address of the data;
when a data access instruction sent by a host is received, preferentially access the data in the cache.
12. The control chip of claim 11, wherein the processor is further configured to: when a preset condition is met, clear the data in the cache and/or write the data in the cache back to the DRAM chip.
13. The control chip of claim 11, wherein the processor preferentially accessing the data in the cache when a data access instruction sent by the host is received comprises:
receiving a data access instruction carrying a first DRAM address, wherein the data access instruction is a read data instruction or a write data instruction;
if a first cache address corresponding to the first DRAM address is found in the mapping relationship, reading data from the cache or writing data into the cache based on the first cache address.
14. The control chip of claim 13, wherein the processor reading data from the DRAM chip and loading the data into the cache, and recording the mapping relationship between the DRAM address and the cache address of the data, comprises:
in response to the data access instruction, searching for the first DRAM address in the mapping relationship;
if the first DRAM address does not exist in the mapping relationship, determining whether the number of accesses to the first DRAM address within a preset time period is greater than a preset times threshold;
if the number of accesses is greater than the times threshold, reading all data in the DRAM page pointed to by the first DRAM address from the DRAM chip and loading the data into a cache page of the cache.
15. The control chip of claim 14, wherein the processor determining whether the number of accesses to the first DRAM address within the preset time period is greater than the preset times threshold comprises:
when the data access instruction is a read data instruction and the first DRAM address does not exist in the mapping relationship, first reading data from the DRAM chip based on the first DRAM address and returning the data to the host, and then determining whether the number of accesses is greater than the times threshold.
16. The control chip according to any one of claims 11 to 15, characterized in that the control chip further comprises a shift register group; and
the processor recording the mapping relationship between the DRAM address and the cache address of the data comprises: storing the mapping relationship information of one piece of data in each register of the shift register group, wherein the mapping relationship information comprises a cache address, a DRAM address and a write operation identifier, and the write operation identifier indicates whether a write operation has been performed on the data in the cache page pointed to by the cache address.
17. The control chip of claim 16, wherein the processor recording the mapping relationship between the DRAM address and the cache address of the data further comprises:
performing a shift operation on the shift register group to vacate the first register of the shift register group;
writing the DRAM address and the cache address of the data into the first register, and setting the write operation identifier in the first register.
18. The control chip of claim 17, wherein the processor is further configured to:
when the number of idle cache pages in the cache is smaller than a storage threshold, determining, according to the mapping relationship information and the write operation identifier stored in the last non-idle register in the shift register group, the cache page to be processed, its corresponding DRAM page, and whether a write operation has been performed on the data in the cache page to be processed;
if no write operation has been performed on the data in the cache page to be processed, clearing the cache page to be processed and the non-idle register; and if a write operation has been performed on the data in the cache page to be processed, writing the data in the cache page to be processed back to the DRAM chip.
19. A data processing method of a CXL memory module, the CXL memory module including a control chip and at least one group of DRAM chips managed by the control chip, and the control chip being provided with a cache, the method comprising:
reading data from the DRAM chip and loading the data into the cache, and recording the mapping relation between the DRAM address and the cache address of the data;
when a data access instruction sent by a host is received, the data in the cache is preferentially accessed.
20. The method of claim 19, wherein the method further comprises: when a preset condition is met, clearing the data in the cache and/or writing the data in the cache back to the DRAM chip.
21. The method of claim 19, wherein preferentially accessing the data in the cache when a data access instruction sent by the host is received comprises:
receiving a data access instruction carrying a first DRAM address, wherein the data access instruction is a read data instruction or a write data instruction;
if a first cache address corresponding to the first DRAM address is found in the mapping relationship, reading data from the cache or writing data into the cache based on the first cache address.
22. The method of claim 21, wherein reading data from the DRAM chip and loading the data into the cache, and recording the mapping relationship between the DRAM address and the cache address of the data, comprises:
in response to the data access instruction, searching for the first DRAM address in the mapping relationship;
if the first DRAM address does not exist in the mapping relationship, determining whether the number of accesses to the first DRAM address within a preset time period is greater than a preset times threshold;
if the number of accesses is greater than the times threshold, reading all data in the DRAM page pointed to by the first DRAM address from the DRAM chip and loading the data into a cache page of the cache.
23. The method of claim 22, wherein, if the first DRAM address is not present in the mapping relationship, determining whether the number of accesses to the first DRAM address within the preset time period is greater than the preset times threshold comprises:
when the data access instruction is a read data instruction and the first DRAM address does not exist in the mapping relationship, first reading data from the DRAM chip based on the first DRAM address and returning the data to the host, and then determining whether the number of accesses is greater than the times threshold.
24. The method according to any one of claims 19 to 23, wherein the control chip further comprises a shift register group; and
recording the mapping relationship between the DRAM address and the cache address of the data comprises: storing the mapping relationship information of one piece of data in each register of the shift register group, wherein the mapping relationship information comprises a cache address, a DRAM address and a write operation identifier, and the write operation identifier indicates whether a write operation has been performed on the data in the cache page pointed to by the cache address.
25. The method of claim 24, wherein recording the mapping relationship between the DRAM address and the cache address of the data further comprises:
performing a shift operation on the shift register group to vacate the first register of the shift register group;
writing the DRAM address and the cache address of the data into the first register, and setting the write operation identifier in the first register.
26. The method of claim 25, wherein the method further comprises: when the number of idle cache pages in the cache is smaller than a storage threshold, determining, according to the mapping relationship information and the write operation identifier stored in the last non-idle register in the shift register group, the cache page to be processed, its corresponding DRAM page, and whether a write operation has been performed on the data in the cache page to be processed;
if no write operation has been performed on the data in the cache page to be processed, clearing the cache page to be processed and the non-idle register; and if a write operation has been performed on the data in the cache page to be processed, writing the data in the cache page to be processed back to the DRAM chip.
27. A non-transitory computer storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the data processing method of the CXL memory module of any one of claims 19 to 26.
28. A computer system comprising a host and the CXL memory module of any one of claims 1 to 10, the host communicating with the CXL memory module via a CXL interface for discovery, configuration, and data transfer to the CXL memory module.
CN202410103990.5A 2024-01-25 2024-01-25 CXL memory module, control chip, data processing method, medium and system Active CN117632043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410103990.5A CN117632043B (en) 2024-01-25 2024-01-25 CXL memory module, control chip, data processing method, medium and system

Publications (2)

Publication Number Publication Date
CN117632043A true CN117632043A (en) 2024-03-01
CN117632043B CN117632043B (en) 2024-05-28

Family

ID=90032500

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410103990.5A Active CN117632043B (en) 2024-01-25 2024-01-25 CXL memory module, control chip, data processing method, medium and system

Country Status (1)

Country Link
CN (1) CN117632043B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023019537A1 (en) * 2021-08-20 2023-02-23 Intel Corporation Apparatuses, methods, and systems for device translation lookaside buffer pre-translation instruction and extensions to input/output memory management unit protocols
WO2023165543A1 (en) * 2022-03-02 2023-09-07 华为技术有限公司 Shared cache management method and apparatus, and storage medium
WO2023186143A1 (en) * 2022-03-31 2023-10-05 华为技术有限公司 Data processing method, host, and related device
WO2023227004A1 (en) * 2022-05-24 2023-11-30 华为技术有限公司 Memory access popularity statistical method, related apparatus and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
老狼: "澜起科技发布全球首款 CXL 内存扩展控制器芯片,该芯片都有哪些值得关注的亮点?", Retrieved from the Internet <URL:https://www.zhihu.com/question/531720207/answer/2521601976> *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117827548A (en) * 2024-03-06 2024-04-05 北京超弦存储器研究院 Data backup method, CXL controller, CXL module and storage medium
CN117827548B (en) * 2024-03-06 2024-05-24 北京超弦存储器研究院 Data backup method, CXL controller, CXL module and storage medium

Also Published As

Publication number Publication date
CN117632043B (en) 2024-05-28

Similar Documents

Publication Publication Date Title
CN101446924B (en) Method and system for storing and obtaining data
US7076598B2 (en) Pipeline accessing method to a large block memory
CN117632043B (en) CXL memory module, control chip, data processing method, medium and system
CN112262365B (en) Latency indication in a memory system or subsystem
US9053019B2 (en) Non-volatile memory device, a data processing device using the same, and a swapping method used by the data processing and non-volatile memory devices
US11188262B2 (en) Memory system including a nonvolatile memory and a volatile memory, and processing method using the memory system
CN107329704B (en) Cache mirroring method and controller
EP2397945A1 (en) Programming method and device for a buffer cache in a solid-state disk system
CN103345368B (en) Data caching method in buffer storage
CN104239229A (en) Data storage device and data reading method for flash memory
KR20130112210A (en) Page replace method and memory system using the same
US20200104072A1 (en) Data management method and storage controller using the same
JP2009163647A (en) Disk array device
CN105980992A (en) Controller, flash memory device, method for identifying data block stability and method for storing data on flash memory device
US11550508B2 (en) Semiconductor storage device and control method thereof
US8370564B2 (en) Access control device, information processing device, access control program and access control method
CN110275678B (en) STT-MRAM-based solid state memory device random access performance improvement method
CN110543433A (en) Data migration method and device of hybrid memory
US6202134B1 (en) Paging processing system in virtual storage device and paging processing method thereof
CN115826882B (en) Storage method, device, equipment and storage medium
CN111367474B (en) Embedded memory oriented FAT file system post-allocation method and system
JP4095840B2 (en) Cache memory management method
CN115576863A (en) Data reading and writing method, storage device and storage medium
CN115878311A (en) Computing node cluster, data aggregation method and related equipment
WO2020001665A2 (en) On-chip cache and integrated chip

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant